Built and optimized Python-based data workflows and ingestion pipelines for US clients, ensuring timely availability of business-critical data. Automated recurring manual processes with Python scripts, reducing manual effort by 30%. Designed and implemented data validation checks, increasing the accuracy and reliability of downstream reports. Participated in code reviews and shared knowledge with team members from diverse backgrounds and functions.
Designed and automated data pipelines using Azure Data Factory, Azure Synapse, and Databricks, improving reporting efficiency and reducing manual intervention. Processed daily datasets of 5 TB+ with PySpark, achieving a 50% performance improvement through partitioning, caching, and optimized joins. Implemented real-time monitoring with Azure Monitor, enabling proactive pipeline health checks and reducing downtime.