We are seeking an experienced Data Engineer with deep expertise in Azure Databricks to design, develop, and maintain high-performance data pipelines and analytical platforms. The ideal candidate has hands-on experience building scalable data solutions on Azure, integrating diverse data sources, and enabling data-driven insights across teams.
Key Responsibilities
- Design, build, and optimize ETL/ELT pipelines using Azure Databricks (PySpark/SQL).
- Develop and maintain Delta Lake-based data architectures for reliability, scalability, and performance.
- Integrate data from multiple sources, including APIs, streaming platforms, and on-prem databases, into Azure Data Lake Storage (ADLS).
- Manage data orchestration and workflow automation using Azure Data Factory (ADF) or Databricks Workflows.
- Collaborate with analysts, data scientists, and application developers to ensure clean, high-quality, and well-documented data.
- Implement and maintain data quality, validation, and monitoring frameworks within Databricks.
- Optimize data pipelines for cost, speed, and reliability using Databricks clusters, caching, and partitioning.
- Implement data governance, lineage, and access control in accordance with corporate security policies.
- Support real-time or near real-time data streaming using Azure Event Hubs, Kafka, or Delta Live Tables.
- Continuously evaluate new Azure and Databricks capabilities to enhance platform performance.