We are seeking an experienced Data Engineer with deep expertise in Azure Databricks to design, develop, and maintain high-performance data pipelines and analytical platforms. The ideal candidate has hands-on experience building scalable data solutions on Azure, integrating diverse data sources, and enabling data-driven insights across teams.
Key Responsibilities
- Design, build, and optimize ETL/ELT pipelines using Azure Databricks (PySpark/SQL).
- Develop and maintain Delta Lake-based data architectures for reliability, scalability, and performance.
- Integrate data from multiple sources, including APIs, streaming platforms, and on-prem databases, into Azure Data Lake Storage (ADLS).
- Manage data orchestration and workflow automation using Azure Data Factory (ADF) or Databricks Workflows.
- Collaborate with analysts, data scientists, and application developers to ensure clean, high-quality, and well-documented data.
- Implement and maintain data quality, validation, and monitoring frameworks within Databricks.
- Optimize data pipelines for cost, speed, and reliability using Databricks clusters, caching, and partitioning.
- Implement data governance, lineage, and access control in accordance with corporate security policies.
- Support real-time or near real-time data streaming using Azure Event Hubs, Kafka, or Delta Live Tables.
- Continuously evaluate new Azure and Databricks capabilities to enhance platform performance.