LambdaZen India (P) Ltd

3rd Floor, Amstor Building, Technopark Campus, Trivandrum, 695581

Data Engineer

Closing Date: 24 Oct 2025
Job Published: 10 Oct 2025

Brief Description

We are seeking an experienced Data Engineer with deep expertise in Azure Databricks to design, develop, and maintain high-performance data pipelines and analytical platforms. The ideal candidate has hands-on experience building scalable data solutions on Azure, integrating diverse data sources, and enabling data-driven insights across teams.

Key Responsibilities

  • Design, build, and optimize ETL/ELT pipelines using Azure Databricks (PySpark/SQL).

  • Develop and maintain Delta Lake-based data architectures for reliability, scalability, and performance.

  • Integrate data from multiple sources including APIs, streaming platforms, and on-prem databases into Azure Data Lake Storage (ADLS).

  • Manage data orchestration and workflow automation using Azure Data Factory (ADF) or Databricks Workflows.

  • Collaborate with analysts, data scientists, and application developers to ensure clean, high-quality, and well-documented data.

  • Implement and maintain data quality, validation, and monitoring frameworks within Databricks.

  • Optimize data pipelines for cost, speed, and reliability using Databricks clusters, caching, and partitioning.

  • Implement data governance, lineage, and access control in accordance with corporate security policies.

  • Support real-time or near real-time data streaming using Azure Event Hubs, Kafka, or Delta Live Tables.

  • Continuously evaluate new Azure and Databricks capabilities to enhance platform performance.

Preferred Skills

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.

  • 3–6 years of hands-on experience in data engineering, including at least 2 years with Azure Databricks.

  • Strong proficiency in PySpark, SQL, and data modeling.

  • Experience with Azure Data Lake Storage (ADLS), Azure Synapse, Azure Data Factory, and Azure Key Vault.

  • Deep understanding of Delta Lake architecture, ACID transactions, and schema evolution.

  • Experience implementing CI/CD for Databricks using GitHub Actions, Azure DevOps, or similar tools.

  • Strong knowledge of data partitioning, caching, and performance tuning in Spark environments.

  • Solid grasp of data governance and security best practices within Azure (e.g., RBAC, Managed Identities).