Azure Data Engineer (Lead)

Infogain
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Tech stack

Java
Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Data analysis
Azure
Cloud Computing
Profiling
Data Validation
Information Engineering
Data Governance
Data Profiling
Data Warehousing
Github
Python
Prometheus
Software Engineering
SQL Databases
Web Platforms
Web Services
Datadog
Google Cloud Platform
Cloud Platform System
React
Grafana
Spring-boot
GIT
Data Lake
PySpark
Collibra
Data Management
Software Version Control
Data Pipelines
Databricks
Microservices

Job description

Azure Data Engineer (Senior) with skills in Data Engineering, Python, Databricks, SQL, and Azure Data Factory, for Bangalore, India.

1. Data Quality Development & Monitoring

  • Design and implement automated data quality rules and validation checks using Databricks (Delta Lake) and PySpark.
  • Build and operationalize data quality workflows in Ataccama ONE / Ataccama Studio.
  • Perform data profiling, anomaly detection, and reconciliation across systems and data sources.
  • Establish thresholds, KPIs, and alerts for data quality metrics.
  2. Root Cause Analysis & Issue Management
  • Investigate data anomalies and quality incidents using SQL, Python, and Ataccama diagnostics.
  • Collaborate with data engineers and business analysts to identify and remediate root causes.
  • Document recurring data issues and contribute to preventive automation solutions.
  3. Collaboration & Governance Support
  • Partner with data stewards, governance, and analytics teams to define and maintain DQ rules and SLAs.
  • Contribute to metadata enrichment, lineage documentation, and data catalog integration.
  • Support adoption of DQ frameworks and promote data reliability best practices.
  4. Automation & Continuous Improvement
  • Integrate DQ validations into orchestration tools (Airflow, Databricks Workflows, or ADF).
  • Leverage Python/PySpark libraries to complement existing platforms.
  • Propose process improvements to enhance automation, monitoring, and exception management.
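By way of illustration only (not part of the role description), a data quality rule of the kind described above — a metric with a threshold that raises an alert on breach — can be sketched in plain Python. The dataset, column names, and thresholds below are invented; in this role such checks would typically run in PySpark against Delta Lake tables or be defined in Ataccama ONE.

```python
# Hypothetical sketch of an automated DQ rule: a null-rate threshold check.
# All data and thresholds here are illustrative, not from the posting.

def null_rate(rows, column):
    """Fraction of rows where `column` is missing."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(column) is None)
    return missing / len(rows)

def check_null_threshold(rows, column, max_null_rate):
    """Evaluate one DQ rule and return its metric, threshold, and status."""
    rate = null_rate(rows, column)
    return {
        "rule": f"null_rate({column}) <= {max_null_rate}",
        "value": rate,
        "passed": rate <= max_null_rate,
    }

# Example: 1 of 4 customer records is missing an email (25% null rate),
# which breaches a 10% threshold and would trigger an alert.
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "c@example.com"},
    {"id": 4, "email": "d@example.com"},
]
result = check_null_threshold(customers, "email", max_null_rate=0.10)
```

The same shape generalizes to the other responsibilities listed: profiling produces the metric, the threshold encodes the KPI, and a failed check feeds the alerting and root-cause workflow.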

Requirements

Data Engineering & Quality

Databricks (Delta Lake), PySpark, SQL, Python

DQ Platforms

Ataccama ONE / Studio (DQ rules, workflows, profiling)

Orchestration

Apache Airflow, Azure Data Factory, or Databricks Jobs

Data Warehouses

Databricks Lakehouse

Version Control / CI-CD

Git, GitHub Actions, Azure DevOps

Data Catalog / Lineage (Optional)

Collibra, Alation, Ataccama Catalog

Cloud Environments

Azure (preferred), AWS, or GCP

Qualifications

  • Bachelor's degree in Computer Science, Information Systems, Statistics, or related field.

  • 6-9 years of experience in data quality, data engineering, or analytics operations.
  • Strong command of SQL, Python, and PySpark for data validation and troubleshooting.
  • Proven experience with Ataccama DQ rule creation and monitoring.
  • Hands-on exposure to Databricks for building and running data pipelines.
  • Working knowledge of reconciliation processes, data profiling, and DQ metrics.

Soft Skills

  • Analytical thinker with strong problem-solving abilities.
  • Detail-oriented and methodical approach to troubleshooting.
  • Strong communication skills for cross-functional collaboration.
  • Proactive mindset, capable of owning issues through resolution.
  • Comfortable balancing hands-on technical work with business stakeholder interaction.

Nice to Have

  • Exposure to data governance frameworks or MDM initiatives.
  • Familiarity with observability tools (Grafana, Datadog, Prometheus).
  • Understanding of CI/CD practices for data quality deployment.
  • Certification in Databricks, Ataccama, or a major cloud platform (Azure/AWS).

Success Measures

  • Increase in automated data quality coverage across critical datasets.
  • Reduction in recurring manual DQ exceptions.
  • Improved timeliness and accuracy of data available for analytics.
  • Positive stakeholder feedback on data trust and reliability.

EXPERIENCE

  • 6-8 Years

SKILLS

  • Primary Skill: Data Engineering
  • Sub Skill(s): Data Engineering
  • Additional Skill(s): Python, Databricks, SQL, Azure Data Factory

About the company

Infogain is a human-centered digital platform and software engineering company based out of Silicon Valley. We engineer business outcomes for Fortune 500 companies and digital natives in the technology, healthcare, insurance, travel, telecom, and retail & CPG industries using technologies such as cloud, microservices, automation, IoT, and artificial intelligence. We accelerate experience-led transformation in the delivery of digital platforms. Infogain is also a Microsoft (NASDAQ: MSFT) Gold Partner and Azure Expert Managed Services Provider (MSP). Infogain, an Apax Funds portfolio company, has offices in California, Washington, Texas, the UK, the UAE, and Singapore, with delivery centers in Seattle, Houston, Austin, Kraków, Noida, Gurgaon, Mumbai, Pune, and Bengaluru.

Apply for this position