Job Description
Job Title: Data Engineer
Job Summary:
Are you passionate about building scalable data pipelines, optimizing ETL processes, and designing efficient data models? We are looking for a Databricks Data Engineer to join our team and play a key role in managing and transforming data in Azure cloud environments.
In this role, you will work with Azure Data Factory (ADF), Databricks, Python, and SQL to develop robust data ingestion and transformation workflows. You will also be responsible for integrating SAP IS-Auto data, optimizing performance, and ensuring data quality & governance.
If you have strong experience in big data processing, distributed computing (Spark), and data modeling, we'd love to hear from you!
Key Responsibilities:
- Develop & Optimize ETL Pipelines: Build robust and scalable data pipelines using ADF, Databricks, and Python for data ingestion, transformation, and loading.
- Data Modeling & Semantic Layer Modeling: Design logical, physical, and semantic data models for structured and unstructured data.
- Integrate SAP IS-Auto: Extract, transform, and load data from SAP IS-Auto into Azure-based data platforms.
- Database Management: Develop and optimize SQL queries, stored procedures, and indexing strategies to enhance performance.
- Big Data Processing: Work with Azure Databricks for distributed computing, Spark for large-scale processing, and Delta Lake for optimized storage.
- Data Quality & Governance: Implement data validation, lineage tracking, and security measures for high-quality, compliant data.
- Collaboration: Work closely with business analysts, data scientists, and DevOps teams to ensure data availability and usability.
- Testing and Debugging: Write unit tests and perform debugging to ensure the implementation is robust and error-free.
- Conduct performance optimization and security audits.
Required Skills and Qualifications:
- Azure Cloud Expertise: Strong experience in Azure Data Factory (ADF), Databricks, and Azure Synapse.
- Programming: Proficiency in Python for data processing, automation, and scripting.
- SQL & Database Skills: Advanced knowledge of SQL, T-SQL, or PL/SQL for data manipulation.
- SAP IS-Auto Data Handling: Experience integrating SAP IS-Auto as a data source into data pipelines.
- Data Modeling: Hands-on experience in dimensional modeling, semantic layer modeling, and entity-relationship modeling.
- Big Data Frameworks: Strong understanding of Apache Spark, Delta Lake, and distributed computing.
- Performance Optimization: Expertise in query optimization, indexing, and performance tuning.
- Data Governance & Security: Knowledge of RBAC, encryption, and data privacy standards.
Preferred Qualifications:
1. Experience with CI/CD for data pipelines using Azure DevOps.
2. Knowledge of Kafka/Event Hub for real-time data processing.
3. Experience with Power BI/Tableau for data visualization (not mandatory but a plus).