https://bayt.page.link/h6XgXSTTdheukJ13A

Data Engineer - Python, PySpark, SQL, Spark Architecture, Azure Databricks

Posted today, 2025/06/25
Other business support services

Job description

As a Data Engineer, you are required to:


Design, build, and maintain data pipelines that efficiently process and transport data from various sources to storage systems or processing environments, ensuring data integrity, consistency, and accuracy from end to end.


Integrate data from different systems, often involving data cleaning, transformation (ETL), and validation.


Design the structure of databases and data storage systems, including schemas, tables, and relationships between datasets, to enable efficient querying.


Work closely with data scientists, analysts, and other stakeholders to understand their data needs and ensure that the data is structured so that it is accessible and usable.


Stay up to date with the latest trends and technologies in data engineering, such as new data storage solutions, processing frameworks, and cloud technologies. Evaluate and implement new tools to improve data engineering processes.


 

Qualification: Bachelor's or Master's degree in Computer Science & Engineering, or equivalent. A professional degree in Data Science or Engineering is desirable.


  

Experience level: At least 3 to 5 years of hands-on experience in data engineering and ETL.


 

Desired Knowledge & Experience:


  • Spark: Spark 3.x, RDD/DataFrames/SQL, Batch/Structured Streaming
    • Spark internals: Catalyst/Tungsten/Photon
  • Databricks: Workflows, SQL Warehouses/Endpoints, DLT, Pipelines, Unity Catalog, Auto Loader
  • IDE: IntelliJ/PyCharm, Git, Azure DevOps, GitHub Copilot
  • Test: pytest, Great Expectations
  • CI/CD: YAML Azure Pipelines, Continuous Delivery, Acceptance Testing
  • Big Data Design: Lakehouse/Medallion Architecture, Parquet/Delta, Partitioning, Distribution, Data Skew, Compaction
  • Languages: Python/Functional Programming (FP)
  • SQL: T-SQL/Spark SQL/HiveQL
  • Storage: Data Lake and Big Data Storage Design

Additionally, it is helpful to know the basics of:


  • Data Pipelines: ADF/Synapse Pipelines/Oozie/Airflow
  • Languages: Scala, Java
  • NoSQL: Cosmos DB, MongoDB, Cassandra
  • Cubes: SSAS (ROLAP, HOLAP, MOLAP), AAS, Tabular Model
  • SQL Server: T-SQL, Stored Procedures
  • Hadoop: HDInsight/MapReduce/HDFS/YARN/Oozie/Hive/HBase/Ambari/Ranger/Atlas/Kafka
  • Data Catalog: Azure Purview, Apache Atlas, Informatica
 

Required Soft Skills & Other Capabilities:


 Great attention to detail and good analytical abilities.


 Good planning and organizational skills.


 Collaborative approach to sharing ideas and finding solutions.


 Ability to work both independently and in a global team environment.


 
