Job Description
Responsibilities:
- Design and Build Scalable Data Pipelines: Architect, develop, and maintain efficient, scalable, and secure data pipelines that handle large datasets across multiple data sources, ensuring reliability and performance.
- Cloud Platform Expertise: Use AWS and GCP services (e.g., Amazon S3 and Redshift on AWS; BigQuery, Dataflow, Cloud Storage, and Dataproc on GCP) to implement and optimize cloud-based data infrastructure.
- Data Integration: Integrate structured and unstructured data sources, from internal systems and third-party providers, so that cross-functional teams can access actionable insights.
- Collaboration: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions for business intelligence, reporting, and analytics.
- Data Quality & Governance: Implement best practices for data quality, data lineage, and governance to ensure the accuracy and compliance of data across pipelines and systems.
- Optimization and Automation: Continuously optimize data workflows and automate processes to improve efficiency and reduce latency in data operations.
- Performance Tuning: Optimize data storage, retrieval, and processing on cloud platforms for cost and time efficiency.
- Security & Compliance: Maintain data privacy and security standards in alignment with company policies and industry regulations.

Requirements:
- Experience: 5+ years of hands-on data engineering experience, with a focus on cloud environments (AWS and GCP).
- Cloud Technologies: Deep expertise with AWS (e.g., S3, Redshift, Lambda, Glue) and GCP (e.g., BigQuery, Dataflow, Cloud Storage) for building and managing data pipelines and infrastructure.
- Data Engineering Skills: Strong knowledge of ETL/ELT processes, data warehousing concepts, distributed computing frameworks (e.g., Apache Spark, Hadoop), and workflow orchestration (e.g., Airflow).
- Programming: Proficiency in Python, SQL, and other languages relevant to data engineering.
- Database Knowledge: Experience with both relational and NoSQL databases (e.g., PostgreSQL, MySQL, DynamoDB, MongoDB).
- Version Control & CI/CD: Familiarity with version control systems (e.g., Git) and CI/CD pipelines for automating deployments and testing.
- Data Processing Frameworks: Experience with data processing and orchestration frameworks such as Apache Airflow, Apache Kafka, or similar technologies (illustrative sketches of this kind of work appear after the posting).
- Problem Solving: Strong analytical and troubleshooting skills, with the ability to resolve complex data and system issues in a timely manner.
- Media Industry Knowledge (Preferred): Familiarity with data needs and challenges in the media industry (e.g., content analytics, user behavior analysis, and media streaming data).

Preferred Qualifications:
- Certifications: AWS Certified Solutions Architect, Google Professional Data Engineer, or similar cloud certifications.
- Big Data Technologies: Experience with Spark or similar distributed data processing technologies.
- Machine Learning (Optional): Exposure to machine learning platforms, or experience working with data science teams to enable ML models, is a plus.

Unfortunately, due to the high number of responses we receive, we are unable to provide feedback to all applicants. If you have not been contacted within 5-7 days, please assume that at this stage your application has been unsuccessful.
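As a concrete illustration of the orchestration work described above, here is a minimal sketch of a daily extract-and-load pipeline in Apache Airflow (2.4 or later assumed). The DAG name, task names, and the extract/load callables are hypothetical placeholders, not a prescribed design:

```python
# Minimal Airflow sketch of a daily ETL pipeline; all names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_events(**context):
    """Placeholder: pull raw events from a hypothetical source system."""
    print("extracting events for", context["ds"])


def load_to_warehouse(**context):
    """Placeholder: load transformed events into a warehouse table."""
    print("loading events for", context["ds"])


with DAG(
    dag_id="daily_events_pipeline",  # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # run once per day
    catchup=False,      # skip backfilling past runs
) as dag:
    extract = PythonOperator(task_id="extract_events", python_callable=extract_events)
    load = PythonOperator(task_id="load_to_warehouse", python_callable=load_to_warehouse)

    extract >> load  # simple linear dependency: extract, then load
```

In a real pipeline the two callables would be replaced by operators for the relevant cloud services (e.g., Glue jobs on AWS or Dataflow jobs on GCP), with the same dependency structure.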
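Similarly, a minimal PySpark sketch of the kind of batch transformation implied by the Spark and cloud-storage requirements; the bucket paths and column names are hypothetical, and reading from object storage assumes the appropriate connector (hadoop-aws for s3a://, the GCS connector for gs://):

```python
# Minimal PySpark batch job: raw JSON events in, daily per-user counts out.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events_rollup").getOrCreate()

# Read raw JSON events from object storage (s3a:// on AWS, gs:// on GCP).
events = spark.read.json("s3a://example-raw-bucket/events/")  # hypothetical path

# Aggregate: daily event counts per user, a typical analytics rollup.
daily_counts = (
    events
    .withColumn("event_date", F.to_date("event_ts"))  # assumes an event_ts column
    .groupBy("event_date", "user_id")
    .count()
)

# Write the result back as date-partitioned Parquet for warehouse ingestion.
daily_counts.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3a://example-curated-bucket/daily_counts/"  # hypothetical path
)

spark.stop()
```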