Design, develop, and maintain data pipelines for extracting, transforming, and loading (ETL) data from various sources.
Assist in building and optimizing data warehouse architectures to ensure efficient data processing and storage.
Collaborate with data scientists, analysts, and software engineers to support data-driven solutions.
Monitor, troubleshoot, and improve data workflows for reliability, performance, and scalability.
Implement data governance best practices, including data quality, security, and compliance.
Work with cloud-based data platforms (e.g., AWS, Azure, GCP) to build scalable data infrastructure.
Write efficient SQL queries and scripts for data extraction, transformation, and analysis.
Maintain thorough documentation of data architecture, pipelines, and processes.
Qualifications
Bachelor’s degree in Computer Science, Information Technology, Data Science, or a related field.
Internship or academic project experience in data engineering, database management, or software development is a plus.
Required Skills
Basic knowledge of programming languages such as Python or Java, along with working knowledge of SQL.
Understanding of database management systems (SQL/NoSQL) and data modeling concepts.
Familiarity with ETL/ELT processes and data pipeline development.
Exposure to cloud platforms like AWS, Azure, or GCP (preferred).
Basic understanding of big data technologies (e.g., Hadoop, Spark) is a plus.
Strong problem-solving skills and attention to detail.
Ability to work collaboratively in a team environment.
Willingness to learn and adapt to new technologies and best practices.