Key Responsibilities
• Contribute to the development of scalable and performant data pipelines on Databricks, leveraging Delta Lake, Delta Live Tables (DLT), and other core Databricks components.
• Develop data lakes/warehouses designed for optimized storage, querying, and real-time updates using Delta Lake.
• Implement effective data ingestion strategies from various sources (streaming, batch, API-based), ensuring seamless integration with Databricks.
• Ensure the integrity, security, quality, and governance of data across our Databricks-centric platforms.
• Collaborate with stakeholders (data scientists, analysts, product teams) to translate business requirements into Databricks-native data solutions.
• Build and maintain ETL/ELT processes, heavily utilizing Databricks, Spark (Scala or Python), SQL, and Delta Lake for transformations.
• Apply CI/CD and DevOps practices tailored to the Databricks environment.
• Monitor the cost-efficiency of data operations on Databricks and optimize resource utilization.
• Utilize a range of Databricks tools, including the Databricks CLI and REST API, alongside Apache Spark™, to develop, manage, and optimize data engineering solutions.
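To illustrate the kind of automation the CLI/REST API bullet refers to, the sketch below builds (but does not send) a request for the Databricks Jobs API 2.1 `jobs/runs/submit` endpoint, which triggers a one-off notebook run. This is a minimal sketch only: the workspace URL, token, notebook path, and cluster settings are placeholder assumptions, and the payload fields should be verified against the Jobs API documentation for your workspace.

```python
import json
import urllib.request

# Placeholder workspace URL and token -- substitute real values before use.
WORKSPACE_URL = "https://example.cloud.databricks.com"
TOKEN = "dapi-placeholder"


def build_submit_payload(notebook_path: str, run_name: str) -> dict:
    """Build a Jobs API 2.1 runs/submit payload for a one-off notebook run.

    Field names follow the Jobs API 2.1 schema; the cluster settings
    (spark_version, node_type_id, num_workers) are illustrative and
    should be checked against what your workspace supports.
    """
    return {
        "run_name": run_name,
        "tasks": [
            {
                "task_key": "ingest",
                "notebook_task": {"notebook_path": notebook_path},
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",
                    "node_type_id": "i3.xlarge",
                    "num_workers": 2,
                },
            }
        ],
    }


def prepare_submit_request(payload: dict) -> urllib.request.Request:
    """Prepare the authenticated HTTP request (not sent here)."""
    return urllib.request.Request(
        f"{WORKSPACE_URL}/api/2.1/jobs/runs/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_submit_payload("/Repos/data/ingest_daily", "nightly-ingest")
req = prepare_submit_request(payload)
```

In practice the same submission is usually driven from CI/CD via the Databricks CLI or provider SDKs rather than raw HTTP, but the payload shape is the same either way.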
Key Relationships
• Internal – Data Engineering Manager; developers across various departments; managers of departments in other regional hubs of Puma Energy
• External – Platform providers