Overview

We are PepsiCo. PepsiCo is one of the world's leading food and beverage companies, with more than $79 billion in net revenue and a global portfolio of diverse and beloved brands. Our complementary food and beverage portfolio includes 22 brands that each generate more than $1 billion in annual retail sales. PepsiCo's products are sold in more than 200 countries and territories around the world.

PepsiCo's strength is its people. We are over 250,000 game changers, mountain movers, and history makers, located around the world and united by a shared set of values and goals. We believe that acting ethically and responsibly is not only the right thing to do, but also the right thing to do for our business. At PepsiCo, we aim to deliver top-tier financial performance over the long term by integrating sustainability into our business strategy, leaving a positive imprint on society and the environment. We call this Winning with Purpose. For more information on PepsiCo and the opportunities it holds, visit www.pepsico.com.

What Data Engineers on the PepsiCo Data and AI team do:

- Maintain a predictable, transparent, global operating rhythm that ensures always-on access to high-quality data for stakeholders across the company
- Handle day-to-day extraction, loading, and transformation of PepsiCo's corporate data assets
- Work cross-functionally across the enterprise to centralize data and standardize it for use by business, data science, or other stakeholders
- Increase awareness about available data and democratize access to it across the company

Senior Data Engineer:

As a Senior Data Engineer, you will be the key technical expert developing and overseeing PepsiCo's data product build and operations, and you will drive a strong vision for how data engineering can proactively create a positive impact on the business.
You will be an empowered member of a team of data engineers who build data pipelines, ingest data into the PepsiCo Data Lake, and enable exploration and access for analytics, visualization, machine learning, and product development efforts across the company.

Responsibilities

- Actively contribute to code development in projects and services
- Manage and scale data pipelines from internal and external data sources to support new product launches and drive data quality across data products
- Build and own the automation and monitoring frameworks that capture metrics and operational KPIs for data pipeline quality and performance
- Implement best practices around systems integration, security, performance, and data management
- Empower the business by creating value through increased adoption of data, data science, and the business intelligence landscape
- Collaborate with internal clients (data science and product teams) to drive solutioning and PoC discussions
- Evolve the architectural capabilities and maturity of the data platform by engaging with enterprise architects and strategic internal and external partners
- Define and manage SLAs for data products and processes running in production
- Support large-scale experimentation by data scientists
- Develop and optimize procedures to productionalize data science models
- Prototype new approaches and build solutions at scale
- Research state-of-the-art methodologies
- Create documentation for learnings and knowledge transfer
- Create and audit reusable packages or libraries
- Experience working with master data is preferred

Qualifications

- 10+ years of overall technology experience, including at least 4 years of hands-on software development, data engineering, and systems architecture
- 4+ years of experience with data lake infrastructure, data warehousing, and data analytics tools
- 4+ years of experience in SQL optimization and performance tuning, and development experience in programming languages such as Python and PySpark
- 2+ years of cloud data engineering experience in Azure
- Experience with Azure Data Factory, Azure Databricks, and Azure Machine Learning tools
- Fluency with Azure cloud services; Azure certification is a plus
- Experience integrating multi-cloud services with on-premises technologies
- Experience with data modeling, data warehousing, and building high-volume ETL/ELT pipelines
- Experience with data profiling and data quality tools such as Apache Griffin, Deequ, and Great Expectations
- Experience building and operating highly available, distributed systems for extraction, ingestion, and processing of large data sets
- Experience with at least one MPP database technology such as Databricks, Redshift, or Snowflake
- Experience running and scaling applications on cloud infrastructure and containerized services such as Kubernetes
- Experience with version control systems such as GitHub, and with deployment and CI tools
- Proven track record of leading and mentoring data teams
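To give candidates a flavor of the data-quality work referenced above: tools such as Deequ and Great Expectations center on declarative constraints evaluated over a dataset, producing a pass/fail report per check. The sketch below illustrates that pattern in plain Python with illustrative names and toy data; it is not the API of any of those tools.

```python
# Minimal sketch of the declarative data-quality checks that tools like
# Deequ or Great Expectations automate: each constraint is a named
# predicate evaluated over a dataset, yielding a pass/fail report.
# All function names and data here are illustrative.

def check_completeness(rows, column):
    """Return the fraction of rows where `column` is present and non-null."""
    if not rows:
        return 1.0
    filled = sum(1 for r in rows if r.get(column) is not None)
    return filled / len(rows)

def check_uniqueness(rows, column):
    """Return True if no two rows share the same non-null value in `column`."""
    values = [r.get(column) for r in rows if r.get(column) is not None]
    return len(values) == len(set(values))

def run_checks(rows):
    """Evaluate a small suite of constraints and report pass/fail per check."""
    return {
        "order_id_complete": check_completeness(rows, "order_id") == 1.0,
        "order_id_unique": check_uniqueness(rows, "order_id"),
    }

orders = [
    {"order_id": 1, "amount": 12.5},
    {"order_id": 2, "amount": 3.0},
    {"order_id": 2, "amount": 7.2},  # duplicate id -> uniqueness check fails
]
report = run_checks(orders)
```

In production, suites like this typically run as a gate inside the pipeline (e.g., after an Azure Data Factory or Databricks ingestion step), failing the run or alerting when a constraint breaks.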