Overview We are PepsiCo PepsiCo is one of the world's leading food and beverage companies with more than $79 Billion in Net Revenue and a global portfolio of diverse and beloved brands. We have a complementary food and beverage portfolio that includes 22 brands that each generate more than $1 Billion in annual retail sales. PepsiCo's products are sold in more than 200 countries and territories around the world. PepsiCo's strength is its people. We are over 250,000 game changers, mountain movers and history makers, located around the world, and united by a shared set of values and goals. We believe that acting ethically and responsibly is not only the right thing to do, but also the right thing to do for our business. At PepsiCo, we aim to deliver top-tier financial performance over the long term by integrating sustainability into our business strategy, leaving a positive imprint on society and the environment. We call this Winning with Purpose. For more information on PepsiCo and the opportunities it holds, visit www.pepsico.com. Data Science Team works in developing Machine Learning (ML) and Artificial Intelligence (AI) projects. Specific scope of this role is to develop ML solution in support of ML/AI projects using big analytics toolsets in a CI/CD environment. Analytics toolsets may include DS tools/Spark/Databricks, and other technologies offered by Microsoft Azure or open-source toolsets. This role will also help automate the end-to-end cycle with Azure Machine Learning Services and Pipelines. You will be part of a collaborative interdisciplinary team around data, where you will be responsible of our continuous delivery of statistical/ML models. You will work closely with process owners, product owners and final business users. This will provide you the correct visibility and understanding of criticality of your developments. Responsibilities Delivery of key Advanced Analytics/Data Science projects within time and budget, particularly around DevOps/MLOps and Machine Learning models in scope Active contributor to code & development in projects and services Partner with data engineers to ensure data access for discovery and proper data is prepared for model consumption. Partner with ML engineers working on industrialization. Communicate with business stakeholders in the process of service design, training and knowledge transfer. Support large-scale experimentation and build data-driven models. Refine requirements into modelling problems. Influence product teams through data-based recommendations. Research in state-of-the-art methodologies. Create documentation for learnings and knowledge transfer. Create reusable packages or libraries. Ensure on time and on budget delivery which satisfies project requirements, while adhering to enterprise architecture standards Leverage big data technologies to help process data and build scaled data pipelines (batch to real time) Implement end-to-end ML lifecycle with Azure Machine Learning and Azure Pipelines Automate ML models deployments Work timing is from (11:30 AM-9:00 PM) Qualifications BE/B.Tech in Computer Science, Maths, technical fields. Overall 9+ years of experience working as a Data Scientist. 6+ years’ experience building solutions in the commercial or in the supply chain space. 6+ years working in a team to deliver production level analytic solutions. Fluent in git (version control). Understanding of Jenkins, Docker are a plus. Fluent in SQL syntaxis. 6+ years’ experience in Statistical/ML techniques to solve supervised (regression, classification) and unsupervised problems. 6+ years’ experience in developing business problem related statistical/ML modeling with industry tools with primary focus on Python or Pyspark development. Skills, Abilities, Knowledge: Data Science – Hands on experience and strong knowledge of building machine learning models – supervised and unsupervised models. Knowledge of Time series/Demand Forecast models is a plus Programming Skills – Hands-on experience in statistical programming languages like Python, Pyspark and database query languages like SQL Statistics – Good applied statistical skills, including knowledge of statistical tests, distributions, regression, maximum likelihood estimators Cloud (Azure) – Experience in Databricks and ADF is desirable Familiarity with Spark, Hive, Pig is an added advantage Business storytelling and communicating data insights in business consumable format. Fluent in one Visualization tool. Strong communications and organizational skills with the ability to deal with ambiguity while juggling multiple priorities Experience with Agile methodology for team work and analytics ‘product’ creation. Experience in Reinforcement Learning is a plus. Experience in Simulation and Optimization problems in any space is a plus. Experience with Bayesian methods is a plus.