Role purpose:
To support Vodafone’s key strategic growth areas in DevOps transformation, Vodafone Group Business Technology requires highly capable and experienced Site Reliability Engineers to increase the DevOps platform’s reliability and performance. As a Site Reliability Engineer, you will be responsible for the deployment, stability, and operational improvement of DevOps solutions. You will operate different applications, services, features, and modules, and take the lead in deploying, reviewing, and troubleshooting technical integration interfaces with various technology platforms and devices. You will focus on driving high reliability into systems by working closely with the development and product architect teams. You will be responsible for creating software that improves the reliability of systems in production, fixing issues, and responding to incidents and problems. You will work alongside the Dev team in a DevOps model, sharing a common vision and objectives and raising operational concerns early in the lifecycle.
You will operate in an agile environment, challenging architectures, reviewing HLDs and LLDs, staying aware of new product features, and actively contributing to the Product backlog in the form of “Operations by Design” requirements. You will become a key part of major ongoing and upcoming IoT success stories.
Key accountabilities and decision ownership:
• Always consider and implement the different aspects of continuous integration and deployment to ensure a high level of Ops automation
• Partner with development teams to improve services through rigorous testing and release procedures
• Create sustainable systems and services through automation and uplifts
• Run the production environment by proactively monitoring availability and taking a holistic view of system health
• Develop tools and scripts to automate operational tasks and achieve zero-touch operation
• Improve reliability, quality, and time-to-market of software solutions
• Support and deploy software packages as per Release commitments, ensuring a smooth transition to production
• Take responsibility for Incident management, diagnosing and resolving software problems quickly and efficiently as per agreed SLAs
• Take responsibility for Problem management, driving root cause analysis (RCA) of incidents and implementing preventive actions
• Share knowledge and documentation consistently within the team
Core competencies, knowledge, and experience:
• Good understanding of CI/CD technologies and architecture.
• Excellent problem-solving, troubleshooting, analytical and debugging skills.
• Strong skills in platform integration and software development for cloud-hosted technology
• Strong interest in working in an international, fast-moving, agile and cross-functional environment
• Excellent communication & presentation skills in English (both written and spoken)
• Proven track record of software support with DevOps & CI/CD tools.
• Development & programming background
Must have technical / professional qualifications:
• 3+ years of proven hands-on experience in integrating and supporting end-to-end enterprise solutions.
• Experience in Linux administration and hands-on experience with shell scripting.
• Good knowledge of CI/CD tools and technologies.
• Good experience with Infrastructure as Code (IaC) tools such as Terraform, AWS CloudFormation, etc.
• Good knowledge of infrastructure deployment, management, and operations on public cloud providers (AWS, Azure, or GCP)
• Basic knowledge of AI/ML
• Hands-on experience with relational and NoSQL data stores, such as MariaDB, PostgreSQL, Redis, MongoDB, etc., and with Keycloak for identity and access management
• Good understanding of the Mobile network architecture and protocols
• Good understanding of ITIL and Agile
• BSc or MSc level degree in Software Engineering, Computer Science or Telecommunications
Key performance indicators:
• Operational and Non-Functional Requirements.
• Performance tuning and operational improvements
• Root Cause Analysis