Job Description
Job DescriptionPurpose of the role
To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them.
Accountabilities
- Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.
- Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring.
- Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.
- Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning.
- Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure smooth and efficient operations.
- Stay informed of industry technology trends and innovations, and actively contribute to the organization's technology communities to foster a culture of technical excellence and growth.
Assistant Vice President Expectations
- Consult on complex issues; providing advice to People Leaders to support the resolution of escalated issues.
- Identify ways to mitigate risk and developing new policies/procedures in support of the control and governance agenda.
- Take ownership for managing risk and strengthening controls in relation to the work done.
- Perform work that is closely related to that of other areas, which requires understanding of how areas coordinate and contribute to the achievement of the objectives of the organisation sub-function.
- Collaborate with other areas of work, for business aligned support areas to keep up to speed with business activity and the business strategy.
- Engage in complex analysis of data from multiple sources of information, internal and external sources such as procedures and practises (in other areas, teams, companies, etc).to solve problems creatively and effectively.
- Communicate complex information. 'Complex' information could include sensitive information or information that is difficult to communicate because of its content or its audience.
- Influence or convince stakeholders to achieve outcomes.
All colleagues will be expected to demonstrate the Barclays Values of Respect, Integrity, Service, Excellence and Stewardship – our moral compass, helping us do what we believe is right. They will also be expected to demonstrate the Barclays Mindset – to Empower, Challenge and Drive – the operating manual for how we behave.
Join us as a Site Reliability Engineer at Barclays where you'll spearhead the evolution of our digital landscape, driving innovation and excellence. You'll harness cutting-edge technology to revolutionise our digital offerings, ensuring unapparelled customer experiences.
You may be assessed on the key critical skills relevant for success in role, such as experience with programming languages in linux administrator , scripting language and Automation tools Ansible as well as job-specific technical skills.
- Experience in Linux administration
- Proficiency in one or more programming languages (Python, Ruby, Bash, PowerShell, PHP, etc)
- Proficiency in unix + scripting
- Strong understanding of Automation Tools – Ansible, Terraform or equivalent
- Experience working with container and orchestration (Docker, Kubernetes).
- Hands-on experience with CI/CD pipelines and tools (Git, GitLab, Jenkins)
- Familiarity with monitoring tools – Prometheus, Grafana, InfluxDB, Kapacitor
- Good understanding of system architecture, networking, and performance optimization
- Experience working with REST API development and Integration, Database management / Query language.
This role is based in Pune.