https://bayt.page.link/3K32D6dUfrg6Gomi9

Senior Site Reliability Engineer

Today 2025/07/17
50-99 Employees · Other Business Support Services

Job Description

Role Purpose:

Site Reliability Engineers (SREs) are responsible for keeping all user-facing services and other production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to our environments and the codebase. They specialize in systems, whether networking, the Linux kernel, or a more specific interest in scaling, algorithms, or distributed systems.




Key Accountabilities:


Operations:


•    Maintain the availability of the applications.
•    Use your on-call shift to prevent incidents from ever happening.
•    Run our infrastructure with Chef, Terraform and Kubernetes. 
•    Make monitoring and alerting fire on symptoms rather than on outages, using CloudWatch, Splunk, AppDynamics, Dynatrace, Grafana, Kibana, Datadog and Prometheus.
•    Document every action so your findings turn into repeatable actions, and then into automation.
•    Apply strong analytical skills to improve the product as much as possible.
•    Improve the deployment process to make it as boring as possible. 
•    Design, build and maintain core infrastructure pieces that allow services to scale to support hundreds of thousands of concurrent users.
•    Debug production issues across services and levels of the stack. 
•    Plan the growth of the infrastructure.
•    Think about systems - edge cases, failure modes, behaviors, specific implementations.
•    Perform all possible technical troubleshooting steps on mobile platforms and back-end systems to resolve problems and service requests, including testing and QA of fixes.
•    Determine if issues are application or software (triage).
•    Carry out regular deployment activities for CMS enhancements and new releases.
•    Develop scripts to automate system activities.
•    Monitoring configuration
•    User management configuration 
•    CMS (application) Management and configuration
•    Troubleshoot and help resolve infrastructure and connectivity issues (DNS, proxy, firewalls) remotely for Vodafone markets (e.g., Vodafone Portugal, Greece, Ghana).
•    Use the existing team applications and tools (knowledge base, documentation database) to resolve incidents.


Quality:



•    Ensure that the technical solution and root cause are clear and logical
•    Prioritizing and managing several open cases at one time based on agreed Service Levels. 
•    Ensure that all incidents are solved within the agreed SLA
•    Update the technical documentation and the team knowledge base.



SLA:
   
•    Ensure that the technical solution and root cause are clear and logical
•    Prioritizing and managing several open cases at one time based on agreed Service Levels. 
•    Ensure that all incidents are solved within the agreed SLA
•    Update the technical documentation and the team knowledge base
 




PERSON SPECIFICATION:




•    Qualification / Experience:


o    Bachelor's degree in engineering, computer science or a related discipline
o    1-2 years of IT support, system administration, or web/software development experience


•    Technical skills:

o    UNIX/Linux administration
o    Security techniques and technologies
o    Good knowledge of AWS services (EC2, S3, API Gateway, Route 53, CloudWatch, CloudTrail)
o    Good knowledge of DevOps tools such as Jenkins and Ansible.
o    Strong knowledge of Nginx, HAProxy, Docker, Kubernetes, Terraform, or similar technologies.
o    Understanding of and experience with cloud-based systems such as PaaS, containerization and microservices.
o    Development skills (shell scripting, Perl, Python, Ruby on Rails, Java, JavaScript) would be desirable.
o    Working knowledge of monitoring, alerting and log streaming systems (New Relic, Nagios, AppDynamics, Dynatrace, Splunk and/or Kibana)
o    Familiarity with configuration management systems such as Chef.
o    Ability to use GitLab
o    BMC Remedy experience is desirable
o    ITIL Foundation certification is desirable.
o    Ability to lead the investigation with third parties
o    Technical knowledge of IT infrastructure technologies and connectivity techniques
o    Continuous integration
o    Workflow best practices
o    Understanding of and experience with cloud-based systems such as PaaS or AWS.
o    Development skills (shell scripting, Perl)
o    Hands-on experience with revision control systems such as SVN and Git
o    Working knowledge of monitoring and alerting systems (New Relic, Nagios)
o    Application deployment management and risk mitigation 
o    Knowledge of application performance and monitoring tools such as New Relic or AppDynamics would be desirable
o    Ability to collaborate with third parties
o    Understanding of the Cloud paradigm.


•    Personal skills:

o    Excellent communication skills and presentation skills.
o    Self-motivated and detail-oriented.
o    Willing to work on a shift basis and on call.
o    Confident negotiation and online documentation (throughout the negotiation) in English 
o    Have an urge to document all the things so you don't need to learn the same thing twice.
o    Have an enthusiastic, go-for-it attitude. When you see something broken, you can't help but fix it.
o    Have an urge for delivering quickly and iterating fast.


•    Business and Managerial skills:

o    Good understanding of the global cooperative team environment
o    Understanding of the Telecommunications Market
o    Able to work under stress and multitask; dynamic and customer-oriented
 




#VOIS #BeUnrivalled #Createthefuture


