Senior DevOps Engineer - Product Group
Job Description
Manage and maintain the production environment, which includes Kubernetes, Linux, Kafka, Elastic Stack, Oracle DB, and microservices built using Java Spring Boot and Angular.
Automate the deployment of new application releases to the production environment.
Implement and maintain robust backup and restore procedures to ensure data protection and fast recovery.
Monitor the production environment, set up alerts, and investigate and resolve issues to maintain high availability and performance.
Perform regular health checks and proactively identify and address potential problems.
Collaborate with development teams to ensure smooth integration of new features and services.
Optimize infrastructure and application performance, identify bottlenecks, and implement solutions to improve efficiency.
Participate in on-call rotations and respond to incidents with a sense of urgency.
Document processes, create runbooks, and share knowledge with the team.
Continuously research and implement new technologies and best practices to enhance the platform.
Personal Skills
Excellent troubleshooting and problem-solving skills, with the ability to identify and resolve complex issues.
Strong written and verbal communication skills to collaborate effectively with cross-functional teams.
Demonstrated ability to work in a fast-paced, agile environment and prioritize tasks based on business needs.
Technical Skills
Experience in a DevOps or Site Reliability Engineering role.
Proficient in managing Kubernetes, Linux, Kafka, Elastic Stack, and Oracle DB environments.
Extensive experience in deploying, monitoring, and maintaining production-grade microservices built with Java Spring Boot and Angular.
Strong understanding of the software development lifecycle, CI/CD, and infrastructure as code.
Hands-on experience with automation tools, such as Ansible, Terraform, or AWS CloudFormation.
Familiarity with monitoring and observability tools, like Prometheus, Grafana, and Elasticsearch.
Experience with cloud platforms, such as AWS, Google Cloud, or Azure.
Knowledge of container orchestration and management platforms.
Familiarity with security best practices and compliance requirements.
Certifications in relevant technologies (e.g., Kubernetes, AWS, Google Cloud).
Education
Bachelor's degree in Computer Science, Software Engineering, or a related field.
Job Location: Cairo, Egypt
Job Role: Information Technology
Years of Experience: Min: 5, Max: 9