Job Description
Position Overview:
We are looking for a dedicated and skilled Site Reliability Engineer (SRE) to join our team at Programmers Force. As an SRE, you will be responsible for ensuring the reliability and performance of our applications and services through automation, best practices, and proactive monitoring. You will work closely with development teams to design, implement, and maintain reliability engineering solutions that enhance application performance and availability.

Key Responsibilities:
- Implement and maintain monitoring, alerting, and incident response systems to ensure application reliability and performance.
- Develop and enhance infrastructure through automation tools, improving deployment pipelines and system usability.
- Partner with development teams to ensure design for reliability and operational efficiency.
- Troubleshoot and resolve complex production issues with a focus on root cause analysis.
- Continuously review system metrics and performance data to identify areas for improvement.
- Design and implement disaster recovery and failover solutions.
- Participate in on-call rotation and provide support for production systems.
- Contribute to the creation and optimization of operational documentation.