Overview

We are seeking a seasoned Site Reliability Engineer (SRE) to join our Network Operations Command Centre. You will play a pivotal role in ensuring the reliability, performance, and scalability of our critical network infrastructure. The ideal candidate will have a strong background in network engineering, automation, and incident response, combined with a passion for building and maintaining highly available systems.

Responsibilities

Monitor and Respond: Proactively monitor network health and performance metrics, identify potential issues, and respond swiftly to incidents.
Automate and Optimize: Develop and implement automation tools, scripts, and workflows to streamline network operations, improve efficiency, and reduce manual intervention.
Collaborate and Troubleshoot: Work closely with network engineering, security, and other teams to troubleshoot complex network problems and implement solutions.
Capacity Planning: Analyze network traffic patterns and usage trends to forecast capacity needs and ensure sufficient resources are available to meet demand.
Document and Communicate: Maintain accurate and up-to-date documentation of network configurations, procedures, and incidents.
Participate in On-Call Rotation: Provide 24/7 on-call support for critical network incidents.

Qualifications

Experience: 9+ years of experience in network engineering or operations, with a focus on SRE principles and practices.
Technical Skills: Deep understanding of network protocols (TCP/IP, BGP, OSPF, etc.). Experience with network monitoring and management tools such as ThousandEyes, NetBrain, and SevOne. Proficiency in automation frameworks.
Problem-Solving: Strong analytical and problem-solving skills, with the ability to identify and resolve complex network issues.
Communication: Excellent written and verbal communication skills, with the ability to collaborate effectively with cross-functional teams.
Adaptability: Ability to thrive in a fast-paced, dynamic environment and adapt to changing priorities.