Job Description

About Hasura

Hasura is a venture-backed open-source technology company that makes data instantly accessible over a realtime API, enabling developers to build and ship modern apps and APIs faster. Our flagship product, Hasura DDN (Data Delivery Network), is a globally distributed and always-available network of API and data connectivity servers for blazing-fast and secure delivery of realtime data over GraphQL or REST API. Upcoming: In addition to our API access layer, our AI lab will be releasing agentic data access as a next-gen capability on Hasura DDN for AI applications.

About the Role

We are seeking an experienced Engineering Manager to lead our Platform Team, which serves as the backbone of Hasura's cloud infrastructure and reliability efforts. As the Engineering Manager of the Platform Team, you will be responsible for managing a global team of Site Reliability Engineers (SREs) and Infrastructure/Platform engineers. Your primary focus will be ensuring the reliability, scalability, and performance of our cloud systems while providing essential platform services and tooling to other engineering teams.

The engineering organization at Hasura operates with a great deal of product ownership, with each team closely aligned with top-level business objectives. In this position, you will be an important partner in driving key business and product goals. You will work on architecting complex infrastructure features on the core product and help execute the vision of making data access 10x easier by building easy-to-use, planet-scale, low-latency, reliable cloud services.

What the role will involve:

Solving hard problems: Architect solutions for complex problems, independently and collaboratively, at both the low level and the high level, ensuring scalability, maintainability, and performance.
Product thinking: Understand ambiguous or loose customer (i.e., developer and enterprise) requirements and formulate solutions that align strategically with the product.
Mentorship: Provide guidance and mentorship to team members during technical problem solving and code reviews.
Implementing engineering best practices: Identify opportunities to strengthen engineering best practices and improve overall software production and quality.
Collaborating with stakeholders: Foster strong collaboration and communication with key stakeholders across the organization, including fellow engineers, managers, and executives.

Key Responsibilities

Team Leadership: Lead, mentor, and grow a distributed team of SREs and Platform engineers, fostering a culture of continuous improvement and innovation.
Reliability Engineering: Implement and maintain SRE best practices, including error budgets, SLOs, and SLIs, to ensure the highest levels of system reliability and performance.
Infrastructure Management: Oversee the design, implementation, and maintenance of Hasura DDN's infrastructure, ensuring it meets the demands of a globally distributed, high-performance system.
Scalability and Performance: Drive initiatives to continuously improve the scalability and performance of our microservices architecture, implementing best practices for distributed systems.
Incident Management: Establish and refine incident response processes, including post-mortem analyses and implementation of lessons learned.
Automation and Tooling: Lead efforts to develop and improve automation tools and processes that enhance the efficiency of both the Platform team and other engineering teams.
Cross-team Collaboration: Work closely with other engineering teams to understand their platform needs and deliver solutions that accelerate their development processes.
Capacity Planning: Develop and maintain capacity models to ensure our infrastructure can handle growth and peak loads efficiently.
Security and Compliance: Collaborate with the security team to implement and maintain robust security practices across our infrastructure.
Monitoring and Observability: Oversee the implementation and improvement of monitoring and observability solutions to provide real-time insights into system health and performance.
Cost Optimization: Continuously analyze and optimize infrastructure costs while maintaining high performance and reliability standards.
Technical Strategy: Contribute to the overall technical strategy of Hasura, particularly in areas related to infrastructure, reliability, and scalability.

Requirements

5+ years of software engineering experience, with at least 2 years in a leadership role managing SRE or Platform engineering teams
Deep understanding of cloud infrastructure, preferably with hands-on experience in AWS, GCP, or Azure
Strong knowledge of containerization and orchestration technologies, particularly Kubernetes