Amazon Lab126 is an inventive research and development company that designs and engineers high-profile consumer electronics. Lab126 began in 2004 as a subsidiary of Amazon.com, Inc., originally creating the best-selling Kindle family of products. Since then, Lab126 has produced innovative devices like Fire tablets, Fire TV, Amazon Echo, and Amazon Echo Show. The Amazon Devices group delivers delightfully unique Amazon experiences, giving customers instant access to everything, digital or physical.
Key job responsibilities
As the Systems Development Manager, you will lead our critical cloud infrastructure solution, with the following responsibilities:
Core Leadership:
• Deliver highly available, reliable cloud infrastructure for tier-1 cloud services that serve millions of customers
• Spearhead a high-performing team of SREs and DevOps engineers
• Partner with Core Technology, Platform and Application engineering teams to deliver excellence
SRE Focus:
• Design and implement SLOs, SLIs, and error budgets
• Drive a data-driven approach to system reliability
• Establish robust incident management and response protocols
• Lead continuous improvement initiatives
• Reduce toil through automation and engineering excellence
Cloud & DevOps Excellence:
• Orchestrate CI/CD pipelines and deployment automation
• Optimize AWS cloud infrastructure using IaC principles
• Implement robust monitoring, logging, and alerting systems
• Manage cloud security and compliance requirements
Platform Optimization:
• Champion cost optimization
• Own capacity planning and resource utilization
• Ensure high availability and performance at scale
You'll be the driving force behind our cloud infrastructure, ensuring seamless performance, minimal latency, and maximum availability for services that impact millions of users daily.
About the team
The Device OS team plays a central role in creating innovative devices and services at Lab126. The team is responsible for board bring-up, low-level software, core operating system architecture, innovative framework feature development, associated cloud services, and the end-to-end system functions that bring these devices to life. The software built by the Device OS team runs on all Amazon consumer electronics devices. The team is also the center of excellence for developing cross-OS software stacks and solutions that help partner teams, including developers, drive software convergence and accelerate product development.
Qualifications
• 12+ years of software engineering experience and 5+ years of engineering management experience
• Experience leading the design, automation, deployment, and support of large-scale infrastructure
• Proven track record in SRE/DevOps leadership
• Deep expertise in AWS services and cloud architecture
• Experience with observability tools and practices
• Experience in programming with at least one modern language such as Python, Ruby, Golang, Java, C++, C#, Rust
• Experience with Linux/Unix
• Experience with CI/CD pipelines and build processes
• Strong background in automation and Infrastructure as Code
• Proven track record of building and mentoring high-performing SRE/Cloud DevOps teams
• Experience supporting SRE/Cloud DevOps for global, highly scaled, high-availability cloud services
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.