Job Requisition ID #
Position Overview
Want to help make a better world? As a Principal Site Reliability Engineer (SRE) at Autodesk, you can do just that. How is this possible? As a member of the team responsible for operating critical customer-facing services, you will have the opportunity to contribute to and drive improvements in the operation of mission-critical components that make up, and are dependencies of, hundreds of Autodesk desktop, mobile, and web applications. These services are key business enablers serving millions of customers every day. The responsibilities of this role are part of the foundation for attaining and maintaining our customers' trust to build their businesses around Autodesk’s commercial offerings.
As a Principal Site Reliability Engineer, you will serve as a primary point of contact responsible for building an SRE practice focused on the overall health, availability, performance, and capacity of one or more of our production services. In addition, you will be responsible for building and operating our “Gameday” practices, which ensure our systems operate as designed and identify opportunities for continuous improvement. You will also partner with development teams through the various stages of development to ensure systems are designed with both function and scaled operations in mind. The ideal candidate will be passionate about operations and bring an “Automation first” mindset to drive scale.
Responsibilities
Build and maintain the site reliability practice, including mentoring SREs in the organization
Scale and enhance our gameday practice by growing the scope and complexity of a critical program to validate and improve our reliability and customer trust.
Serve as a primary point of contact responsible for the overall health, performance, and capacity of one or more of our services
Work closely with development teams to ensure that platforms are designed with "operability" in mind.
Gain deep knowledge of both our complex internally developed applications and enterprise-class services.
Operate and maintain highly available production systems
Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale Linux and Windows environment
Participate in a 24x7 rotation for second-tier escalations.
Continuously look for opportunities to automate changes, and implement that automation
Keep supported services compliant with company and regulatory requirements, including but not limited to security, privacy, SOC 2, and FedRAMP
Implement and improve monitoring and alerting
Build, automate, and improve observability dashboards to provide better visibility into the operational aspects of the systems
Function well in a fast-paced, rapidly-changing environment.
Minimum Qualifications
B.S. or higher in Computer Science or other technical discipline, or related practical experience.
10+ years of experience in the following areas:
Commercial cloud experience building and maintaining AWS and/or Azure offerings for large-scale enterprises
Compute: Cloud-based Unix and/or Windows administration
Storage: Cloud-based storage provisioning and administration
Development Languages: Working knowledge of development and scripting languages
Database Technologies: Cloud-based database administration
Monitoring/Logging: Tools, techniques, and configuration
Knowledge of Information Security Best Practices
Excellent written and verbal communication skills
Preferred Qualifications:
8+ years of hands-on experience with several of these example technologies:
Compute: N and N-2 cloud-based Windows and Linux operating systems
EC2, ElastiCache, CloudFront, Auto Scaling, containers, API gateways
Storage: AWS S3, EFS, EBS
Development Languages: Java, Python, Node.js, Perl, JavaScript
Database Technologies: MSSQL, MySQL, AWS Aurora, AWS DynamoDB, AWS Postgres
Networking: Load balancers (ALB/ELB), SSL/TLS, DNS, firewalls
Monitoring: Splunk, Grafana, Dynatrace, Datadog, LogicMonitor
Scripting languages: Python, PowerShell, Bash; specifically for systems automation
Experience in 24x7 support of highly available production systems, including keeping stakeholders informed
Keen interest in learning from incidents and driving improvements based on them
Strong interpersonal communication skills (including listening, speaking, and writing) and ability to work well in a diverse, team-focused environment with other SREs, Engineers, Operators, Product Managers, etc.
Passion for running and improving customer-facing systems with a high degree of availability (four 9s)
Learn More
About Autodesk
Welcome to Autodesk! Amazing things are created every day with our software – from the greenest buildings and cleanest cars to the smartest factories and biggest hit movies. We help innovators turn their ideas into reality, transforming not only how things are made, but what can be made.
We take great pride in our culture here at Autodesk – our Culture Code is at the core of everything we do. Our values and ways of working help our people thrive and realize their potential, which leads to even better outcomes for our customers.
When you’re an Autodesker, you can be your whole, authentic self and do meaningful work that helps build a better future for all. Ready to shape the world and your future? Join us!
Salary transparency
Diversity & Belonging
We take pride in cultivating a culture of belonging and an equitable workplace where everyone can thrive. Learn more here: https://www.autodesk.com/company/diversity-and-belonging
Are you an existing contractor or consultant with Autodesk?
Please search for open jobs and apply internally (not on this external site).