Job Description
The Role
As an Engineer II on the Scale & Performance team, you will play a critical role in ensuring the scalability, performance, and reliability of HashiCorp’s cloud and enterprise offerings. Your work will be central to enhancing system resilience, optimizing performance at scale, and ensuring HashiCorp’s products deliver high availability in dynamic cloud environments. Drawing on your experience in performance engineering, systems engineering, reliability engineering, or a related field, you will lead efforts to identify performance bottlenecks and to address and mitigate operational challenges before they impact our customers. Your expertise in load testing, performance analysis, and system hardening will ensure that our services meet the highest standards of scale and performance excellence.
You’ll have the opportunity to dive deep into the architecture of HashiCorp’s products, including both our cloud and enterprise offerings. You’ll take ownership of building and maintaining an advanced automation framework that powers ephemeral, scalable environments, enabling controlled scaling efforts and performance regression testing.
Your work will directly impact how we validate and optimize performance across our systems. From spinning up environments to scaling them dynamically and tearing them down on demand, you’ll own the end-to-end lifecycle of our test engines. Beyond that, you’ll play an important role in analyzing results, creating insightful dashboards, and delivering actionable reports to help teams identify and resolve performance bottlenecks and throttling issues.
What you’ll do (responsibilities)
- Implement best practices for system reliability, including proactive identification of potential failure points and the development of automated mitigations.
- Design and execute comprehensive performance testing strategies to identify performance bottlenecks and scalability limits across our cloud products.
- Work with the engineering teams to identify potential application and infrastructure bottlenecks and suggest changes.
- Work closely with engineering and product teams to integrate scale and performance readiness into the development lifecycle, enhancing product stability and user satisfaction.
- Build and refine tools and frameworks for automated testing, environment simulation, and incident reproduction, reducing manual effort and increasing test coverage.
- Conduct in-depth analysis of testing results, documenting findings and making actionable recommendations for system enhancements.
- Drive systemic improvements to the products by introducing chaos testing and partnering with product development teams.
- Share your knowledge and expertise with team members, fostering a culture of learning and continuous improvement.
- Develop and implement disaster recovery and backup strategies to ensure data integrity and system resilience.
What you’ll need (basic qualifications)
- 4+ years of experience in performance engineering, systems engineering, reliability engineering, or non-functional testing roles, with a focus on performance testing, load testing, or system scalability.
- Strong programming skills in Python or Golang, and exposure to scripting languages such as JavaScript or shell.
- Experience with version control systems such as Git.
- Strong experience with performance testing tools such as k6, Artillery, Vegeta, Locust, or similar tools for deriving key performance metrics for a product.
- Proven track record of leading successful performance testing and optimization initiatives in cloud and on-prem environments.
- Experience in creating and managing test environments for automated testing.
- Experience in creating CI/CD pipelines and maintaining quality gates for system testing.
- Understanding of monitoring and observability tools such as Datadog or Prometheus, and the ability to build dashboards whose metrics accurately reflect system performance, load breakpoints, and regressions.
- Exposure to cloud technologies (AWS, Azure, or GCP) and container technologies such as Nomad or Kubernetes, and/or experience working in a hybrid cloud environment.
- Effective communication and collaboration skills, capable of working with cross-functional teams and articulating technical concepts to diverse audiences.
What's nice to have (preferred qualifications)
- You have experience using HashiCorp products (Terraform, Packer, Waypoint, Nomad, Vault, Boundary, Consul).
- Experience with JavaScript development or with any JavaScript-based test framework is a plus.
- Experience in driving systemic improvements through chaos engineering is a plus.
“HashiCorp is an IBM subsidiary which has been acquired by IBM and will be integrated into the IBM organization. HashiCorp will be the hiring entity. By proceeding with this application you understand that HashiCorp will share your personal information with other IBM subsidiaries involved in your recruitment process, wherever these are located. More information on how IBM protects your personal information, including the safeguards in case of cross-border data transfer, is available here: link to IBM privacy statement.”