Job Description
Rackspace is building up its Professional Services Center of Excellence on Application Performance Monitoring Suites. If you enjoy solving complex business problems and want to help build the next generation of modern applications for our customers, then join us! You will help customers understand the connections between application performance, user experience, and business outcomes, and create amazing customer experiences through modern interpretations of SRE and Observability, using Datadog, New Relic, AppDynamics, or Dynatrace and their suites of products and integrations.

Rackspace enables businesses to accelerate digital transformation through innovative data and integration solutions and tools that help you fix problems quickly, maintain complex systems, and improve code. We believe Datadog, AppDynamics, or New Relic will be large contributors to what we do, and we want talented, creative, and thoughtful individuals to join our team to shape Observability Engineering for our customers.

You Will:
- Work with customers and implement Observability solutions
- Build and maintain scalable systems and robust automation that support engineering goals
- Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health and performance
- Proactively gather and analyze both metric and log data from systems and applications to perform anomaly detection, performance tuning, capacity planning, and fault isolation
- Collaborate with development teams to implement and deploy new features and enhancements, ensuring they meet reliability, security, and performance standards
- Collaborate with team members to document and share solutions
- Maintain a deep understanding of the customer's business as well as their technical environment
- Identify performance bottlenecks and anomalous system behavior, and resolve the root causes of service issues

You Have:
- Bachelor's degree in engineering/computer science or equivalent
- Senior-level experience with Site Reliability Engineering; DevOps; code-level application support and troubleshooting; AWS infrastructure design, implementation, and optimization; and automation for deployment, scaling, and reliability
- Experience with observability tools such as Splunk, Datadog, or SignalFx
- Experience deploying, maintaining, and supporting software applications/services in the AWS ecosystem
- Proactive approach to identifying problems and solutions
- Experience writing code in one or more interpreted languages such as Python, PHP, Perl, Ruby, or Linux shell
- Experience with Terraform or CloudFormation scripting
- Experience with configuration management tools such as Ansible, Chef, or Puppet
- Experience with standard software development best practices and tools such as code repositories (Git preferred)
- Experience executing in an agile software development environment
- Good understanding of pricing/cost models across AWS services, especially compute, storage, and database offerings
- A clear understanding of network and system management solutions
- Excellent organizational and project management skills
- Excellent communication, critical thinking, and analytical skills