Job Description
Fortinet, founded over 20 years ago, has become a driving force in the evolution of cybersecurity and the convergence of networking and security. Our mission is to secure people, devices, and data everywhere.
What You Will Do:
- 1. Reliability & Performance Optimization
- Ensure high availability of services by implementing best practices in monitoring, alerting, and incident management.
- Conduct capacity planning, performance tuning, and load testing to optimize system efficiency.
- Implement self-healing mechanisms to minimize downtime and improve fault tolerance.
- 2. Infrastructure & Automation
- Design, implement, and maintain scalable Kubernetes clusters across multiple environments.
- Automate infrastructure provisioning using Terraform, Helm, or Ansible.
- Manage CI/CD pipelines to streamline deployments and reduce manual interventions.
- 3. Monitoring & Incident Response
- Develop and maintain observability solutions using Prometheus, Grafana, ELK, or OpenTelemetry.
- Set up automated alerting and on-call rotations to ensure proactive issue resolution.
- Perform root cause analysis (RCA) and post-mortems to drive continuous improvements.
- 4. Security & Compliance
- Enforce security best practices, including IAM policies, network security, and container security.
- Ensure compliance with industry standards (SOC 2, ISO 27001, etc.) through automated security checks.
- 5. Collaboration & Documentation
- Work closely with development, security, and operations teams to align reliability goals with business needs.
- Maintain clear and up-to-date runbooks, incident reports, and system documentation.
We Are Looking For:
- 3+ years of experience in SRE, DevOps, or Cloud Infrastructure roles.
- Strong knowledge of Kubernetes, Docker, and container orchestration.
- Proficiency in cloud platforms (AWS, GCP, or Azure) and infrastructure as code tools like Terraform.
- Experience with observability tools (Prometheus, Grafana, Datadog, New Relic, etc.).
- Expertise in CI/CD pipelines (Jenkins, GitHub Actions, ArgoCD, Flux, etc.).
- Proficiency in scripting & automation (Python, Go, Bash).
- Understanding of networking, load balancing, DNS, and security best practices.
- Excellent troubleshooting skills with a focus on incident resolution and post-mortem analysis.
Preferred Qualifications:
- Experience with multi-cluster Kubernetes management and service mesh (Istio, Linkerd).
- Familiarity with GitOps workflows and policy-driven infrastructure.
- Knowledge of machine learning-driven anomaly detection for proactive monitoring.
Working Conditions:
This position requires working from the office full-time; remote work is not available.
Company Culture:
At Fortinet, we foster a culture of innovation, collaboration, and continuous learning. We are committed to creating an inclusive environment where all employees feel valued and respected.
We encourage candidates from all backgrounds and identities to apply. We offer a competitive Total Rewards package to support your overall health and financial well-being, flexible work arrangements, and a supportive work environment. If you aspire to a challenging, enjoyable, and rewarding career journey, we invite you to consider joining us and bringing solutions that make a meaningful and lasting impact to our 660,000+ customers around the globe.