Responsibilities:
● Work collaboratively with the software engineering team to deploy and operate
systems.
● Help automate and streamline operations and processes using automation tools such as
Ansible and Terraform.
● Build and maintain tools for deployment, monitoring, and operations.
● Monitor and optimize performance using tools such as Grafana, Prometheus, Telegraf,
and InfluxDB.
● Troubleshoot and resolve issues in development, test, and production environments.
● Fine-tune servers based on CPU, memory, and network utilization.
Requirements:
● Strong understanding of Infrastructure as Code tools such as Terraform, Ansible, Puppet,
Chef, or an equivalent.
● Strong understanding of scripting languages such as Bash, Python, and Perl.
● Knowledge of IT operations and their best practices.
● Experience with Google Cloud, Amazon Web Services, or IBM SoftLayer, and with
managing clusters and deployments at very large scale.
● Experience with Identity and Access Management (IAM).
● Experience with containerization tools such as Docker.
● Experience with Linux administration in production environments.
● Strong understanding of current network protocols, architecture, and design.
● Experience with various monitoring tools and concepts.
● Experience with SQL and MySQL. NoSQL experience is a plus.
● Excellent verbal and written communication skills.