Job Summary:
We are seeking a highly skilled Platform Architect with a strong focus on DevOps and Site Reliability Engineering (SRE) practices. The ideal candidate will design and implement scalable, secure, and high-performing cloud platforms, enabling teams to deliver software efficiently and reliably. This role involves collaborating with development, testing, operations, and security teams to build and optimize infrastructure and processes that align with business goals, deploy code releases, and ensure that continuous integration and delivery (CI/CD) pipelines are efficient, reliable, and secure.
Key Responsibilities:
Platform Design and Architecture:
- Design scalable, fault-tolerant, and cost-efficient cloud architectures.
- Define and implement platform strategies that support DevOps and SRE best practices.
- Evaluate and integrate container orchestration systems like Kubernetes and service mesh architectures.
Reliability Engineering:
- Optimize cloud resource utilization and implement cost-saving strategies.
- Oversee security and compliance for cloud and platform environments.
- Drive the adoption of immutable infrastructure and containerization best practices.
DevOps Strategy and Implementation:
- Develop and implement DevOps strategies to improve deployment frequency, lead time for changes, and mean time to recovery (MTTR).
- Design and architect CI/CD pipelines for automated deployment of applications and infrastructure changes.
- Implement infrastructure as code (IaC) practices using tools such as Terraform, CloudFormation, or Ansible to automate provisioning and configuration management.
- Lead efforts to containerize applications using Docker and orchestrate them with Kubernetes or similar tools.
Automation and Tooling:
- Identify and implement automation opportunities throughout the development and operations lifecycle.
- Develop scripts and automation tools to streamline operational processes and tasks.
- Implement and maintain monitoring, logging, and alerting tools to detect infrastructure and application issues proactively.
Cloud Infrastructure Management:
- Manage cloud environments (AWS, Azure, Google Cloud) and leverage cloud-native services to optimize infrastructure performance and cost.
- Ensure security, scalability, and reliability of cloud infrastructure deployments.
- Implement best practices for cloud security, including IAM policies, encryption, and network security configurations.
Collaboration and Team Support:
- Act as a technical advisor to cross-functional teams, ensuring alignment with platform goals.
- Work closely with development teams to integrate CI/CD pipelines into the software development lifecycle (SDLC).
- Collaborate with operations teams to manage and monitor production environments and resolve incidents promptly.
- Provide technical guidance and support to development and operations teams on DevOps practices, tools, and technologies.
Continuous Improvement:
- Stay current with industry trends and advancements in DevOps, cloud computing, and automation.
- Identify opportunities for process improvement and optimization of CI/CD pipelines.
- Conduct post-implementation reviews and retrospectives to drive continuous improvement.
Training and Documentation:
- Develop and maintain documentation for DevOps processes, procedures, and best practices.
- Provide training and mentorship to team members on DevOps tools, practices, and methodologies.
Qualifications:
Education and Experience:
- Bachelor’s degree in Information Technology, Computer Science, or a related field.
- Minimum of 8 years of experience in DevOps, SRE, software development, or systems administration roles.
- Proven track record of designing and implementing CI/CD pipelines and automating infrastructure management.
Technical Skills:
- Strong knowledge of DevOps principles and practices, including continuous integration, continuous delivery, and continuous deployment.
- Proficiency in CI/CD tools such as Jenkins, GitLab CI/CD, or CircleCI.
- Experience with infrastructure as code (IaC) tools such as Terraform, CloudFormation, or Ansible.
- Hands-on experience with containerization technologies and orchestration tools (e.g., Docker, Kubernetes).
- Familiarity with cloud platforms (AWS, Azure, Google Cloud) and cloud-native services.
Soft Skills:
- Excellent communication and interpersonal skills.
- Strong leadership and team collaboration abilities.
- Analytical mindset with problem-solving skills.
- Ability to work effectively in a fast-paced, dynamic environment.
- Strong organizational and multitasking abilities.
Certifications:
- Relevant certifications such as AWS Certified DevOps Engineer, Azure DevOps Engineer Expert, Certified Kubernetes Administrator (CKA), or similar are highly desirable.
- Additional certifications in ITIL, Agile/Scrum, or other relevant areas are a plus.