
Job Description

We are looking for a GCP DevOps & Cloud Engineer to support AI/ML projects by designing, implementing, and managing cloud infrastructure on Google Cloud Platform (GCP). The role requires expertise in CI/CD, Kubernetes, Infrastructure as Code (IaC), security, monitoring, and optimizing AI/ML workflows. The ideal candidate will collaborate with data scientists, ML engineers, and software developers to ensure scalable and reliable deployment of AI/ML models.

Key Responsibilities:
GCP Infrastructure & Automation

• Design, deploy, and manage GCP cloud infrastructure for AI/ML applications.

• Implement Infrastructure as Code (IaC) using Terraform, Deployment Manager, or Pulumi.

• Manage and optimize AI/ML workloads using Vertex AI, AI Platform, and BigQuery ML.

CI/CD & MLOps

• Build and maintain CI/CD pipelines for AI/ML model training and deployment using Cloud Build, Jenkins, or GitHub Actions.

• Implement MLOps practices for automated model versioning, deployment, and monitoring.

• Integrate Kubeflow, TensorFlow Extended (TFX), or MLflow for model lifecycle management.

Containerization & Orchestration

• Deploy AI/ML applications using Docker and orchestrate with Kubernetes (GKE).

• Optimize containerized AI workloads for scalability and cost efficiency.

Security & Compliance

• Enforce IAM policies, data encryption, and network security best practices.

• Implement audit logging, monitoring, and access control for AI/ML pipelines.

• Ensure compliance with GDPR, HIPAA, or industry security standards.

Monitoring & Optimization

• Set up monitoring with Google Cloud Operations Suite (formerly Stackdriver), Prometheus, and Grafana.

• Optimize compute, storage, and networking costs for AI workloads.

• Implement logging and alerting strategies for AI model performance monitoring.

AI/ML Ecosystem:

• MLOps Tools: Kubeflow, MLflow, TFX, Vertex AI.

• AI Compute & Storage: AI Platform, BigQuery ML, Cloud TPU, Dataflow, Dataproc.

• Model Deployment: TensorFlow Serving, TorchServe, FastAPI for AI models.



Requirements

Experience:
• 5 years of relevant experience


Preferred Qualifications:

• Google Cloud Certifications: Professional Cloud DevOps Engineer, Professional Cloud Architect, Professional Machine Learning Engineer (nice to have).

• Experience working with AI/ML workloads on Google Cloud.

• Strong knowledge of scaling AI models, optimizing ML training jobs, and managing feature stores.


