A talented and motivated ML Ops Engineer is needed to design, implement, and maintain the infrastructure supporting the deployment, monitoring, and scaling of machine learning models in production. This role works closely with data scientists, AI researchers, and software engineers to ensure AI solutions are robust, reliable, and scalable.
Key Responsibilities
Model Deployment & Integration
Collaborate with data scientists and engineers to integrate machine learning models into production.
Design and maintain end-to-end ML workflows, including data pipelines, model training, deployment, and monitoring.
Optimize model performance, scalability, and reliability in real-world applications.
Infrastructure & Automation
Develop and automate CI/CD pipelines for machine learning models.
Build and maintain infrastructure for large-scale data storage, model deployment, and real-time analytics.
Work with cloud platforms (AWS, Azure, GCP) for model deployment and orchestration.
Implement monitoring systems to track model performance, detect drift, and ensure data integrity.
Performance Monitoring & Troubleshooting
Ensure model reliability and efficiency in production environments.
Troubleshoot and resolve ML model and deployment issues.
Stay updated on ML Ops best practices and emerging technologies to enhance workflows.
Required Skills & Qualifications
Bachelor's or Master's degree in Computer Science, Engineering, or Data Science.
Proven experience deploying and maintaining machine learning models in production.
Expertise in ML Ops tools (MLflow, Kubeflow, TFX, Seldon).
Strong proficiency in Python and experience with ML libraries (TensorFlow, PyTorch, scikit-learn).
Hands-on experience with cloud platforms (AWS, Azure, GCP) and containerization (Docker, Kubernetes).
Familiarity with CI/CD tools (Jenkins, GitLab CI, CircleCI).
Knowledge of data storage solutions (SQL and NoSQL databases, cloud storage).
Experience in model monitoring, performance tuning, and troubleshooting in production.
Understanding of version control systems (Git) and team collaboration tools.
Preferred Qualifications
Experience with infrastructure automation tools (Terraform, Ansible).
Knowledge of DevOps principles and best practices.
Background in managing and deploying models on large-scale distributed systems.
Familiarity with AI ethics, explainability, and fairness in ML models.
Experience working in fast-paced, research-driven environments.
Why Join?
Work on cutting-edge AI applications with a team of industry experts.
Contribute to real-world AI solutions that drive innovation.
Access to state-of-the-art tools and technologies.
Competitive salary, benefits, and career development opportunities.
A dynamic, collaborative environment that fosters personal and professional growth.
For those passionate about scaling machine learning models and optimizing AI workflows, this role offers an exciting opportunity to make an impact.