Job Description
SWATX is seeking an MLOps Engineer to deploy, manage, and optimize machine learning models in pre-production and production environments, on both classical VM-based workloads and containerized environments. The role requires extensive experience with Dataiku and MLflow.

Key Responsibilities:
Develop and maintain ML pipelines using Dataiku and MLflow.
Apply DevOps experience to CI/CD deployment pipelines and processes, primarily on Azure DevOps; Jenkins experience is nice to have.
Operationalize compute workloads on classical virtual machines (specifically Red Hat Linux or Oracle Linux), standalone Docker, and Kubernetes environments, with a preference for OpenShift experience; also deploy and operate Windows-based workloads (IIS, Windows services…).
Monitor model performance metrics and implement strategies for continuous improvement and for avoiding model drift.
Collaborate with data scientists and engineers to ensure model scalability and reliability.
Work with observability platforms such as Prometheus or Grafana.
Implement best practices for version control, continuous integration, and continuous deployment (CI/CD) for ML models.
Optimize, deploy, and run small local LLMs on CPU-based environments, ensuring efficient inference and resource utilization.

Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field.
3+ years of experience in machine learning operations or a related role.
Experience with on-premises compute landscapes, especially VMware-based compute environments, and local Saudi cloud platforms (e.g., Nournet, STC, and others).
Dataiku certification is preferred.
Additional certifications in the administration of compute workloads, such as CKA, are a plus.

Skills:
Extensive experience with the Dataiku platform, especially MLOps, Automation and API nodes, and experience with MLflow.
Knowledge of small LLMs and their operationalization and tuning.
Proficiency in Python, Docker, Kubernetes, and MLOps tools (MLflow).
Knowledge of ML frameworks (e.g., TensorFlow, PyTorch).
Strong problem-solving and troubleshooting skills.
Experience with .NET and C# is a plus.
Experience troubleshooting Cloudera environments and Spark workloads, as well as Hadoop, Hive, or Impala, is a plus.
OpenShift experience is a plus.