As an AI Infrastructure & MLOps Engineer at Müller’s Solutions on a 6-month contract, you will work in a role that is primarily operations-focused (90%), with hands-on involvement in the implementation, configuration, and setup of AI infrastructure and MLOps workflows.
You will play a key role in managing, operating, and guiding the deployment of a strategic AI environment, working closely with the customer as a technical advisor and hands-on engineer.
What are the role's responsibilities?
Operate and maintain AI infrastructure and MLOps platforms in a production environment.
Monitor, manage, and troubleshoot Kubernetes-based AI workloads.
Plan and execute acceptance testing for AI infrastructure and platforms.
Ensure stability, performance, and availability of AI systems.
Support day-to-day operational tasks across compute, storage, and networking layers.
Install and configure the NVIDIA Enterprise AI Stack (NVAI).
Configure and manage MLOps platforms such as Kubeflow and MLflow.
Assist in setting up end-to-end AI workflows, including data pipelines.
Support the initial implementation phase of the AI environment.
Act as a technical guide and advisor to the customer during the early stages of their AI adoption.
What do you need to fit this role?
Technical Requirements
AI / MLOps Stack
Proficiency with the NVIDIA Enterprise AI Stack
Familiarity with Ubuntu Linux
Experience with Kubernetes
Knowledge of Kubeflow / MLflow
Experience with QFLOW (an open-source AI data pipeline management tool)
Programming & Automation
4–6 years of practical experience in:
Python
Jupyter Notebook / JupyterLab
Competence in writing, testing, and maintaining operational scripts and AI workflows.
Infrastructure Experience
Practical experience with enterprise infrastructure, encompassing:
Dell PowerScale (5 nodes)
XE Server (1 node)
Dell R570 Servers (5 nodes)
Dell Network Switches (2 switches)
GPU-based AI servers (in a small-scale environment)
Environment Overview
Initial implementation of AI
Compact configuration:
1 GPU server
1 PowerScale
5 control plane servers
Opportunity to shape best practices from the ground up
Nice to have for this role:
• Familiarity with data frameworks like Apache Spark or Hadoop for data processing.
• Understanding of ML model monitoring and logging practices to ensure system reliability.
• Experience with security best practices in AI systems.