Datacentre Operations Engineer

Company: Radiant
Job Type: Full Time

Job Description - Datacentre Operations Engineer

About Us

We’re a fast-growing GPU-as-a-Service provider, delivering scalable, high-performance compute infrastructure purpose-built for AI and HPC workloads. Operating across global data centres, we run mission-critical environments where uptime, throughput, and ultra-low latency are non-negotiable.

Role Overview

We’re looking for a qualified, experienced Datacentre/Hardware Engineer to run our multi-million-dollar HPC infrastructure based in Dallas-Fort Worth, US. You’ll be well versed in managing and optimising datacentres, dealing promptly with hardware failures, optimising environmental performance, and deploying new hardware and services on a 24x7x365 basis. You’ll be hands-on with high-performance HPC compute and will operate with the utmost diligence, professionalism and focus to ensure the equipment underpinning our services operates at peak performance.

Key Responsibilities

  • Troubleshooting and Support: Quickly diagnose and resolve hardware and network issues to maximise uptime.

  • Respond to critical hardware alerts via our monitoring and observability platform, and contribute to ongoing service improvement to strengthen our monitoring capability.

  • RMA and Support: Manage vendor relationships, handling RMAs and support requests within Ori’s Service Level Objectives (SLOs) to meet customer contract SLAs.

  • Data Center Management: Guide data center acquisition, setup, and ongoing maintenance, fostering compliance and leveraging strong vendor partnerships.

  • Fully own hardware assets from purchase and delivery through lifecycle management and disposal, including asset management within Ori’s CMDB system.

  • Hardware Installation and Maintenance: Deploy and maintain HPC and AI hardware for uninterrupted operations, including performing low-level system maintenance such as hardware troubleshooting, firmware updates, and replacement of components as needed.

  • Datacenter Environment Technologies: Oversee cooling, power distribution, and other critical data center technologies to maintain high operational standards.

  • Capacity Planning and Resource Allocation: Support strategic planning to align infrastructure capabilities with current and projected demands.

  • Develop and maintain datacentre/hardware management SOPs, ensuring continual alignment with Ori’s governance and compliance requirements.

  • Apply ITSM frameworks: Incident, Major Incident, Change Management, and service improvement.

  • Operate and support services 24x7x365 for production environments, including participation in the on-call rotation.

  • Contribute to incident postmortems and root cause analysis, document learnings, and automate remediations.

  • Mentor junior engineers and act as an operational requirements consultant to other departments.

  • Communicate technical decisions clearly to non-technical stakeholders and customers

  • Uphold a culture of: do, document, automate

  • Willing to cross-train and upskill in Infrastructure/Platform SRE practices.

  • Willing to travel across North America to support future datacentre onboarding and deployments.

Essential Skills & Experience

  • Degree in Computer Science, or 10 years of industry experience.

  • 3+ years of experience in data center operations, HPC, or related roles.

  • Proven track record working with NVIDIA GPU-based HPC systems or equivalent, high-performance storage, and networking.

  • Expertise in hardware installation, network configuration, and low-level system maintenance, including hardware troubleshooting and firmware management.

  • Knowledge of data center environment technologies, including cooling and power distribution.

  • Experience in data center design, greenfield deployments, and operations.

  • Strong understanding of hardware and spares management, with the ability to handle RMAs and support cases within defined SLOs to meet SLA requirements.

  • Solid understanding of HPC and AI workloads.

  • Strong problem-solving abilities and the resilience to thrive in a fast-paced environment.

  • Excellent communication skills and ability to collaborate with cross-functional, internationally dispersed teams.

  • Strong grasp of ITSM and service operation best practices

  • Excellent communication and mentorship skills

  • Comfortable interfacing with internal stakeholders and external customers

  • Bonus: Specific vendor-endorsed qualifications from Supermicro or Dell for HGX-based systems.

Preferred Qualifications

  • Knowledge of large-scale private cloud deployments and capacity planning.

  • Qualifications in HVAC management and deployments.

  • Certifications in relevant areas (hardware, networking).

  • ITIL Foundation level qualification or equivalent experience

Original job Datacentre Operations Engineer posted on GrabJobs ©. To flag any issues with this job please use the Report Job button on GrabJobs.

Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.