Storage Engineer

Salary: $200,000 - 300,000 monthly

Company: Hydra Host
Job Type: Full Time
Location: Remote / Work from Home

Job Description - Storage Engineer

About Hydra Host 

Hydra Host is a Founders Fund-backed NVIDIA cloud partner building the infrastructure platform that powers AI at scale. We connect AI Factories - high-performance GPU data centers - with the teams that depend on them: research labs training foundation models, enterprises running production inference, and developer platforms demanding scalable compute capacity.

Hydra Host is building the next-generation bare-metal GPU infrastructure network and marketplace under its Brokkr platform. The company enables independent data centers to monetize GPU capacity while providing enterprises with scalable, high-performance access to NVIDIA-based compute (e.g., H100, H200, B200, L40S, RTX 4090).

As we expand our infrastructure capabilities, Hydra Host is now seeking a Storage Engineer to lead the architecture, development, and deployment of our next-generation AI/HPC storage platform.

 

The role:

As a Storage Engineer, you will be responsible for designing and building Hydra Host’s first production-grade storage platform from the ground up, supporting the company’s rapidly expanding network of bare-metal GPU clusters.

You’ll own the architecture, technology selection, implementation, and evolution of this platform, defining how Hydra Host manages data for large-scale, distributed AI workloads across global data centers.

This is a senior, hands-on role for an engineer who has built storage systems for GPU clusters before, with deep expertise in both block and object storage and a strong understanding of parallel file systems, performance optimization, and large-scale orchestration.

 

Key Responsibilities

· Define, architect, and implement Hydra Host’s first production storage platform tailored for bare-metal GPU clusters and AI/HPC workloads.

· Lead all technical decisions around storage stack design, from hardware infrastructure to parallel file system orchestration and performance tuning.

· Select, build, and maintain storage solutions spanning both block storage (NVMe, SAN, Ceph, etc.) and object storage (S3-compatible, custom, or Ceph Object Gateway) layers.

· Design for high-throughput, low-latency access, supporting large datasets, rapid checkpointing, and parallel access for distributed AI training workloads.

· Integrate and optimize parallel file systems such as Lustre, BeeGFS, Spectrum Scale, WekaIO, or CephFS, ensuring maximum performance and fault tolerance.

· Ensure compatibility across Hydra Host’s diverse GPU/OEM ecosystem, accounting for unique firmware, BMC/Redfish APIs, and hardware configurations.

· Develop automation, observability, and management tooling for storage, focusing on reliability, scalability, and efficiency.

· Act as a builder and architect: deeply hands-on in deployment, troubleshooting, and optimization, while guiding the long-term storage roadmap.

· Collaborate cross-functionally with GPU, HPC, and platform engineering teams to integrate storage with compute and network layers.

· Interface with customers and product leadership to define feature priorities, performance benchmarks, and future enhancements.

Must-Have Qualifications

· 8+ years of progressive, hands-on experience designing and implementing high-performance storage systems for compute clusters in HPC, AI, or bare-metal cloud environments.

· Proven track record building storage infrastructure from scratch, not just operating existing systems.

· Deep expertise in block storage (NVMe, SAN, Ceph, distributed block systems) and object storage (S3, MinIO, Ceph Object Gateway, etc.).

· Strong background in parallel file systems (WekaIO, BeeGFS, Lustre, Spectrum Scale, or similar) supporting GPU or AI cluster workloads.

· Solid foundation in Linux systems engineering, automation, and scripting for distributed environments.

· Familiarity with BMC, Redfish APIs, and OEM server firmware for bare-metal management.

· Deep understanding of AI/ML data pipelines: model checkpointing, data locality, and multi-tiered storage optimization.

· Excellent problem-solving, debugging, and communication skills, with the ability to translate technical decisions into clear architectural direction.

 

Preferred Qualifications

· Experience building storage solutions for large-scale GPU or HPC infrastructure.

· History of technical leadership or mentorship, such as growing teams or owning a product roadmap.

· Experience evaluating and managing vendor relationships and negotiating storage hardware/software contracts.

· Contributions to open-source HPC or storage projects (Ceph, Lustre, BeeGFS, etc.).

· Familiarity with confidential computing, secure data handling, or high-availability architectures.

Original Storage Engineer job posted on GrabJobs.
Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.