$150,000 - $270,000 yearly
At Modal, we build foundational technology, including an optimized container runtime, a GPU-aware scheduler, and a distributed file system.
We're a small team based out of New York, Stockholm and San Francisco, and have raised over $23M. Our team includes creators of popular open-source projects (e.g., Seaborn, Luigi), academic researchers, international olympiad medalists, and engineering and product leaders with decades of experience.
We are looking for strong engineers with experience in making ML systems performant at scale. If you are interested in contributing to open-source projects and Modal’s container runtime to push language and diffusion models towards higher throughput and lower latency, we’d love to hear from you!
5+ years of experience writing high-quality, high-performance code.
Experience working with torch, high-level ML frameworks, and inference engines (vLLM or TensorRT).
Familiarity with NVIDIA GPU architecture and CUDA.
Experience with ML performance engineering (tell us a story about boosting GPU performance: debugging SM occupancy issues, rewriting an algorithm to be compute-bound, eliminating host overhead, etc.).
Nice-to-have: familiarity with low-level operating system foundations (Linux kernel, file systems, containers, etc.).
Ability to work in person in our NYC, San Francisco, or Stockholm office.
Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.