Software Engineer, ML Serving

Company: Rime Labs

Job Type: Full Time

San Francisco, United States

Job Description - Software Engineer, ML Serving

Rime is a foundation modeling company that builds voice AI for enterprises running customer experiences at scale. Our models are purpose-built for high-volume conversational deployments, engineered for the accuracy, performance, and deployment flexibility that production environments actually demand.

We started from a different premise than the rest of the field: build voice AI for human connection, not slop. Before we trained a single model, we built our own corpus: full-duplex, studio-quality conversational speech of normal people, recorded and annotated by linguists. It's why our models are unparalleled in naturalism, and it's why enterprises pick Rime when pilots need to make it to production.

Role Overview

We're hiring a Software Engineer to own the serving infrastructure that connects Rime's inference engines to the world. This role sits at the intersection of ML systems and cloud infrastructure: you'll work directly on model inference and the cloud stack beneath it to build, harden, and scale the systems that stream voice at real-time latency. As Rime moves toward its next-generation architecture, you'll be a core architect of how our models get served.

What You'll Own

Architecture and implementation of Rime's TTS serving infrastructure, from GPU-backed inference engines to the API surface.
Model optimization, from single-node to disaggregated fleet serving.
Compatibility across NVIDIA hardware generations, from Hopper to Blackwell and beyond, for on-prem and cloud deployments.
Continuous integration and deployment workflows for the model serving pipeline.
Site reliability: on-call rotation, monitoring, alerting, and observability across the serving stack.
Resource provisioning and cost management across our GPU fleet.

What We're Looking For

Hands-on experience with real-time, multinode ML serving infrastructure, including serving frameworks such as NVIDIA Dynamo/Triton, vLLM, SGLang, or equivalent.
Experience with distributed or disaggregated model serving (tensor parallelism, pipeline parallelism, or equivalent).
Strong cloud infrastructure fundamentals: Linux internals, networking, containerization (Docker, Kubernetes).
IaC experience: Terraform, Packer, or comparable. You should have opinions about how to do this right.
On-call is part of the job. You treat production reliability as a shared responsibility.

Nice to Have

Experience with multinode training (DDP, FSDP, etc.).
Experience with gRPC or other bidirectional binary streaming protocols.
Experience with audio streaming and related technologies (WebRTC, WebSockets, etc.).
Experience with a multi-language monorepo, where you pick the best programming language for the job on merit rather than personal familiarity.
Experience with multi-cloud infrastructure (AWS, GCP, OCI, etc.).
Comfort with configuration management tooling (Ansible, Chef, Puppet, or similar).
SRE, DevOps, or platform engineering background at a startup.
Experience at an early-stage company.

Why Join Rime

Build the serving infrastructure behind a category-defining voice AI company from the ground up.
You will bring experience that no one else at the company currently has, so you can help us set the vision.
Direct collaboration with the inference, platform, and ML teams — no handoff culture.
The systems you build determine what experiences our customers can deploy at scale.
Meaningful equity upside at an early stage.
High ownership, high standards, low bureaucracy.
SF / Bay Area.

At Rime, we...

Are outliers
Cut through the hype to focus on the craft
Move fast with agency and freedom
Maintain a growth mindset, finding joy in the struggle
Do the right thing, knowing that it'll lead to making money

If that sounds like you too, you'll be a great fit for Rime!

Original job Software Engineer, ML Serving posted on GrabJobs ©. To flag any issues with this job please use the Report Job button on GrabJobs.
