Member of Technical Staff, TPU Performance Engineering

Salary: S$200,000 to S$400,000 annually + equity

Company: Inferact
Job Type: Full Time

Job Description - Member of Technical Staff, TPU Performance Engineering

Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of models and hardware, a position that took years to build.

About the Role

We're looking for a TPU performance engineer to make vLLM a first-class inference engine on Google TPUs. You'll build and optimize TPU backends, compiler integrations, runtime paths, and benchmarking infrastructure using JAX, XLA, Pallas, and related tooling so vLLM can deliver frontier inference performance on TPU hardware.

You'll work at the boundary of inference systems, kernels, compilers, and hardware architecture, improving production-relevant model serving on TPU with clear correctness, latency, and throughput benchmarks. Your work will help make TPU support in vLLM usable, fast, benchmarked, and maintainable.

Skills and Qualifications

Minimum qualifications:

  • Bachelor's degree or equivalent experience in computer science, engineering, systems, machine learning, or a similar field.

  • Hands-on experience building or optimizing TPU workloads using JAX, XLA, Pallas, or related compiler and runtime tooling.

  • Deep understanding of TPU execution, memory behavior, compilation, and performance constraints for ML workloads.

  • Experience optimizing ML kernels or inference paths such as attention, GEMM, sampling, KV cache, fused kernels, or backend runtime paths.

  • Strong performance profiling and benchmarking skills, with the ability to use measurements, compiler artifacts, correctness tests, and reproducible benchmarks to guide optimization work.

Preferred qualifications:

  • Experience with vLLM, SGLang, TensorRT-LLM, XLA-based serving, or other LLM inference systems.

  • Familiarity with batching, KV cache, decoding, serving tradeoffs, and backend performance constraints in production inference systems.

  • Experience with compiler technologies such as XLA, MLIR, LLVM, Pallas, or other kernel DSLs, including lowering, fusion, and backend code generation.

  • Knowledge of quantization methods such as INT8, FP8, mixed precision, or TPU-specific numeric formats, including accuracy and performance tradeoffs.

Bonus points if you have:

  • Contributed to vLLM, JAX/XLA, Pallas, PyTorch/XLA, compiler projects, or other open-source ML infrastructure.

  • Built TPU benchmarking infrastructure or automated performance regression detection for accelerator workloads.

  • Worked directly with Google TPU ecosystem stakeholders, accelerator platform teams, or early-access programs to ship backend, compiler, or inference performance improvements.

Logistics

  • Location: This role is based in Singapore.

  • Compensation: Depending on background, skills, and experience, the expected annual salary range for this position is S$200,000 to S$400,000, plus equity.

  • Visa sponsorship: We sponsor visas on a case-by-case basis.

  • Benefits: Inferact offers a generous benefits package, including medical, dental, and vision coverage.

Original job posting, Member of Technical Staff, TPU Performance Engineering, posted on GrabJobs.

Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.