Senior Research Manager, World Model Evaluation

Salary: $272,000 - $431,250 yearly

Company: NVIDIA
Job Type: Full Time

Job Description - Senior Research Manager, World Model Evaluation

At NVIDIA, we’re not just building the future, we’re generating it! Our world model team is pushing the boundaries of multimodal AI, robotics, and world foundation models for Physical AI. We are looking for a Senior Research Manager to lead world-model evaluation and benchmarking across NVIDIA’s Physical AI model portfolio. This role will build the team and research agenda for evaluating world models through closed-system evaluations, where the model under test is pluggable, and open-system evaluations, where access to model internals enables deeper diagnostics, causal analysis, and mechanistic evaluation.

This is not only about leaderboards. It is about defining what makes a world model useful for Physical AI, discovering model failures, and turning those findings into better data, training recipes, model roadmaps, and downstream systems. The team will build a closed improvement loop across model evaluation, failure discovery, data generation, post-training, and re-evaluation.

What you’ll be doing:

  • Lead a team of Research Scientists focused on world-model evaluation, benchmarking, and diagnostics for NVIDIA Physical AI models, including world foundation models, world-action models, synthetic data generation systems, robotics, simulation, and embodied AI workflows.

  • Define the scientific roadmap for closed-system and open-system evaluation, including open-loop and closed-loop benchmarks, metrics, failure taxonomy, model comparison, and evaluation-to-training feedback loops.

  • Develop benchmarks for physical plausibility, temporal consistency, scene dynamics, object permanence, spatial reasoning, action conditioning, affordances, controllability, long-horizon coherence, synthetic data generation (SDG) quality, and world-action model (WAM) usefulness.

  • Develop open-system and mechanistic evaluation methods using model internals, including representation probing, causal interventions, activation analysis, ablations, sparse autoencoders, attention and feature analysis, and circuit-style diagnostics.

  • Drive evaluation-to-model-improvement loops with training, post-training, data curation, simulation, robotics, SDG, WAM, and applied research teams, including failure discovery, data generation, post-training priorities, model roadmap feedback, and re-evaluation.

  • Publish high-quality papers, technical reports, benchmarks, and open-source evaluation artifacts while establishing rigorous standards for validity, reproducibility, dataset hygiene, leakage prevention, and model comparison.

What we need to see:

  • Strong research background in machine learning, computer vision, multimodal AI, robotics, world models, representation learning, model evaluation, or mechanistic interpretability.

  • Experience leading research teams, research programs, or cross-functional technical initiatives with measurable scientific and product impact.

  • Deep understanding of modern foundation models, including video models, vision-language-action models, diffusion or flow models, self-supervised learning, or world-model architectures.

  • Experience designing serious benchmarks, evaluation datasets, metrics, diagnostic tools, or model analysis frameworks for complex ML systems.

  • Familiarity with world-model evaluation and open-system analysis techniques, such as physical plausibility, temporal consistency, action conditioning, counterfactual reasoning, representation probing, activation patching, causal interventions, sparse autoencoders, or feature attribution.

  • PhD or equivalent experience in Computer Science, Electrical Engineering, Robotics, Machine Learning, AI, or a related field, with 12+ years of relevant research or engineering experience overall and 5+ years of management experience.

  • Ability to work onsite at NVIDIA’s Santa Clara headquarters; this is not a remote position.


Ways to stand out from the crowd:

  • Built influential benchmarks, evaluation suites, model diagnostics, or interpretability tools used by research or production teams.

  • Published in areas such as world models, video generation, physical AI, embodied AI, robotics, representation learning, mechanistic interpretability, self-supervised learning, or model evaluation.

  • Experience evaluating generative video models, action-conditioned world models, robotics foundation models, world-action models, synthetic data generation systems, simulation systems, or vision-language-action models.

  • Strong point of view on what current benchmarks miss, and excitement to build the next generation of evaluation science for Physical AI.


NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative, passionate and self-motivated, we want to hear from you! NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD - 431,250 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 11, 2026.

This posting is for an existing vacancy. 

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Original job posting: Senior Research Manager, World Model Evaluation, posted on GrabJobs.