Senior Vision Language Model Engineer

Salary: $184,000 - $356,500 yearly

Company: NVIDIA
Job Type: Full Time

Job Description - Senior Vision Language Model Engineer

NVIDIA is the platform upon which every new AI-powered application is built. We are seeking a senior vision language model engineer to design and build agentic data and training workflows for Autonomous Vehicles, Robotics, and Medical applications. The right person for this role brings technical innovation and collaborative culture to change the way NVIDIA builds dataset search platforms for physical AI developers. Our dataset search offerings are easy to use, performant, and scalable. Your work will redefine the dataset search and model training capabilities in NVIDIA product offerings and impact the most iconic companies in Physical AI.

What you'll be doing:

  • Partner with our researchers to develop and evaluate prototypes of our latest models, such as VLMs and VLAs, for video search, video understanding, and more. Enable fundamental advances in autonomous driving, healthcare, and robotics.

  • Design and implement agentic data workflows that automate data discovery, labeling, evaluation, and retraining to maximize development velocity.

  • Build, curate, and maintain high‑quality multimodal datasets (e.g., video, sensor, language/action traces) tailored for end‑to‑end physical AI problems, such as autonomous driving.

  • Explore and productize new data sources including simulation and synthetic data.

  • Use agentic AI workflows across the full applied research lifecycle.

  • Collaborate with research, model development, performance, and product teams.

  • Contribute to NVIDIA Cosmos Dataset Search and other core NVIDIA platforms and products.

What we need to see:

  • PhD with 4+ years, MS with 6+ years, or BS (or equivalent experience) with 8+ years of relevant experience in Computer Science, Computer Engineering, or a related technical field.

  • Strong background in modern deep learning, including transformer‑based architectures, video modeling, and multimodal VLM/VLA or foundation models.

  • Excellent experience training and deploying deep learning models on real‑world datasets: data preprocessing, distributed training, evaluation, debugging, and iterative improvement.

  • Excellent experience with Python and at least one deep learning framework.

  • Current with the latest research on image and video search in autonomous vehicles, healthcare, robotics, or related physical AI applications.

  • Fluent with agentic AI workflows across the full applied research lifecycle, including prototyping novel algorithms and search pipelines, benchmarking, and integrating prototypes in production codebases.

  • Clear and effective communication skills, with experience working well in a dynamic, product- and research-focused team.

Ways to Stand Out from the Crowd:

  • Strong track record of publishing in top-tier conferences such as CVPR, NeurIPS, ICML, and ECCV.

  • Patents in video retrieval or a related field.

  • Strong coding architecture skills demonstrated through contributions to large internal or open-source projects.

  • Experience in robotic systems such as autonomous vehicles or humanoid robotics.

Come join us at NVIDIA and contribute to a team that is pushing the edges of what can be done in AI and computer vision. We’re looking for candidates who are innovative, ambitious, and ready to leave a lasting mark on the world!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until May 17, 2026.

This posting is for an existing vacancy. 

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Original job Senior Vision Language Model Engineer posted on GrabJobs ©. To flag any issues with this job please use the Report Job button on GrabJobs.
Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.