Member of Technical Staff (MTS) - Multimodal Foundation Models

Company: Deeproute.ai
Job Type: Full Time

Job Description - Member of Technical Staff (MTS) - Multimodal Foundation Models

Description

Focus

Multimodal Foundation Models · Representation Learning · Method Innovation

We are looking for strong technical builders and researchers whose understanding of foundation models and representation learning goes beyond applying existing frameworks.

Ideal candidates should have:

  • Strong experimental rigor
  • Solid systems and modeling intuition
  • Hands-on engineering ability
  • Interest in scalable multimodal AI systems for real-world autonomy

We value people who can bridge research and production, and who care about robustness, scalability, efficiency, and practical deployment in large-scale autonomous driving systems.

Responsibilities

1. Large-Scale Foundation Model Pretraining

  • Develop scalable pretraining pipelines for large-scale multimodal driving data
  • Design and optimize training strategies for:
    • Vision-language-action models
    • Video foundation models
    • Long-context temporal modeling
    • Multimodal representation alignment
  • Improve:
    • Training stability
    • Data efficiency
    • Scaling efficiency
    • Representation robustness
  • Work on distributed training systems and large-scale model optimization using frameworks such as:
    • PyTorch Distributed
    • DeepSpeed
    • Megatron-LM

2. Representation Learning & Method Innovation

  • Design and improve self-supervised and multimodal learning methods for real-world autonomous driving systems
  • Conduct architecture-level research on:
    • Vision Transformers (ViT)
    • Video / temporal architectures
    • Multimodal fusion and alignment
    • Embedding and retrieval systems
    • Long-context and memory-efficient architectures
  • Explore and improve:
    • Pretraining objectives
    • Loss functions
    • Training paradigms
    • Generalization and robustness
  • Analyze model behavior through:
    • Rigorous ablation studies
    • Failure case analysis
    • Representation probing and evaluation

3. Efficient Foundation Models & Scalable Deployment

  • Improve the efficiency, scalability, and deployability of large multimodal foundation models for real-world autonomous driving systems
  • Work on areas such as:
    • Model quantization
    • Knowledge distillation
    • Efficient attention mechanisms
    • Sparse architectures and Mixture-of-Experts (MoE)
    • Long-context and memory-efficient modeling
    • Inference acceleration and serving optimization
    • Training and inference system efficiency
  • Optimize model throughput, latency, memory usage, and deployment performance for large-scale production environments


Requirements
  1. MS or PhD in:
      • Computer Vision
      • Machine Learning
      • Robotics
      • Computer Science
      • Related fields
  2. Strong understanding of:
      • Foundation models
      • Self-supervised learning
      • Representation learning
      • Multimodal learning
      • Large-scale pretraining
  3. Hands-on experience with methods such as:
      • CLIP
      • DINO / DINOv2
      • MAE
      • Contrastive learning
      • Masked modeling
      • MoE or scalable transformer architectures
  4. Experience with one or more of the following is highly valued:
      • Video foundation models
      • Long-context modeling
      • Retrieval systems
      • Efficient inference
      • Distributed training
      • Model compression and deployment optimization
  5. Strong publication record in top-tier venues is preferred:
      • CVPR
      • ICCV
      • ECCV
      • NeurIPS
      • ICLR
      • ICML
Original job posting: Member of Technical Staff (MTS) - Multimodal Foundation Models, posted on GrabJobs.