Member of Technical Staff AI Research Engineer (Image/Video Foundation Models)

Company: GenPeach AI

Job Type: Full-time

Zürich, Switzerland

About GenPeach AI

GenPeach AI is a product-driven research lab building vertical multimodal foundation models for hyper-realistic human generation in image and video – designed for emotionally resonant, human-centered AI experiences. Our goal is to create tools that supercharge human creativity rather than replace it.

We train models from scratch: proprietary datasets at massive scale, novel architectures and training recipes, large GPU clusters, and tight product integration so research ships to users quickly.

We are a deeply technical team of around 10 people. We’re advised by Directors from Google DeepMind and backed by leading AI-focused funds and angels from OpenAI, Meta AI, Microsoft AI, Project Prometheus, and Fal. Collectively, our team, advisors, and angels have contributed to models including Meta’s Imagine/MovieGen and foundation-model work behind OpenAI’s Sora, plus Google’s Veo and Gemini.

About the Team

You’ll join the research team working across image/video generation and multimodal understanding. You’ll work closely with other Research Engineers and Scientists, as well as the Founders, and help turn research into scalable training runs, strong evaluations, and production-ready systems.

About the Role

We’re hiring an AI Research Engineer to help build and scale GenPeach’s foundation models end-to-end – from implementing new model ideas and training recipes, to owning the parts of the training stack that determine quality and speed, to pushing models through production constraints.

This is a hands-on, high-ownership role. You’ll write research-grade code that becomes production-critical.

In this role, you will

Implement and iterate on image/video generative model ideas (architecture, losses, conditioning, sampling, pre-training, distillation, post-training)
Own training performance end-to-end (distributed training, throughput, memory, stability, debugging scaling failure modes)
Build the experimentation loop (evals, ablations, reproducibility tooling, reporting, decision hygiene)
Build and improve VLMs for image/video captioning (data recipes, training strategies, model variants, evaluation)
Run high-iteration research: read papers when useful, implement ideas, validate empirically
Create captioning pipelines that improve generation training and product quality
Partner with inference/product to ship under real constraints (latency, cost, reliability, rollout safety)
Build demos and prototypes to showcase capabilities and accelerate iteration

You might thrive in this role if you

Love the craft of experimentation: fast iteration, clear ablations, strong evals, and honest conclusions
Enjoy debugging messy real-world training runs (not just clean demos)
Can move between research and engineering: write clean code, ship utilities, and improve team velocity
Take ownership beyond your job description when needed (startup reality)
Communicate clearly and collaborate well in a small, senior team

Minimum Qualifications

Strong Python and PyTorch skills (4+ years of experience)
Experience implementing and training deep learning models (generative models, VLMs, LLMs, vision/video, or adjacent)
Solid understanding of training dynamics, optimization, and practical debugging
Ability to drive projects end-to-end with minimal supervision

Preferred Qualifications

Hands-on experience with diffusion/flow-based image or video generation, or large-scale generative modeling in adjacent domains
Experience with distributed training at scale (multi-node) and performance tuning (throughput/memory)
Experience building evaluation frameworks (offline metrics + human eval + regression tracking)
Strong intuition for data quality and dataset/labeling tradeoffs for training and captioning
Publications are a plus, but shipped impact and strong technical evidence matter more

What makes this role unique

Build frontier image/video models and the VLM captioning systems that power them
Join a lean, senior team that holds a high engineering + research bar
Direct product impact: your training runs become real user-facing capabilities
Benchmark against the best in the world and compete on model quality through what we ship

How we work

You own outcomes end-to-end and are trusted with real responsibility
Direct, low-ego communication and fast feedback loops
Bias toward impact: measure → iterate → ship
Research discipline: clear ablations, reproducibility, and crisp decision-making

Logistics

Location: Zurich (Switzerland) or Warsaw (Poland) — onsite or hybrid. If you’re elsewhere, we’re open to remote (team/timezone fit considered).
Compensation: competitive salary + meaningful equity (level-dependent)
Interview process: quick screen → 2x technical rounds (practical + systems) → team fit/values

What we offer

Visa sponsorship (where applicable); we’ll make a strong effort to relocate you to Switzerland or Poland if desired
Remote-friendly: work fully remote, hybrid, or on-site from our hubs
Regular offsites and in-person events to collaborate and connect
Flexible PTO

Original job Member of Technical Staff AI Research Engineer (Image/Video Foundation Models) posted on GrabJobs.
