
Computer Vision Research Internship: Image to Sequence Modeling (e.g. Transformers)

Company: Scandit
Job Type: Internship


Job Description

Duration: Minimum 6 months; ideally 9–12 months, depending on the candidate’s experience


Scandit gives people superpowers. Whether enabling delivery drivers to make quicker deliveries, matching a patient with their medication, or allowing retailers to make store operations more efficient, our technology automates workflows and provides actionable insights to help businesses in a variety of industries. Join us, as we continue to expand, grow and innovate, and help take Scandit to the next level.


About the Internship


We are offering a research-focused internship aimed at advancing machine learning methods for complex visual understanding tasks. The project centers on deep learning architectures for image-to-sequence modeling, such as Transformers, attention mechanisms, and modern sequence and representation-learning frameworks, applied to challenging, highly structured computer vision problems. It contributes to long-term research efforts toward higher performance, robustness, and generalization in large-scale visual applications.


What you will do


You will work closely with experienced ML researchers and engineers on cutting-edge research at the intersection of computer vision and sequence modeling. Your work will include:



  • Designing and experimenting with new ML architectures for structured visual data.

  • Evaluating alternative modeling paradigms (e.g., encoder–decoder, hybrid Transformer models, sequence-based representations).

  • Investigating techniques for improving robustness, generalization, and multi-view reasoning.

  • Running systematic experiments, ablations, and error analyses to validate research hypotheses.


This project provides opportunities for novel model design, extensive experimentation, and scholarly research. You will contribute to long-term innovation in our technology, with potential real-world impact for millions of users. This is an ideal position for experienced master’s students, PhD collaborations, or candidates preparing for a research career in industry or academia.


Who you are


MSc or PhD student in Computer Science, Machine Learning, Artificial Intelligence, or a related field with a strong research focus. Candidates should have a solid foundation in machine learning theory, neural networks, and computer vision.


Essential Skills:



  • Proficiency in Python and deep learning frameworks such as PyTorch.

  • Practical experience designing, training, and evaluating neural networks, including CNNs and Transformer-based architectures.

  • Strong analytical and problem-solving abilities, with the capability to interpret experimental results and iterate effectively.

  • Familiarity with research best practices, including reproducibility, controlled experiments, and ablation studies.


Desirable Skills:



  • Prior research experience in computer vision, pattern recognition, sequence modeling, or image-to-sequence architectures.

  • Experience training large-scale models or working with foundation-style architectures.

  • Contributions to publications, preprints, or open-source machine learning projects.


Strong communication skills and the ability to work independently in a research-oriented environment.


What We Offer



  • We are certified as a “Great Place to Work” in 10 countries!

  • A highly skilled team and a fun environment where you can put your enthusiasm for computer vision challenges and cutting-edge technologies to use

  • Hackathons, summer parties, company outings and other regular events

  • Office in the city center of Zurich


Who We Are


Could your code give superpowers? Whether enabling delivery drivers to make quicker deliveries, matching a patient with their medication or allowing retailers to make store operations more efficient, our technology automates workflows and provides actionable insights to help businesses in a variety of industries. This means we have no shortage of technical challenges for engineers like you. Join us, as we continue to expand, grow and innovate, and help take Scandit to the next level.


“Everybody is welcome here” is a celebrated component of our DNA.


At Scandit we strive to create an inclusive environment that empowers our employees. We believe that our products and services benefit from our diverse backgrounds and experiences and are proud to be a safe space for all.


All qualified applications will receive consideration for employment without regard to race, colour, nationality, religion, sexual orientation, gender, gender identity, age, physical [dis]ability or length of time spent unemployed.


#LI-MB1


#Engineering


#Hybrid

Original job posting: Computer Vision Research Internship: Image to Sequence Modeling (e.g. Transformers), posted on GrabJobs.
Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.