Vision Language Model Engineer

Company: EchoTwin AI

Job Description - Vision Language Model Engineer

Company Overview

EchoTwin AI is pioneering AI-driven infrastructure intelligence, redefining how cities are managed. Powered by a proprietary visual intelligence engine with full spatial reasoning, EchoTwin transforms municipal fleets into mobile urban sensors—creating living digital twins that provide real-time insights into infrastructure, compliance, and safety. By enabling municipalities to proactively monitor, predict, and resolve issues, EchoTwin helps build resilient, self-healing, and sustainable urban ecosystems. More than “smart cities,” EchoTwin is advancing the era of cognizant cities—urban environments with the awareness to see, think, and act on challenges in real time.

What You’ll Do

As a Vision Language Model Engineer, you will design, develop, and optimize advanced vision-language models that integrate visual and textual data to enable intelligent systems. You will work closely with cross-functional teams to build models that power applications such as image captioning, visual question answering, and multimodal AI at the edge.

Key Responsibilities

Design and implement state-of-the-art vision-language models using deep learning frameworks.
Develop and fine-tune models that combine computer vision and natural language processing for tasks like image captioning, visual question answering, and text-to-image generation.
Collaborate with data scientists and software engineers to integrate models into production systems.
Optimize model performance for accuracy, latency, and scalability in real-world applications.
Conduct experiments to evaluate model performance and iterate on architectures and training pipelines.
Stay up to date with the latest research in vision-language models and incorporate advancements into projects.
Contribute to data preprocessing, augmentation, and annotation pipelines for multimodal datasets.
Document model development processes and present findings to technical and non-technical stakeholders.

Qualifications

Bachelor’s, Master’s, or Ph.D. in Computer Science, Machine Learning, Artificial Intelligence, or a related field (or equivalent experience).
3+ years of experience in machine learning, with a focus on vision-language models or multimodal AI.
Hands-on experience with deep learning frameworks such as PyTorch or TensorFlow.
Proven track record of building and deploying computer vision and/or NLP models.
Proficiency in Python and relevant ML libraries (e.g., Hugging Face Transformers, OpenCV).
Experience with large-scale model training and optimization (e.g., distributed training, quantization).
Strong understanding of neural network architectures (e.g., CNNs, Transformers, CLIP).
Experience with multimodal datasets and preprocessing techniques for images and text.
Familiarity with cloud platforms (e.g., AWS, GCP, Azure) and model deployment workflows.
Strong problem-solving skills and ability to work in a fast-paced, collaborative environment.
Excellent communication skills to explain complex technical concepts to diverse audiences.

Benefits and Perks

There are endless learning and development opportunities from a highly diverse and talented peer group, with experts in fields including Computer Vision, GenAI, Digital Twin, Government Contracting, Systems and Device Engineering, Operations, Communications, and more!

Options for medical, dental, and vision coverage for employees and dependents (for US employees)
Flexible Spending Account (FSA) and Dependent Care Flexible Spending Account (DCFSA)
401(k) with 3% company matching
Unlimited PTO
Profit sharing

Please do not forward resumes to our jobs alias, EchoTwin AI employees, or any other company location. EchoTwin AI is not responsible for any fees related to unsolicited resumes.

Original job Vision Language Model Engineer posted on GrabJobs.
