Senior Quantization Engineer - Edge AI

Job Type: Full Time

Job Description - Senior Quantization Engineer - Edge AI

Senior Quantization Engineer - Edge AI Model Optimization


We at NXP have an environment that fosters innovation. Our team has technology experts who understand the big picture and mentors who coach passionate professionals to work on the most exciting challenges. We share responsibilities in everything we do, where every point of view is valued. Join us!

Job Summary

We are seeking a highly skilled Edge AI Engineer/Scientist with a strong theoretical foundation in AI and solid software engineering expertise to contribute to our Edge AI Model Optimization program. While the primary focus of this role is on model quantization, the scope also includes complementary optimization strategies such as speculative decoding, pruning, and other methods for ensuring highly efficient on-device deployment.

You will work at the forefront of innovation, bridging the gap between research and practice. The focus is on optimizing CNNs, Large Language Models (LLMs), and Vision Language Models (VLMs) to bring advanced capabilities to NXP’s Ara2 family of NPUs, directly supporting the future of on‑device intelligence.

If you want to shape the future of efficient on‑device AI, this is the place to be.

Job Responsibilities

Research: Actively survey the latest research (NeurIPS, ICLR, CVPR) on model optimization and compression, with a particular focus on neural network quantization, while also covering techniques such as speculative decoding and pruning.
Prototyping: Develop and adapt state-of-the-art methods to NXP’s hardware constraints, building proofs of concept that demonstrate the effectiveness of these techniques on NXP hardware.
Production Implementation: Translate research prototypes into robust, optimized production code (C++/Python) that meets strict memory and compute efficiency standards.
Systems Integration: Document algorithmic tradeoffs, derive deployment recipes, and mentor the engineering team on numerical methods and optimization.
Cross-Functional Leadership: Act as the technical bridge between AI Research, Hardware Engineering, and other teams, providing quantified guidance on how design choices affect model accuracy and performance.
IP Generation: Contribute to NXP’s intellectual property portfolio through patents and technical publications.

Job Qualifications  

Required Background

Education: MSc (a Ph.D. is a plus) in Computer Science, Electrical Engineering, or Mathematics with a focus on Machine Learning or Deep Learning.
AI Expertise: Proven practical experience in AI/ML with a deep understanding of CNN architectures and Generative AI (Transformers, LLMs, VLMs, etc.).
Technical Stack: Strong hands-on experience with PyTorch, ONNX, and model conversion/optimization pipelines.
Software Engineering: Proficient in Python and C++, with a solid grasp of development best practices.
Embedded Mindset: Familiarity with the constraints of embedded systems (latency, power, memory bandwidth) and how code interacts with underlying hardware.

Preferred

Advanced AI: Experience with state-of-the-art quantization techniques for discriminative and generative AI (e.g., GPTQ, SpinQuant).
Hardware Acceleration: Experience with NPUs, device-level profiling, and diagnosing memory bottlenecks.
Kernel Development: Experience with custom kernel development is a plus.
Compilers: Knowledge of MLIR or TVM is a significant plus.


More information about NXP in India...

Original job Senior Quantization Engineer - Edge AI posted on GrabJobs ©. To flag any issues with this job please use the Report Job button on GrabJobs.

Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.