
Inference Optimization Intern Performance Modeling

Job Type: Internship


Job Description - Inference Optimization Intern Performance Modeling


About the Institute of Foundation Models

 

The Institute of Foundation Models is dedicated to advancing the science and engineering of large-scale AI systems. Our researchers and engineers develop cutting-edge foundation models while pushing the limits of high-performance computing and efficient AI inference. By combining deep expertise in machine learning, systems engineering, and hardware optimization, we build scalable AI solutions that drive scientific discovery and real-world impact.

As part of the team, interns work alongside world-class researchers and performance engineers to optimize the execution of large-scale foundation models on next-generation NVIDIA GPU architectures. This internship provides hands-on experience in low-level GPU performance analysis, kernel optimization, and hardware-aware inference acceleration.

Key Responsibilities


This intensive internship offers the opportunity to contribute to the development of a simulator and profiling framework for foundation model inference on NVIDIA GPUs.

Responsibilities include:

  • Develop analytical performance models for GPU kernels and inference workloads.
  • Build and validate a simulator to estimate theoretical hardware performance limits.
  • Compare measured kernel performance against architectural peak throughput.
  • Identify performance bottlenecks in compute, memory, communication, and scheduling.
  • Analyze GPU execution using NVIDIA Nsight Systems and Nsight Compute.
  • Investigate PTX and SASS code generation to understand low-level execution behavior.
  • Collaborate with researchers and engineers to optimize inference kernels for transformer-based models.
  • Evaluate utilization of Tensor Cores, memory bandwidth, caches, and instruction pipelines.
  • Design profiling methodologies for Hopper and Blackwell architectures.
  • Document findings and provide actionable recommendations for performance improvements.



Academic Qualifications

Currently pursuing a degree in Computer Science, Computer Engineering, Electrical Engineering, Artificial Intelligence, High-Performance Computing, or a related quantitative discipline.

Preferred Qualifications

  • Experience with CUDA programming and GPU kernel development.
  • Understanding of NVIDIA GPU architecture and memory hierarchy.
  • Familiarity with performance profiling tools such as Nsight Systems and Nsight Compute.
  • Knowledge of PTX, SASS, and low-level GPU execution.
  • Experience optimizing CUDA kernels for throughput and latency.
  • Understanding of roofline analysis, performance modeling, and hardware utilization metrics.
  • Experience with deep learning frameworks such as PyTorch or TensorFlow.
  • Strong programming skills in C++, CUDA, and Python.



Desired Skills

  • Performance engineering mindset.
  • Strong analytical and debugging abilities.
  • Interest in AI systems, inference optimization, and hardware-software co-design.
  • Ability to work independently on research and engineering challenges.
  • Excellent written and verbal communication skills.



Original job Inference Optimization Intern Performance Modeling posted on GrabJobs.
