Senior ML Engineer (Token Factory)

Company: Jobgether
Job Type: Full Time

Job Description - Senior ML Engineer (Token Factory)


This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Senior Machine Learning Engineer (Token Factory) based in Romania.


This role sits at the intersection of large-scale AI systems and high-performance infrastructure, focusing on optimizing how foundation models are trained and served at scale.
You will contribute to a cutting-edge inference and fine-tuning platform designed to push modern LLMs to their performance limits across massive GPU fleets.
The work directly impacts throughput, latency, and cost efficiency for next-generation AI workloads used in production environments.
You will collaborate with highly specialized engineers across ML, systems, and infrastructure domains in a fast-moving, research-driven environment.
The role combines deep ML expertise with systems-level engineering, requiring strong understanding of both model architecture and hardware behavior.
You will help design and improve critical components such as inference engines, training pipelines, and GPU optimization strategies.


Accountabilities:


  • Drive inference optimization efforts by identifying bottlenecks and implementing performance improvements across diverse LLM architectures, improving throughput and reducing latency and cost per token.

  • Contribute to the design and evolution of inference engines, including techniques such as speculative decoding, KV-cache optimization, and support for dense and MoE models.

  • Develop and productionize low-precision training and inference pipelines (e.g., FP8, MXFP4) to maximize efficiency on large GPU clusters.

  • Profile and analyze GPU workloads using modern tooling to identify performance constraints and guide architectural improvements.

  • Collaborate on scalable distributed training and inference systems, including sharding strategies, custom kernels, and hardware-aware optimizations.

  • Contribute to engineering best practices including testing, CI/CD, and maintainable production-grade ML systems.


Requirements:



  • Strong understanding of machine learning fundamentals, particularly transformer architectures and large language models.

  • Hands-on experience profiling and optimizing GPU workloads using tools such as Nsight or PyTorch Profiler.

  • Deep knowledge of GPU architecture, including memory hierarchy and compute vs. memory trade-offs.

  • Familiarity with key LLM concepts such as attention mechanisms, RoPE, KV-cache, Flash Attention, and quantization techniques.

  • Experience with large-scale deep learning training, including distributed systems, sharding strategies, and custom kernel development.

  • Strong software engineering skills, with advanced proficiency in Python and modern ML frameworks.

  • Solid understanding of software engineering practices such as version control, CI/CD pipelines, and unit testing.

  • Strong communication skills with the ability to collaborate effectively in highly technical, cross-functional teams.


Benefits:



  • Competitive compensation package

  • Strong career development and continuous learning opportunities

  • Flexible work environment with high autonomy and ownership

  • Collaborative, innovation-driven engineering culture

  • Opportunity to work on frontier AI systems at massive scale

  • International, highly skilled, and diverse team environment

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!


 

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

 

 

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Original job posting: Senior ML Engineer (Token Factory), posted on GrabJobs.

Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.