Senior Software Engineer – AI Infrastructure & Optimization
We are looking for a Senior Software Engineer to help build and optimize large-scale, high-performance GenAI infrastructure and inference systems on Kubernetes.
As AI workloads increasingly move toward Kubernetes-native infrastructure, we are building systems that support distributed inference, performance optimization, reliability, observability, and production-grade deployment at scale.
This role is ideal for an engineer who can reason deeply about systems, performance, tradeoffs, and reliability, and who is comfortable owning difficult technical decisions end-to-end.
You will work across inference serving, distributed systems, optimization, and Kubernetes-native AI infrastructure.
What You’ll Do
Build and optimize high-performance Kubernetes-native GenAI inference systems
Work with modern inference stacks such as vLLM, SGLang, TensorRT-LLM, and related tooling
Work with Kubernetes-native distributed LLM inference frameworks such as llm-d and NVIDIA Dynamo
Design and implement optimization algorithms and performance improvements
Improve reliability, observability, deployment, and operational maturity of AI systems
Make architectural decisions and take ownership of technical outcomes
Collaborate with a small, senior engineering team focused on performance and production quality
Required Qualifications
Minimum 5 years of experience as a software engineer, with strong system design skills
Programming experience in Go and Python
Hands-on experience with the Kubernetes ecosystem, including Operators, service meshes, GitOps, Gateway API, and OpenTelemetry
Experience with cloud platforms
Strong understanding of optimization algorithms and performance engineering
Ability to independently drive technical initiatives from concept to production
Strong systems thinking and debugging skills
Comfort operating in environments with high autonomy and responsibility
Nice to Have
Experience with modern LLM inference frameworks such as vLLM, SGLang, or TensorRT-LLM
Experience with distributed LLM inference frameworks such as llm-d or NVIDIA Dynamo
Contributions to open-source Kubernetes or ML infrastructure projects
GPU performance optimization and profiling experience
Familiarity with CUDA, NCCL, or Triton kernels
Experience running GenAI systems at scale in production