Senior Software Engineer - AI Datacenter Orchestration

Company: DriveNets

Job Type: Full Time

Tel Aviv-Yafo, Israel

Job Description - Senior Software Engineer - AI Datacenter Orchestration

Location: Tel Aviv

#Hybrid

DriveNets is a leader in high-scale disaggregated networking solutions. Founded in 2015, DriveNets modernizes the way service providers, cloud providers and hyperscalers build networks. DriveNets’ Network Cloud open disaggregated architecture supports the largest network in the world, carrying more than half of AT&T’s backbone traffic. Having raised $587 million in three funding rounds, DriveNets is disrupting the networking market from high-scale architecture to AI platforms, and is bringing on board the most talented people. We are seeking people who want to make an impact on the world’s leading communication networks and who are experienced in networking architecture or AI infrastructure solutions.

Responsibilities and Duties

• Design, build, and profile the network infrastructure on which teams run high-performance LLM serving in production.

• Build the data-path and memory-fabric infrastructure that gives teams the primitives to implement KV cache strategies — paged attention, prefix caching, eviction policies — and hit their efficiency and latency targets.

• Provision and profile the network fabric and cluster infrastructure that inference frameworks (vLLM, TGI, TensorRT-LLM, Triton) are deployed on across GPU clusters.

• Build the scheduling and network infrastructure that exposes the throughput primitives teams need to implement batching strategies (continuous batching, dynamic batching) under SLA constraints.

• Build the compute and memory-bandwidth infrastructure profiles that give teams the headroom to evaluate and apply quantization techniques (GPTQ, AWQ, FP8, INT8) with clear production tradeoffs.

• Build network-level observability infrastructure — TTFT, TPOT, tokens/sec, GPU utilization, cache hit rates — that teams instrument their inference services against.

• Design and build the transport layer (SSE, gRPC, WebSocket) that teams use to expose real-time inference APIs.

• Build the storage and network infrastructure — sharding, format conversion, runtime configuration — that model teams use to move checkpoints to production endpoints.

Technical Skills

• 5+ years of backend engineering experience, including 2+ years specifically in ML inference systems.

• Deep understanding of transformer attention mechanics as they relate to KV cache design.

• Hands-on experience with at least one major inference engine (vLLM, TGI, TensorRT-LLM, Triton).

• Strong Python skills; ability to read and modify inference engine internals; C++/CUDA familiarity.

• Experience with paged/virtual KV cache, prefix caching, speculative decoding, or disaggregated prefill/decode.

• Production experience with GPU clusters (A100/H100/H200) and CUDA memory management.

• Experience with container orchestration (Kubernetes) and GPU scheduling.

• Strong fundamentals in building observable, production-grade microservices: health checks, structured logging, distributed tracing, metrics.

Soft Skills

• Strong cross-functional collaboration — ability to work effectively with model research and platform teams.

• Ownership mindset: comfortable driving production tradeoffs and making decisions under uncertainty.

• Clear technical communication: able to explain complex systems to both engineering and non-engineering stakeholders.

Nice to Have / Advantage

• Experience with tensor parallelism (TP), pipeline parallelism (PP), or multi-node inference.

• Contributions to open-source inference projects (vLLM, SGLang, etc.).

• Familiarity with attention variants: GQA, MLA, sliding window, MoE routing.

• Experience with NVIDIA NIM or Triton Inference Server deployment at scale.

Original job Senior Software Engineer - AI Datacenter Orchestration posted on GrabJobs.
