About the Company
DriveNets is a leader in large-scale networking solutions for AI infrastructure and service providers. The company's disaggregated networking architecture transforms the economics of large-scale infrastructures while maximizing performance, utilization, and operational efficiency. Its high-performance AI fabric maximizes GPU utilization and accelerates deployments by optimizing the AI stack end-to-end, resulting in higher tokens-per-second and lower cost-per-token. DriveNets' solutions power production networks for global tier-1 operators like AT&T and Comcast, and scale multi-vendor AI infrastructures at foundation model labs, NeoClouds, and enterprises.
Responsibilities
- Design, build, and operate the internal engineering platform powering DriveNets' build, test, deployment, and security validation workflows at scale
- Write and maintain production-grade Python and shell tooling that drives platform automation — this is a hands-on coding role, not just pipeline configuration
- Architect and manage hybrid cloud/on-prem execution infrastructure, including large-scale Kubernetes runner pools across multiple AWS regions
- Own and evolve CI/CD pipelines at scale using GitHub Actions, including reusable workflows, runner orchestration with Actions Runner Controller (ARC), and build caching strategies (BuildKit, sccache, Valkey)
- Operate and tune Docker-in-Docker (DinD) environments (Sysbox, EBS/NVMe, overlay storage, MTU/networking) for build, test, and release workloads
- Connect and manage self-hosted and on-prem runners, routing physical device (wbox) test jobs by site and device type
- Implement DevSecOps controls including least-privilege IAM, OIDC, isolated runner groups, container signing, and automated security scans
- Drive platform observability, cost optimization, and reliability improvements across the engineering infrastructure
- Collaborate cross-functionally with hundreds of engineers to improve engineering velocity and release confidence
- Take end-to-end ownership of complex infrastructure problems and drive them to resolution
Technical Skills
- 5+ years of hands-on DevOps experience with a strong software development background — prior development experience is a must
- B.Sc. in Computer Science or equivalent practical experience
- Strong programming skills in Python (or a similar high-level language); ability to write and own production tooling
- Proven experience designing and building scalable systems, automation frameworks, and infrastructure as code using Terraform and Helm
- Solid understanding of Linux, containers (Docker), and Git-based workflows
- Hands-on experience with CI/CD at scale using GitHub Actions or similar — including reusable actions, workflow design, and automation frameworks
- Deep experience with hybrid cloud infrastructure (AWS and on-prem), including EKS, ARC, Karpenter, ECR, S3, Direct Connect, VPC endpoints, IAM/OIDC, and Secrets Manager
- Experience operating spot and on-demand runner pools for builds, DinD tests, releases, and security scans across multiple AWS regions
- Experience with DinD environments (Sysbox, EBS/NVMe, memory limits, overlay storage, MTU/networking) and build caching (BuildKit, sccache, Valkey)
- Experience connecting on-prem/self-hosted runners and routing physical device (wbox) test jobs by site and device type
- Experience implementing DevSecOps controls and improving platform observability, cost efficiency, and reliability
- Platform & tooling familiarity: Kubernetes (EKS, on-prem) · GitHub Actions · ARC · Karpenter · Terraform · Helm · Docker/DinD · Sysbox · containerd · BuildKit · ECR · S3 · ElastiCache (Valkey) · sccache · Direct Connect · VPC endpoints · IAM/OIDC · Secrets Manager · self-hosted runners
Soft Skills
- Strong system-level thinking and troubleshooting skills; able to diagnose and resolve complex infrastructure issues independently
- Takes end-to-end ownership and drives problems to resolution without hand-holding
- Excellent communication and cross-team collaboration skills; comfortable working alongside large engineering organizations
Nice to Have / Advantage
- Experience with Jenkins
- Familiarity with GitHub merge queue
- Experience with MinIO or on-prem S3 caching
- Hardware-in-the-loop CI experience
- MTU/VPC networking tuning expertise
- Monorepo CI optimization experience
Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.