Location: West Hollywood / Los Angeles, CA
Work Model: On-site (5 days per week)
Employment Type: Full-Time
Compensation: $200,000–$300,000+ USD (depending on experience and seniority), plus a competitive sign-on bonus.
Applicants must be legally authorized to work in the United States. Visa sponsorship is not available for this role.
About the Opportunity
Our client is a well-funded, early-stage AI company building a next-generation intelligence platform for high-stakes, real-world decision making.
The platform ingests and fuses data from satellite feeds, autonomous sensors, logistics networks, enterprise systems, and open-source intelligence (OSINT) to power production AI/ML workloads, knowledge graphs, and intelligent decision-making systems.
This is not a traditional SaaS, DevOps, or chatbot company. The engineering team is building production AI infrastructure where reliability, scalability, security, and developer productivity are mission-critical.
We're looking for a Senior, Lead, or Principal Platform Engineer who enjoys building platforms—not simply maintaining them. You'll own the cloud infrastructure, Kubernetes platform, CI/CD and GitOps workflows, infrastructure automation, and internal developer platform that enables engineering teams to build and deploy production AI systems at scale.
This is a highly collaborative, hands-on engineering role with significant ownership and influence over the platform architecture.
The Role
As a Platform Engineer, you'll design, build, and operate the infrastructure that powers complex AI/ML workloads, while creating the internal tooling and platform capabilities that help software engineers move faster and more reliably.
The ideal candidate has a strong software engineering foundation, deep cloud infrastructure expertise, and experience owning production Kubernetes environments from design through day-to-day operations.
Key Responsibilities
Platform Engineering
- Design, build, and operate scalable cloud infrastructure supporting production AI/ML workloads.
- Own Kubernetes infrastructure, including architecture, networking, security, upgrades, scaling, and operational reliability.
- Build and evolve an internal developer platform that improves engineering productivity and deployment velocity.
- Develop self-service infrastructure and automation that enables engineering teams to ship software quickly and safely.
- Continuously improve developer experience through platform engineering best practices.
Cloud Infrastructure & DevOps
- Design and implement modern CI/CD and GitOps workflows for production environments.
- Build reusable Infrastructure-as-Code solutions using Terraform and related tooling.
- Architect highly available, resilient, and cost-efficient cloud infrastructure.
- Drive adoption of containerization, Kubernetes, and cloud-native infrastructure across engineering teams.
- Support AI-powered development workflows using tools such as Claude Code, Cursor, GitHub Copilot, or similar technologies.
AI Infrastructure
- Build and optimize infrastructure supporting GPU-accelerated machine learning workloads.
- Improve GPU provisioning, scheduling, utilization, and resource management.
- Support scalable infrastructure for model training, inference, and AI services deployed in production.
- Partner closely with AI engineers to optimize platform performance and reliability.
Reliability & Operations
- Lead the investigation and resolution of complex production incidents across cloud infrastructure, Kubernetes, networking, and applications.
- Perform root-cause analysis and implement long-term improvements that increase reliability.
- Build comprehensive monitoring, alerting, logging, and observability solutions.
- Drive platform reliability, performance optimization, and operational excellence.
Collaboration & Architecture
- Partner with software engineers, AI engineers, security teams, and technical leadership on platform architecture decisions.
- Produce technical design documentation for major infrastructure initiatives.
- Champion engineering best practices around automation, scalability, security, testing, and reliability.
- Evaluate emerging technologies that improve infrastructure capabilities and developer productivity.
Required Qualifications
- Bachelor's degree in Computer Science, Software Engineering, Information Technology, or a related technical discipline (Master's preferred).
- 5+ years of experience building and operating production cloud infrastructure in Platform Engineering, DevOps, or Site Reliability Engineering (SRE) roles.
- Strong software engineering foundation with experience building automation, tooling, services, or developer platforms using Python, Go, Bash, or similar languages.
- Demonstrated ownership of production Kubernetes clusters, including architecture, networking, upgrades, scaling, and operational support.
- Hands-on experience designing and building Infrastructure-as-Code solutions using Terraform, including authoring reusable modules.
- Strong experience designing and building CI/CD and GitOps pipelines—not simply maintaining existing pipelines.
- Deep experience with Google Cloud Platform (GCP) and/or AWS.
- Strong understanding of container technologies, including Docker for building and running images and Kubernetes for orchestration.
- Experience building and operating production-scale distributed systems.
- Strong troubleshooting skills across cloud infrastructure, Kubernetes, networking, and applications.
- Experience with observability platforms such as Prometheus, Grafana, Datadog, ELK, or equivalent.
- Excellent communication and collaboration skills.
Preferred Qualifications
Experience with one or more of the following is highly desirable:
- AI/ML infrastructure and GPU-accelerated workloads.
- NVIDIA GPU infrastructure and CUDA environments.
- Internal developer platforms and self-service infrastructure.
- GitOps methodologies.
- AI-native development tools such as Claude Code, Cursor, GitHub Copilot, or Codex.
- Security-focused environments including DevSecOps practices.
- Air-gapped, sovereign, or highly regulated deployment environments.
- Defense, aerospace, government, or other mission-critical industries.
- FedRAMP, ITAR, CMMC, or similar compliance frameworks.
- Serverless architectures and distributed systems.
What We're Looking For
Successful candidates will demonstrate:
- A platform engineering mindset with experience designing, building, and owning infrastructure—not simply maintaining existing environments.
- A strong software engineering foundation and passion for automation.
- Experience building platforms and internal tooling that improve developer productivity.
- Excellent systems thinking across cloud infrastructure, Kubernetes, networking, security, and distributed systems.
- A high level of ownership and comfort working in fast-moving environments with significant technical responsibility.
- A pragmatic approach to balancing reliability, scalability, security, and developer experience.
Compensation & Benefits
- Base salary: $200,000–$300,000+, depending on experience and seniority.
- Competitive sign-on bonus.
- Comprehensive benefits package.
- Opportunity to join a well-funded, high-growth AI company at an early stage with significant technical ownership.
- Long-term career growth with opportunities to take on broader platform and infrastructure leadership responsibilities as the organization continues to scale.
Why Join?
- Build production infrastructure powering real-world AI systems—not internal IT or traditional enterprise DevOps.
- Own the Kubernetes platform, developer experience, and cloud infrastructure that enables AI engineers to move faster.
- Work alongside a highly technical engineering team solving challenging platform and infrastructure problems.
- Support GPU-accelerated AI/ML workloads deployed in production.
- Help shape the technical foundation of a rapidly growing AI company where engineering quality, ownership, and innovation are highly valued.
If you're passionate about Platform Engineering, cloud infrastructure, Kubernetes, automation, and building the systems that power next-generation AI applications, we'd love to hear from you.
Candidates located anywhere in the U.S. are encouraged to apply. Please note that the role is on-site in the West Hollywood / Los Angeles area and relocation assistance is not provided.