Member of Technical Staff, Infrastructure

Job Type: Full Time

Overview

Physical Superintelligence (PSI) is a stealth startup, with roots at Google, NVIDIA, Harvard, Meta, MIT, Oxford, Johns Hopkins, Cambridge, and the Perimeter Institute, that builds AI systems to discover new physics at scale. We are seeking engineers to build platform infrastructure at the intersection of computational science, AI systems, and software engineering.

Our mission is to discover and commercialize transformative physics breakthroughs at scale with artificial superintelligence, safely, verifiably, and for broad public benefit.

The last century's golden age of physics gave us transistors, lasers, and nuclear energy. We believe artificial superintelligence will unlock the next one. We're creating the infrastructure to industrialize scientific discovery and usher in this new era.

We have one product: new physics, at scale.

Role and Responsibilities

  • Own the full infrastructure stack end-to-end, from cloud foundations through CI/CD pipelines to production deployments. Build and operate multi-cloud infrastructure for our AI platform across GCP, AWS, and adjacent providers. Establish the infrastructure-as-code discipline at PSI: choose the tooling, design the modules, and make every research workflow, training job, and customer-facing AI product deployable through code.

  • Design and run the release engineering pipeline that ships code from commit to production. Every change flows through automated tests, security scans, and progressive rollouts. Fast, safe deploys are the default; long manual release cycles are not.

  • Operate the production infrastructure that powers our AI platform at scale: the paid API, model training jobs for our proprietary physics LLM, agentic research workflows, and customer deployments. Define and meet SLOs, build observability and alerting, schedule GPU and CPU capacity, and lead incident response.

  • Be the leverage layer for the rest of engineering. Platform, product, security, and research engineers all depend on you for reliable cloud primitives, fast deploys, and visible production behavior. Write tools they use, not tickets they wait on.

What We're Looking For

  • Eight or more years operating cloud infrastructure in production at companies known for engineering rigor (e.g., Stripe, Cloudflare, Datadog, Snowflake, Databricks, Google, Netflix, or comparable), at multi-cloud scale. You have written code and shipped infrastructure that paying customers, internal teams, or large user bases depend on every day.

  • Deep fluency with infrastructure as code (Terraform, Pulumi, or comparable), CI/CD systems, Kubernetes, and major cloud platforms (GCP and AWS at minimum). You have built and operated multi-cloud production deployments end-to-end, from initial cloud setup through to release pipelines.

  • Machine learning and training-workload operations experience: GPU scheduling, distributed training infrastructure, model-serving pipelines, and observability for ML systems. You have run production training jobs and shipped served-model surfaces.

  • Operational excellence and on-call discipline. You have led incidents, written runbooks, reduced toil with code, and built systems that scale without bureaucracy. You favor self-service abstractions over tickets and visibility over heroics.

Nice to Have

  • Built CI/CD or release engineering pipelines from scratch at a fast-growing company.

  • Hands-on experience with model-serving infrastructure such as vLLM, Triton, or comparable.

  • Production observability with OpenTelemetry, Prometheus, Grafana, or comparable.

  • Background in scientific computing, HPC, or research compute environments.

How We Work

We are engineering-led. Engineers own problems end-to-end, from spec to ship to on-call. We write contracts before logic, test against real systems instead of mocks, and favor simple designs that ship over clever ones that do not. Our development process is AI-native: engineers work with agentic coding tools daily, write specs that are legible to humans and agents alike, and lead with leverage.

Location and Compensation

This is an in-person role based in Boston or San Francisco. We offer competitive compensation including salary, benefits, and meaningful early-stage equity. We evaluate on technical breadth, systems thinking, scientific curiosity, and shipping velocity. We are an equal opportunity employer and value diverse perspectives in building platforms for AI-driven discovery.

Original job Member of Technical Staff, Infrastructure posted on GrabJobs.