Site Reliability Engineer (Hosted Infra) - Platform

Company: Elastic N.V.

Job Type: Full Time

United States

Job Description - Site Reliability Engineer (Hosted Infra) - Platform

Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale — unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter. By taking advantage of all structured and unstructured data — securing and protecting private information more effectively — Elastic’s complete, cloud-based solutions for search, security, and observability help organizations deliver on the promise of AI.

What is the role

We are Cloud Infrastructure SREs that integrate, scale, and evolve multi-cloud infrastructure across 4 Cloud Service Providers, 70+ globally distributed regions, and tens of thousands of hosts to power Elastic Cloud. We tackle hard problems at scale through automation, Infrastructure as Code (IaC), configuration management, and purpose-built software that eliminates toil and improves reliability.

We're also a team that grows people as well as systems. If that challenge genuinely excites you, we'd love to hear from you.

What you will be doing

Engineering software to automate large-scale systems — building internal tools and services, not just running scripts.

Optimizing the reliability and lifecycle of hosts across multiple cloud providers.

Strengthening our observability posture — crafting alerting and monitoring systems that drive incident prevention over incident response.

Scaling global infrastructure and evolving the infrastructure management processes to meet growing demand.

Contributing to code reviews, sharing your work, planning what we need to do next, and both mentoring and being mentored by teammates.

Being part of a balanced SRE on-call rotation: responding to incidents, improving runbooks, participating in postmortems, and championing reliability improvements.

What you bring

Experience building software with Golang. You are also comfortable reviewing others' code and offering constructive feedback.

Production experience operating large-scale cloud compute (hundreds of hosts or more) via automated workflows.

Deep experience with Linux systems — you are at home in the terminal debugging at the OS level.

Proficiency working with containerized workloads in production.

A customer-first, systems-thinking approach to operational problems — you care about root causes, not just symptoms.

Comfortable working across time zones in both real-time and asynchronous contexts.

You contribute clear and maintainable documentation, such as software designs, runbooks, architecture diagrams/decisions, and postmortems.

You communicate project status regularly and clearly, flag blockers early, and follow through on action items.

A sensible approach to AI integration — identifying where AI tools genuinely reduce operational burden and embedding them into workflows without adding complexity.

Bonus Points

Production experience with any of: Terraform, Puppet, Ansible, Argo CD, Argo Workflows, CUE, Docker, Kubernetes, Ubuntu, or Ubuntu Livepatch.

Experience being on call during incidents and using observability tools (e.g., Elastic Stack, Graphite, Prometheus, Influx) to diagnose issues, quantify impact, and confirm mitigations.

Hands-on experience engineering solutions with the Elastic Stack.

Original job posting: Site Reliability Engineer (Hosted Infra) - Platform, posted on GrabJobs.
