Principal Site Reliability Engineer (SRE)

Company: Symmetrio

Job Type: Full Time

United States

Job Description - Principal Site Reliability Engineer (SRE)

Description

Symmetrio is recruiting a Principal Site Reliability Engineer (SRE) for our customer, a rapidly growing organization focused on advanced healthcare technology solutions.

This individual will play a critical role in ensuring the reliability, scalability, security, and performance of a mission-critical SaaS platform supporting healthcare providers across the United States. The ideal candidate will possess a unique blend of cloud infrastructure expertise, application troubleshooting experience, production operations leadership, and customer-facing technical problem-solving skills.

This person should be equally comfortable investigating application-level issues, troubleshooting AWS networking and infrastructure, leading production incident response efforts, and collaborating with development teams to improve operational excellence.

Responsibilities

Serve as the primary technical owner for production reliability across U.S. customer environments.
Investigate and resolve complex issues spanning web applications, APIs, backend services, data pipelines, cloud infrastructure, and customer integrations.
Lead production incident response efforts, coordinating cross-functional teams to restore service and minimize customer impact.
Perform root cause analysis and drive corrective actions that improve long-term system stability and resilience.
Partner with software engineering and platform teams to identify recurring reliability risks and implement sustainable solutions.
Design, configure, and validate secure customer connectivity solutions including Site-to-Site VPNs, Transit Gateway integrations, routing configurations, and secure network paths.
Support customer onboarding initiatives by troubleshooting connectivity challenges and ensuring consistent implementation processes.
Enhance platform observability through improvements in monitoring, logging, alerting, tracing, and operational dashboards.
Contribute to CI/CD, infrastructure automation, and deployment processes that improve release safety and operational consistency.
Develop operational tooling that supports incident response, troubleshooting, onboarding, and system monitoring activities.
Collaborate with engineering leadership to improve cloud architecture, scalability, security, and operational readiness.
Partner with customer-facing teams to communicate technical issues, remediation plans, and reliability improvements clearly and effectively.
Support compliance, security, and risk management initiatives within highly regulated healthcare environments.

Requirements

6+ years of hands-on experience supporting and managing AWS-based production environments.
4+ years of experience supporting web applications and backend services (Python/Django experience strongly preferred).
Experience with AWS networking technologies including VPCs, Site-to-Site VPNs, Transit Gateways, routing, NAT gateways, and security groups.
Strong experience with Terraform and infrastructure-as-code deployment practices.
Experience with containerized environments including ECS, Fargate, Kubernetes, or similar technologies.
Experience building and supporting CI/CD pipelines and release automation processes.
Familiarity with monitoring and observability platforms such as Datadog, CloudWatch, Sentry, Grafana, or similar tools.
Experience leading production incident response, outage management, and root cause analysis.
Exposure to Windows Server environments, Active Directory, Kerberos, and enterprise infrastructure concepts is preferred.
Healthcare technology, healthcare SaaS, clinical software, or other regulated industry experience is highly preferred.
Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related technical field preferred.

Benefits

Health Care Plan (Medical, Dental & Vision)
Retirement Plan (401k, IRA)
Paid Time Off (Vacation, Sick & Public Holidays)

Original job Principal Site Reliability Engineer (SRE) posted on GrabJobs.
