Site Reliability Engineer (SRE) II

Company: Huntington National Bank

Job Type: Full Time

United States

Job Description - Site Reliability Engineer (SRE) II

Description

As a Site Reliability Engineer (SRE) Level II, you will play a key role in maintaining the availability, scalability, and performance of critical infrastructure and services. You will be responsible for building and automating solutions that enhance system reliability and support continuous delivery. In this role, you will handle more complex operational tasks and incidents, provide mentorship to junior SREs, and collaborate with development teams to ensure systems are designed for reliability from the ground up.

Incident Management:

complex incidents, and ensure service uptime.
Lead troubleshooting efforts for high-impact production issues, providing detailed root cause analysis (RCA) and preventative measures.
Participate in on-call rotations, acting as an escalation point for Level 1 SREs during major incidents.

Automation & Infrastructure as Code (IaC):

Develop and maintain automation scripts and infrastructure using tools like Terraform, Ansible, or CloudFormation.
Implement automation solutions to eliminate manual tasks and improve system reliability, scalability, and performance.

Performance & Scalability:

Analyze system performance and recommend optimizations for scalability and reliability.
Support capacity planning efforts by monitoring system metrics, traffic patterns, and usage trends to predict future resource needs.

System Design & Architecture:
Collaborate with software engineering teams to influence the design of new services and applications, ensuring they are scalable, reliable, and resilient from the start.
Contribute to architectural decisions, ensuring alignment with best practices in fault tolerance, redundancy, and recovery.

Monitoring & Observability:
Build and maintain robust monitoring, alerting, and observability solutions to proactively detect and resolve issues before they impact end users.
Optimize existing monitoring tools (e.g., Prometheus, Grafana, Datadog, Dynatrace) and build custom dashboards for better visibility into system health.

Security & Compliance:
Ensure systems and infrastructure are secure, compliant, and aligned with organizational policies and industry best practices.
Assist with vulnerability management, system patching, and implementing security measures to protect the integrity and availability of services.

Continuous Improvement:
Lead efforts to continuously improve operational processes, tools, and workflows.
Implement and enforce best practices in deployment, monitoring, and incident management to improve overall system reliability and reduce downtime.

Basic Qualifications

Minimum 5 years of experience in site reliability engineering, DevOps, systems administration, or related roles.
Strong experience with Linux/Unix administration and proficiency in scripting (e.g., Python, Bash, Go).
Deep understanding of cloud platforms (AWS, GCP, Azure) and related services (EC2, S3, Lambda, Kubernetes, etc.).
Experience with containerization and orchestration technologies like Docker and Kubernetes.
Proficiency with monitoring and observability tools such as Dynatrace, Prometheus, Grafana, Datadog, ELK Stack, or similar platforms.
Strong understanding of networking fundamentals (DNS, HTTP, TCP/IP), load balancing, and CDNs.
Experience with CI/CD tools (Jenkins, GitLab CI, CircleCI) and infrastructure automation (Terraform, Ansible, Puppet).
Familiarity with distributed systems and microservices architecture.
Excellent problem-solving and troubleshooting skills, especially in diagnosing production issues in high-scale environments.

Preferred:

Background in MLOps, data engineering, and/or cloud-native AI deployment.
Strong communication and documentation abilities.
Knowledge of security best practices for AI and cloud infrastructure.
Contributions to open source AI/SRE projects or relevant technical communities.
Proven track record of managing complex infrastructure, troubleshooting production issues, and optimizing system performance.

Exempt Status: (Yes = not eligible for overtime pay) (No = eligible for overtime pay)

Yes

Workplace Type:

Office

Our Approach to Office Workplace Type

Certain positions outside our branch network may be eligible for a flexible work arrangement. We’re combining the best of both worlds: in-office and work from home. Our approach enables our teams to deepen connections, maintain a strong community, and do their best work. Remote roles will also have the opportunity to come together in our offices for moments that matter. Specific work arrangements will be provided by the hiring team.

Huntington is an Equal Opportunity Employer.

Tobacco-Free Hiring Practice: Visit Huntington's Career Web Site for more details.

Note to Agency Recruiters: Huntington Bank will not pay a fee for any placement resulting from the receipt of an unsolicited resume. All unsolicited resumes sent to any Huntington Bank colleagues, directly or indirectly, will be considered Huntington Bank property. Recruiting agencies must have a valid, written and fully executed Master Service Agreement and Statement of Work for consideration.

Original job Site Reliability Engineer (SRE) II posted on GrabJobs.
