Site Reliability Engineer (SRE) / Observability Engineer

Job Type: Full Time

Job Description - Site Reliability Engineer (SRE) / Observability Engineer

About the opportunity
We are hiring on behalf of a well-established global IT consulting and implementation firm with offices across North America, Europe, and India (HITEC City, Hyderabad). The organisation delivers technology solutions across Cloud, DevOps, SAP, and AI for enterprise clients globally and has a strong people-first, learning-oriented culture.

Role overview
We are looking for a Site Reliability Engineer with a strong Observability specialisation to drive service reliability, reduce operational toil, and build best-in-class monitoring and alerting infrastructure. The ideal candidate brings deep Grafana expertise and will take ownership of SLO/SLA definition, distributed system visibility, and the shift from reactive to proactive operations.

Key responsibilities
• Define, track, and report on Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets across platform services
• Build, maintain, and optimise observability infrastructure using Grafana, Prometheus, Loki, Tempo, and related open-source tooling
• Develop dashboards and alerting rules that provide actionable, low-noise insights for engineering and operations teams
• Lead blameless post-incident reviews (PIRs) and drive systemic reliability improvements from learnings
• Partner with engineering teams to instrument applications with distributed tracing, structured logging, and custom metrics
• Reduce operational toil through automation: scripting runbooks, auto-remediation workflows, and self-healing infrastructure
• Define on-call practices, escalation policies, and runbooks; contribute to a sustainable on-call culture
• Evaluate and implement new observability tooling as the stack evolves (e.g., OpenTelemetry, Jaeger, VictoriaMetrics)

Required skills & experience
• 8+ years of combined SRE / DevOps / Platform Engineering experience
• Strong hands-on expertise with Grafana: dashboards, alerting, data sources
• Proficiency in Prometheus: PromQL, exporters, Alertmanager
• Experience with log aggregation using Loki, ELK stack, or equivalent
• Solid understanding of distributed systems principles, microservices architecture, and container orchestration (Kubernetes)
• Proficiency in Python, Go, or Bash for automation and tooling
• Strong analytical thinking for root cause analysis and capacity planning

Good to have
• Hands-on experience with OpenTelemetry instrumentation
• Exposure to Grafana OnCall, Grafana Incident, or PagerDuty for incident management
• Familiarity with eBPF-based observability tools (Cilium, Parca)
• Azure or AWS certifications

What's on offer
• End-to-end ownership of observability, not just maintaining dashboards
• Hybrid work flexibility from HITEC City, Hyderabad
• Exposure to global-scale distributed systems for international clients
• Certification reimbursement and structured learning pathways

Location: Hyderabad (Hybrid)
Experience: 8+ years
Employment type: Full-time
Specialisation: Observability – Grafana, Prometheus, Loki stack

Original job posting: Site Reliability Engineer (SRE) / Observability Engineer, posted on GrabJobs.

About the Company

N Human Resources

nHRMS is the ultimate executive search partner for your business. Find the perfect fit for your organization today.
