Site Reliability Engineer

Company: Right Advisors Private Limited

Job Type: Full Time

Gurugram, India

Job Description - Site Reliability Engineer

About the Role

We are seeking a proactive and detail-oriented Site Reliability Engineer (SRE) with 3+ years
of experience to ensure high availability, reliability, and performance of production systems.
This role focuses on automation, incident management, and cross-team coordination to drive
operational excellence.

Key Responsibilities

• Maintain reliable, scalable, and secure production environments.

• Implement and manage monitoring, alerting, and logging solutions.

• Contribute to defining and tracking SLIs/SLOs and support error budget practices.

• Automate operational tasks to improve efficiency and reduce manual effort.

• Perform troubleshooting and Root Cause Analysis (RCA) for production incidents.

• Optimize system performance, availability, and capacity.

• Maintain SOPs and incident documentation in Confluence.

• Adhere to change management, deployment governance, and disaster recovery
standards.

• Support incident response for critical production services.

Collaboration & Tools

• Coordinate with external vendors and internal cross-functional teams.

• Work closely with Engineering, Product Owners, and Operations teams.

• Manage incidents and changes using ServiceNow & JIRA.

• Collaborate through Slack and structured communication channels.

Technical Skills
Systems & Cloud

• Strong knowledge of Windows and Linux/Unix systems.

• Solid understanding of networking fundamentals (DNS, TCP/IP, Load Balancing,
Firewalls).

• Experience with at least one cloud platform (AWS, Azure, or GCP).

Automation & CI/CD

• Proficiency in one scripting/programming language (Python, Go, Bash, PowerShell, or
Java).

• Understanding of CI/CD pipelines and automation practices.

Containers & Monitoring

• Hands-on experience with Docker and Kubernetes.

• Experience with monitoring tools such as Power BI.

• Ability to analyze logs, metrics, and traces for troubleshooting.

ITSM & Documentation

• Experience with ServiceNow & JIRA (incident/change/problem workflows).

• Working knowledge of Confluence for technical documentation and knowledge
management.

Additional Experience (Preferred)

• Background in DevOps, Cloud Engineering, or Platform Engineering.

• Understanding of security best practices and compliance standards.

• Familiarity with AI-assisted engineering tools (Claude Code, Jellyfish, GitHub Copilot).

• Exposure to large-scale or production-grade systems.

Soft Skills

• Strong analytical and troubleshooting mindset.

• Excellent written and verbal communication skills.

• Ownership-driven and composed during high-severity incidents.

Accessibility & Inclusion Statement

We are committed to creating an inclusive environment for all employees, including persons
with disabilities. Reasonable accommodations will be provided upon request.
