Site Reliability Engineer II

Job Type: Full Time

Job Description - Site Reliability Engineer II

Company Description

Aqilea is an IT and engineering consulting partner that helps companies get more out of their technology and operations. With teams in Stockholm and Bangalore, we work closely with our clients to build solutions that fit their needs - from software development, AI and infrastructure engineering to industrial automation and embedded systems.

We combine strong technical expertise with a practical, business-focused approach to help organizations modernize, improve security, and scale with confidence. Above all, we focus on long-term partnerships built on trust, quality, and real results.

With us, you have real opportunities to advance your career and take on significant responsibility.

About the Role

Company: Aqilea India

Role: Site Reliability Engineer (SRE)

Experience: 5 to 10 years

Location: Bangalore (Hybrid)

Job Summary

We are seeking an experienced Site Reliability Engineer (SRE) to join our cross-functional product team and drive operational excellence, reliability, and performance across our eCommerce platforms. The ideal candidate will possess strong expertise in SRE principles, DevOps practices, cloud technologies, and production support within a microservices-based architecture. This role focuses on ensuring application stability, proactive monitoring, incident management, automation, and continuous improvement of platform reliability.

Key Responsibilities

  • Work within cross-functional product teams as the reliability expert for assigned products or product areas.
  • Apply Site Reliability Engineering practices and standards in collaboration with SRE governance teams.
  • Ensure high-quality service delivery and provide operational KPI reporting.
  • Collaborate closely with product teams to maintain predictable operations and minimize production disruptions.
  • Drive continuous improvement initiatives by sharing best practices and enhancing operational processes.
  • Monitor, manage, troubleshoot, and resolve application and infrastructure issues across production environments.
  • Perform technical analysis and root cause investigations for complex production incidents.
  • Improve system reliability through proactive monitoring, alerting, and preventive measures.
  • Analyze application code and logs to identify opportunities for product and operational improvements.
  • Develop automation solutions for monitoring, housekeeping activities, and incident prevention.
  • Ensure application and environment stability, availability, and performance.
  • Automate development and operational processes using scripting and infrastructure automation tools.
  • Participate in on-call support rotations and resolve business-critical incidents within SLA targets.
  • Define and track reliability metrics including SLIs, SLOs, and Error Budgets.
  • Contribute to performance engineering and application reliability initiatives.

Required Skills & Qualifications

Technical Expertise

  • Minimum 5 years of experience in Site Reliability Engineering, Production Support, Operations, DevOps, or Software Development.
  • Strong experience supporting and operating eCommerce platforms.
  • Hands-on experience with DevOps practices, including CI/CD, automated testing, and release automation.
  • Experience troubleshooting complex distributed systems and microservices-based architectures.
  • Strong understanding of solution architecture and root cause analysis techniques.
  • Experience working with API-driven commerce platforms such as commercetools, Fabric, or similar.
  • Experience with ITIL processes and ITSM tools such as ServiceNow.
  • Knowledge of application reliability and performance engineering principles.
  • Experience supporting web, desktop, and mobile applications.

Cloud & Infrastructure

  • Hands-on experience with cloud platforms such as Microsoft Azure and/or Google Cloud Platform (GCP).
  • Experience with managed Kubernetes services such as AKS and/or GKE.
  • Experience provisioning and managing infrastructure using Terraform and/or Ansible.
  • Knowledge of cloud-native architecture, scalability, and reliability best practices.

Development & Automation

  • Proficiency in at least one programming language:
    • Python
    • Java
    • C#
    • Go
    • Ruby
  • Experience with GitHub Actions for CI/CD workflow development.
  • Familiarity with Azure DevOps and other deployment automation platforms.
  • Understanding of front-end and JavaScript technologies such as ReactJS, React Native, and Node.js is advantageous.

Monitoring & Reliability

  • Hands-on experience with observability and monitoring tools such as:
    • Splunk
    • Grafana
    • Similar monitoring platforms
  • Strong understanding of:
    • Service Level Indicators (SLIs)
    • Service Level Objectives (SLOs)
    • Error Budgets
    • Incident Management
    • Reliability Engineering Practices

Start: Immediate to 15 Days

Location: Bangalore (Hybrid)

Original job posting: Site Reliability Engineer II, posted on GrabJobs.