Where we work
Udemy is a global company headquartered in San Francisco, with additional U.S. offices in Denver and Austin, and international hubs in Australia, India, Ireland, Mexico, and Türkiye. This is an in-office position, requiring three days a week in the office (Tuesday, Wednesday, Thursday) and flexibility on Mondays and Fridays.
About your skills
Scope: You own the end-to-end development of systems by prioritizing work, understanding user requirements, and weighing the tradeoffs between design decisions.
Decision Making: You use critical thinking to follow a defined decision-making process and consider multiple perspectives. Once you make a decision, you communicate it clearly and ensure everyone is aligned on execution.
Coaching: You have strong coaching skills that allow you to actively listen and ask the kind of questions that will help you diagnose and effectively address issues.
About this role
At Udemy, the SRE team manages infrastructure from our CDN all the way back to our datastores. In between, we own load balancers, Kubernetes clusters, and CI/CD.
We maintain and develop the tools that build our infrastructure, such as Helm and Terraform.
We run development environments to enable our dev teams to build and test changes quickly.
We build tools to accommodate the needs of our internal customers using Python and Golang.
We respond to incidents, drive reliability standards across the organisation, and work closely with development teams to ensure best practices are followed.
What you’ll be doing
You’ll lead projects that develop and improve our infrastructure and tooling, working with our team and with teams across the engineering department.
You’ll act as a mentor to other engineers on the SRE team.
You’ll champion SRE best practices.
You’ll participate in an on-call rota.
What you’ll have
Experience managing Kubernetes clusters and cloud environments.
Experience using infrastructure as code tools to deploy infrastructure.
Experience writing tools and applications using programming languages such as Python, Golang and Kotlin.
Experience being on call.
Experience working with a wide variety of engineering teams to guide them on best practices.
Good communication skills and an ability to both share and receive feedback in a responsible manner.
Extensive knowledge of cloud technologies. AWS is a particular advantage.
Experience managing containerised workloads using Kubernetes in a production environment.
Experience with programming languages such as Python, Golang and Kotlin.
Experience with infrastructure as code tools such as Terraform and Helm.
Udemy
Udemy is an online learning and teaching marketplace with over 250,000 courses and 73 million students. Learn programming, marketing, data science and more.