Senior Observability Engineer Team Lead

Company: DataCrunch
Job Type: Full Time
Remote / Work from Home

Job Description - Senior Observability Engineer Team Lead



About:


We’re an ambitious, mission-driven group focused on making the world a better place by delivering affordable, environmentally sustainable AI compute for training and deploying machine learning models at scale.



Responsibilities:


  • Lead the design, deployment, and scaling of a 360-degree unified observability stack across infrastructure assets (network, storage, cloud, power, servers, VMs, services, security, compliance, customer-facing dashboards, Kubernetes, etc.).
  • Use Grafana, Loki, and ELK stacks to build advanced monitoring, logging, and alerting solutions.
  • Identify and resolve critical flaws/issues, with a proven track record of saving organisations significant time or cost.
  • Orchestrate observability data to detect trends, forecast issues, and move from reactive to proactive monitoring.
  • Partner with engineering, SRE, and operations teams to create dashboards, alerts, and visualisations that enable actionable insights.
  • Manage end-to-end workflows at a senior level, ensuring observability practices are embedded across projects and aligned with business goals.
  • Define best practices, set standards for log/metadata organisation, and maintain clear documentation.




Qualifications:


  • Deep experience with the Grafana stack (Grafana, Loki, Mimir, Alloy).
  • Strong familiarity with the ELK/OpenSearch stack (Elasticsearch, Logstash, Kibana, Fluentd, Filebeat, Metricbeat).
  • Solid understanding of Prometheus and related tooling (Prometheus, Thanos, Cortex, exporters).
  • Strong background working across Linux environments at scale.
  • Knowledge of network observability data sources such as NetFlow and syslog.
  • Experience with automation/configuration management (e.g., Ansible or similar).
  • Excellent written and spoken English communication skills, with the ability to influence both technical and non-technical stakeholders.

Nice-to-haves:

  • Leadership experience in observability or infrastructure teams.
  • Experience monitoring Kubernetes environments.
  • Exposure to the Influx stack (Telegraf, InfluxDB).
  • Familiarity with OpenStack environments.




What we offer:


  • Company equity - a true stake in our journey.
  • Competitive salary and benefits, including health insurance, lunch benefit, and an annual personal budget (for sport, transport, wellness, or culture).
  • Flexible working environment.
  • Opportunity to work with cutting-edge AI technologies.
  • Career growth within a mission-driven company.




Assessment Process:


1. Introductory chat (45 mins) - Meet with our Talent Partner to learn more about DataCrunch and share your career goals.
2. Technical interview (60 mins) - A deeper discussion of your expertise and technical experience with future colleagues.
3. Final interview (60 mins) - Meet with our CEO, CTO, and wider team.


Original job Senior Observability Engineer Team Lead posted on GrabJobs.
Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.