You will join the Datastore Team, a core component of the Smarsh Fabric platform that underpins our enterprise applications. The team enables self-service, next-generation data capabilities across the engineering organisation, ensuring long-term scalability, reliability, and innovation.
This role focuses on designing, building, and operating large-scale distributed data platforms supporting petabyte-scale environments across hundreds of clusters. You will own complex technical initiatives end-to-end and play a key role in ensuring our data infrastructure remains reliable, scalable, secure, and highly automated.
Core Responsibilities
- Design, build, and operate highly available, scalable clusters supporting core data technologies including MongoDB, Elasticsearch, and Apache Kafka.
- Own major platform components and deliver complex initiatives from design through to production.
- Implement and enhance architectural patterns and platform standards for reliability, scalability, and performance.
- Troubleshoot and resolve complex distributed systems issues across multi-cluster environments.
- Build and maintain Infrastructure as Code and CI/CD pipelines to ensure repeatable, scalable deployments.
- Contribute to observability, reliability, and operational excellence across the data platform estate.
- Collaborate with engineering teams and Product Management to align platform capabilities with product needs.
- Mentor junior engineers and contribute to knowledge sharing within the team.
- Participate in on-call rotations to support platform reliability and continuous improvement.
Skills & Experience
We’re looking for an experienced Platform Engineer with strong expertise in cloud-native and DevOps practices. The ideal candidate will have:
- 5+ years of experience in platform engineering, SRE, or infrastructure-focused roles.
- Strong experience operating at least one of MongoDB, Kafka, or Elasticsearch in production environments, including day-2 operations.
- Experience designing and operating Kubernetes environments and ecosystem tooling (e.g. Helm, ArgoCD) is a significant plus.
- Proficiency in at least one programming language (Python, Java, or similar).
- Experience with Infrastructure as Code tools such as Terraform.
- Hands-on experience working with major cloud platforms (AWS preferred).
- Experience implementing, maintaining, and evolving observability solutions (e.g. Prometheus/Grafana or ELK).
- Good understanding of security principles and experience embedding security best practices into production environments and code promotion systems.
- Excellent collaboration and communication skills.
- Strong problem-solving ability and attention to detail.
Desirable Experience
- Experience contributing to internal platform APIs or automation tooling.
- Exposure to regulated environments.
- Interest in advancing platform engineering practices such as self-service infrastructure.
- We’re especially interested in candidates who have transitioned from Software Engineering into Platform Engineering.