Site Reliability Engineer at N-iX in Romania

Job Description - Site Reliability Engineer

We are looking for an experienced Site Reliability Engineer to ensure the stability, scalability, and operational excellence of a Kubernetes-based platform running in a hybrid environment.

The project is entering a pivotal phase, with a major go-live planned for mid-February and a target audience of 75,000 users. User onboarding is already underway: more than 5,000 users are connected, and 15,000–20,000 are expected to be active by year-end. The system is currently stable, but we anticipate increased activity and new challenges in January, February, and after the go-live, which makes this an exciting opportunity to make a real impact. The role focuses on performance optimization, scaling strategies, observability, and reliability engineering.

Required Skills:

4+ years of experience as an SRE or DevOps Engineer

Strong hands-on experience with Kubernetes in production

Experience working with hybrid infrastructure (on-prem + cloud)

Solid knowledge of PostgreSQL performance tuning and scaling

Experience with Qdrant or other vector databases

Experience with Helm, Kubernetes autoscaling, and resource optimization

Familiarity with observability stacks (Prometheus, Grafana, ELK/Loki)

Understanding of performance engineering and load testing

Experience with Linux systems and networking

Strong troubleshooting and incident-management skills

Nice to Have:

Experience with STACKIT or other sovereign clouds

Experience with PgBouncer

Knowledge of SRE practices (SLO/SLI)

Experience in regulated or public-sector environments

German language skills

Responsibilities:

Operate and optimize hybrid infrastructure (on-prem & STACKIT)

Manage and scale Kubernetes clusters

Optimize Helm charts, resource usage, and autoscaling

Conduct performance, load, and stress testing

Ensure reliability, availability, and monitoring of production systems

Tune and operate PostgreSQL

Operate and optimize vector databases (e.g. Qdrant)

Implement monitoring, logging, and alerting

Support incident response and capacity planning

We offer*:

Flexible working format - remote, office-based or flexible

A competitive salary and good compensation package

Personalized career growth

Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)

Active tech communities with regular knowledge sharing

Education reimbursement

Memorable anniversary presents

Corporate events and team buildings

Other location-specific benefits

*not applicable for freelancers
