Job Description - System Reliability Engineer
Leveraging our expertise in connectivity, our advanced IoT platform, and our extensive global reach, we deliver the results our customers need to progress and succeed. We support businesses of all sizes and sectors in their efforts to connect for a better future. Our connection base has grown 20% year over year, reaching over 200 million connections by the end of the financial year 2025.

- Address the long-tail, lower-volume segment globally through a digital self-service platform
- Develop and govern resilience strategies that span system architecture, deployment, monitoring, and incident response
- Design and implement fault injection testing, chaos engineering practices, and scenario-based simulations to validate platform robustness
- Collaborate with product, infrastructure, architecture, and development teams to redesign services with built-in redundancy, failover, and graceful degradation
- Contribute to the design and maintenance of our Business Continuity and Disaster Recovery (BC/DR) Plan, ensuring IoT systems remain resilient and recoverable in the face of unexpected disruptions
- Delivery focus - Consistently meet or exceed delivery expectations, ensuring the right customer experience, delivering tangible business outcomes, and achieving financial targets
- Improve service-level attainment (SLA/SLO adherence)
- Manage stakeholders and vendors as required for technical delivery, and report on project progress and activities

- Degree in Software Engineering, Computer Science, or a related discipline
- Good understanding of the DevSecOps methodology and mindset
- Good understanding of information security
- Scripting experience in languages such as Bash, Python, Perl, Groovy, or PowerShell
- Proven experience with high-availability system design, chaos engineering principles, and proactive failure mitigation strategies
- Experience with ISO 22301
- Good understanding of system monitoring tools and automated testing frameworks
- Industry experience with software platforms on Linux, on-premises, and cloud server technologies

- Deep understanding of SRE principles, including SLOs/SLIs, error budgets, observability, toil reduction, and automation
- Demonstrated ability to balance operational stability with delivery velocity
- Understanding of security principles, practices, and standards, and how they translate into real-world technical solutions
- Hands-on experience with infrastructure provisioning and configuration management tools such as Terraform or Ansible
- Demonstrated ability to eliminate manual processes through scripting (e.g., Python, Bash, Go)
- Strong command of telemetry, logging, and alerting stacks (e.g., Prometheus, Grafana, ELK, Datadog, Splunk)
- Experience defining meaningful SLIs and building dashboards that drive actionable insight
- Skilled in leading and participating in incident response with a calm, structured approach
- Experience driving blameless postmortems, root cause analysis, and continuous improvement across teams
- Good knowledge of DevSecOps principles
- Expertise in identifying and resolving system bottlenecks, latency issues, and throughput constraints
- Proficient in forecasting demand and managing system growth in a cost-efficient manner
- Proven ability to work closely with software engineers, infrastructure teams, product owners, and business stakeholders to embed reliability into the development lifecycle

- Consultative, customer-focused design mindset
- Strong presentation and communication skills for technical, business, and (senior) management audiences
- Strong work-planning and time-management skills
- Willingness to learn and a strong sense of ownership and autonomy