About Huawei
Huawei is a leading global provider of information and communications technology (ICT) infrastructure and smart devices. With integrated solutions across four key domains – telecom networks, IT, smart devices, and cloud services – we are committed to bringing digital to every person, home and organization for a fully connected, intelligent world.
At Huawei, innovation focuses on customer needs. We invest heavily in basic research, concentrating on technological breakthroughs that drive the world forward. We have more than 180,000 employees, and we operate in more than 170 countries and regions.
About the IRC
The mission of the Huawei Ireland Research Centre (IRC) is to position Huawei as a recognized technology leader and a global provider of information and communications technology (ICT) solutions. To achieve this, we are building an industry-recognized, multi-discipline Research Centre of experts with a focus on medium-term to long-term issues. The IRC will work closely with an open, innovative ecosystem and with Huawei customers to address real-world issues. The IRC will also engage with key European universities to build a basic research capability to support Huawei technical projects.
Job Overview
We are looking for a Mandarin-speaking specialist to support our technical teams on cloud reliability. He/she will contribute to one of the technical areas the lab is currently working on, e.g. Fault Localization Agent design and development. He/she will also research relevant topics and translate complex technical concepts into clear, concise documentation. In this role, the specialist will conduct research, analyze publicly available information, and collaborate with technical experts to produce source code, reports, presentations, and other technical documentation.
Key Responsibilities
Qualifications
Project Examples
1. Explainable multivariate anomaly detection
Many existing multivariate anomaly detection solutions lack explainability, which makes fault localization challenging, especially since site reliability engineers (SREs) may distrust black-box outputs. Inspired by graph neural networks (GNNs) and graph convolutional networks (GCNs), we designed and developed graph-based algorithms to detect anomalies across cloud infrastructure, including physical/virtual hardware and microservice layers. Our solution delivers reliable performance while maintaining resource efficiency, even in production environments.
2. AI Agent for fault management
Fault management remains a significant challenge for Site Reliability Engineers (SREs). AI agents present a promising way to streamline this process and reduce manual effort. We are currently investigating and developing a series of specialized agents, such as detection and localization agents, for specific use cases, built on our agent platform. This work will help build knowledge and experience in state-of-the-art AI technologies for LLMs and causal analysis, such as intent understanding and chain-of-thought (CoT) reasoning.
Privacy Statement
Please read and understand our West European Recruitment Privacy Notice before submitting your personal data to Huawei so that you fully understand how we process and manage the personal data we receive.
http://career.huawei.com/reccampportal/portal/hrd/weu_rec_all.html