Cloud Operations Engineer

Salary: $5,000 - $10,500 monthly

Job Type: Full Time

Job Description - Cloud Operations Engineer

Responsibilities

Infrastructure & Operations

  • Develop automation and processes to enable teams to manage, scale, and monitor applications in datacenters and cloud environments.
  • Troubleshoot and resolve system-related issues across platforms, including participating in on-call escalations for critical incidents.
  • Take ownership of end-to-end infrastructure and security solutions across the organization.
  • Deploy and manage monitoring tools to track infrastructure performance, utilization, and health.
  • Implement configuration management systems for business continuity and automate disaster recovery measures.
  • Provision virtual machines, databases, containers, licenses and other infrastructure resources for development teams.
  • Design, build, optimize, and monitor automation systems to identify bottlenecks and maximize service availability.
  • Perform capacity planning and resource forecasting to ensure infrastructure scales ahead of demand.
  • Own and manage SLA, SLO, and SLI definitions; track and report against service reliability targets.
  • Perform regular backup validation and disaster recovery drills to verify recoverability of critical systems.

Cloud Cost Management (FinOps)

  • Monitor and analyse cloud resource consumption to identify cost optimisation opportunities.
  • Implement tagging strategies, rightsizing, and reserved instance planning to control cloud spend.
  • Produce monthly cost reports and recommendations for engineering and management stakeholders.

System Monitoring & Performance

  • Monitor and analyse product runtime environments (production and non-production) to ensure optimal system performance.
  • Implement continuous improvement strategies to enhance system reliability and efficiency.
  • Deploy full-stack monitoring with predictive analytics (CloudWatch Anomaly Detection, Stackdriver, Azure Monitor).
  • Integrate alerting with central NOC/SOC for faster escalation and resolution.
  • Build and maintain monitoring dashboards to surface real-time infrastructure health metrics.
  • Implement log aggregation and analysis using ELK Stack (Elasticsearch, Logstash, Kibana), Splunk, or an equivalent (optional).

Incident & Problem Management

  • Manage application and security incidents, performing problem determination and coordinating with internal teams and vendors for resolution.
  • Escalate issues as necessary to minimize business impact.
  • Lead and coordinate with operations teams and vendors to ensure 24/7 system support availability.
  • Facilitate communication between teams to resolve operational issues efficiently.
  • Conduct post-incident reviews (PIRs) and drive root cause analysis (RCA) to prevent recurrence.
  • Maintain and continuously improve runbooks and standard operating procedures (SOPs) for common incidents.

Operational Processes & Compliance

  • Develop and maintain operations and process guides to meet audit and compliance requirements.
  • Handle day-to-day operational activities, analyse performance data, and prepare status reports for stakeholders and management.
  • Ensure operational processes align with IM8 and ISO 27001 standards.
  • Conduct periodic compliance drills and support audit preparation.
  • Manage change advisory board (CAB) submissions and coordinate change windows in accordance with ITIL change management practices.

Security

  • Implement security practices aligned with industry standards to protect organizational data and infrastructure.
  • Plan, implement, and monitor system security architecture, including threat and risk assessments.
  • Perform security checks such as vulnerability assessments and system hardening.
  • Apply secure configurations and security controls for infrastructure and applications.
  • Configure and manage network security controls including WAF, firewall rules, VPN gateways, and security groups.
  • Ensure network segmentation and least-privilege access across all environments.

Experience and Skills Needed

Core Technical Skills

Strong understanding of:

  • Networking
  • Windows Server administration (Active Directory, DNS, etc.)
  • Linux administration
  • Nginx
  • Squid forward proxy
  • GitLab Runner
  • Public cloud platforms (AWS, Azure, Google Cloud)
  • Terraform
  • Git and modern branching workflows
  • Scripting (PowerShell / Bash / Python)
  • Kubernetes administration
  • Ansible
  • Monitoring tools (cloud-native tooling, Grafana, Prometheus)
  • ELK Stack (Elasticsearch, Logstash, Kibana) or Splunk (optional)
  • Cloud cost management tools (AWS Cost Explorer, Azure Cost Management, GCP Billing)
  • Network security tools (WAF, firewall rule management, site-to-site and client VPN)

Infrastructure & Platform Experience

  • Experience working with high availability, high performance, and high security multi-data-center systems.
  • Experience with hybrid cloud environments.
  • Experience designing and maintaining network architecture including VPCs, subnets, peering, and transit gateways.
  • Hands-on experience with container orchestration platforms (Kubernetes, EKS, AKS, GKE) in production (optional).

Must Have

  • A bachelor’s degree in Computer Science, Information Technology, or a related field.
  • 2–5 years of relevant experience.
  • Proven experience as an Operations or Cloud Engineer, or in a similar IT role.
  • Familiarity with ITSM tools (e.g., Remedy, Zendesk, ServiceDesk) for change and incident management workflows.
  • Experience in implementing security and access controls for production and test environments.
  • Proficiency with full stack monitoring tools (e.g., APM tools, CloudWatch, Stackdriver, OpenAPM stack).
  • Cloud infrastructure experience.
  • Strong problem-solving and communication skills, with the ability to explain complex issues to non-technical audiences.
  • A collaborative, resourceful mindset with the ability to deliver innovative solutions.
  • Experience with Linux and Windows administration.
  • Demonstrated experience managing SLAs, SLOs, and producing operational reports for stakeholders.
  • Experience with FinOps practices or cloud cost governance in a production environment.

Good to Have

  • Experience with Singapore Government projects is an advantage.
  • Database experience and scripting experience (Shell script / PowerShell / Python) are an advantage.

Certifications

  • AWS Certified Solutions Architect – Associate (Professional is a plus)
  • Microsoft Certified: Azure Administrator Associate (Professional is a plus)
  • Google Cloud – Associate Cloud Engineer (ACE)
  • ITIL 4 Foundation

By submitting your resume/CV, you consent and agree to allow the information provided to be used and processed by or on behalf of Xtremax Pte Ltd for purposes related to your registration of interest in current or future employment with us and for the processing of your application for employment.

You also represent to us that you have obtained the consent of your referees when you disclose to us their personal data for the purpose of conducting reference checks.

The personal data held by us relating to your application will be kept strictly confidential and in accordance with the PDPA. You may also refer to our Privacy Policy for more details.

We regret to inform you that should you not consent to providing the necessary data required for us to process your application, your application will be considered void.

Original job Cloud Operations Engineer posted on GrabJobs ©. To flag any issues with this job please use the Report Job button on GrabJobs.
About the Company

XTREMAX PTE. LTD.

Xtremax Pte Ltd is a web design and software development company based in Singapore that creates digital experiences to help our clients build sustainable relationships. Since opening our doors in 2003 in Singapore, we have grown to become a strategic digital partner to government, large organiza...

Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.