GENERAL DESCRIPTION :
The Senior Executive (Production & Disaster Recovery) is responsible for ensuring the stability, availability, and recoverability of production systems and disaster recovery (DR) environments across KPJ Hospitals and the Managed Private Cloud.
This role supports day-to-day production operations while planning, coordinating, and executing backup, DR readiness, and recovery activities to meet business continuity, regulatory, and operational requirements.
The role works closely with the Infrastructure, Applications, and Cybersecurity teams and with vendors to ensure systems operate within agreed SLAs, RTOs, and RPOs, while minimizing downtime and operational risk.
JOB DESCRIPTION :
1. Production Operations & Support
Monitor and support production systems availability and performance
Ensure systems meet SLA, RTO, and RPO requirements
Coordinate issue resolution with internal teams and vendors
Support system changes, patching, and releases with minimal impact
Support after-hours and on-call activities as part of operational requirements and team rotation
2. Server Infrastructure & Project Support
Prepare, evaluate, design, implement, and support server infrastructure initiatives, ensuring alignment with production stability and disaster recovery requirements
Support infrastructure refresh, upgrade, migration, and enhancement activities
Ensure project deliverables comply with operational, security, and DR standards
3. Disaster Recovery & Business Continuity
Manage and maintain DR environments and recovery procedures
Coordinate and execute DR drills, failover, and failback activities
Ensure DR plans and runbooks are tested, updated, and documented
Support audit, regulatory, and compliance requirements
4. Backup & Data Protection
Monitor backup operations and resolve failures
Perform restore testing to validate data integrity
Ensure backup policies align with retention and security standards
5. Incident & Problem Management
Act as a key responder during production incidents and major outages
Participate in incident investigation and root cause analysis (RCA)
Provide timely status updates and operational reports
6. Governance & Continuous Improvement
Maintain SOPs, runbooks, and DR documentation
Follow change management and governance processes
Identify opportunities to improve system resilience and recovery capability
Perform ad-hoc tasks to support infrastructure operations and improvement initiatives across systems, identity, and related platforms as required
Mentor and guide junior engineers by promoting best practices, knowledge sharing, and operational excellence
JOB REQUIREMENT :
Education: Bachelor's Degree in Information Technology/Computer Science
Knowledge and Experience:
5+ Years of Relevant Experience: Proven track record with hands-on experience in managing and maintaining server infrastructure and operations.
Server Infrastructure Architecture & Design: In-depth understanding and experience in designing, architecting, and implementing server infrastructure to meet business needs and optimize performance.
Backup Infrastructure Management: Solid experience in designing, configuring, installing, and supporting backup infrastructure, including offsite backup solutions for data protection and business continuity.
Disaster Recovery Expertise: Experienced in managing disaster recovery processes, with direct involvement in running disaster recovery drills and handling actual recovery situations.
Cloud Operations: Hands-on experience managing cloud operations (AWS, Azure, or Google Cloud), including migrating, scaling, and optimizing cloud environments, is highly desirable.
Collaborative Team Player: Proven ability to work effectively in team environments, collaborating with colleagues at all levels of the organization to meet project goals and deadlines.
Positive Attitude & Continuous Learning: A proactive approach with a strong desire to learn and stay updated with the latest technologies and industry trends, adapting to new challenges with enthusiasm.
Skills & Competencies:
Veeam Certified Engineer (VMCE) or equivalent backup and data protection certification
VMware Certified Professional (VCP) or relevant virtualization certification (advantage)
Certifications related to disaster recovery, data protection, or infrastructure resilience (advantage)
Technical skills required
Enterprise backup and recovery platform administration (e.g., Veeam)
Backup infrastructure management and policy configuration
Disaster recovery planning and execution
Backup monitoring, troubleshooting, and restoration procedures
Virtualization platform integration with backup systems
Data protection and storage management concepts
Special skills required
Strong analytical and troubleshooting skills for backup and recovery operations
Ability to support DR drills, recovery testing, and service restoration activities
Good documentation and reporting skills for backup operations and compliance records
Awareness of data protection best practices and infrastructure resilience
Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.