Resilience Lead

Company: Qode
Job Type: Full Time

Job Description - Resilience Lead

Job Title: Resilience, Testability & Scalability Lead

Location: Fort Mill, SC / New York / New Jersey (Hybrid)

Data Platforms – Engineering Quality & Resilience Track

Role Overview:

We are looking for a technically strong Resilience, Testability & Scalability Lead to drive engineering excellence across our data platforms and cloud-based applications. This role is critical to ensuring system uptime, test automation maturity, performance at scale, and architectural resilience that meet stringent regulatory and service-level demands.

The ideal candidate will have a deep background in designing highly available systems, implementing robust disaster recovery, managing scalable cloud infrastructure, and building automated, testable, and observable platforms, especially within AWS and Kubernetes environments.

Key Responsibilities:

•Design and implement high availability and failover strategies across multi-zone AWS deployments

•Lead the development and execution of disaster recovery and business continuity plans, including RTO/RPO validation and cross-region strategies

•Define testability strategies, test data management frameworks, and performance testing protocols

•Enable infrastructure and application resilience by introducing circuit breakers, retry patterns, service meshes, and graceful degradation mechanisms

•Establish real-time monitoring, alerting, and log aggregation frameworks using tools like CloudWatch and Prometheus

•Drive test automation and quality engineering best practices, integrating with CI/CD pipelines

•Optimize application and data layer performance through query tuning, caching, and indexing strategies

•Scale data processing using distributed frameworks like Apache Spark, and implement event-driven stream processing with Kafka

•Collaborate with platform, DevOps, and SRE teams to ensure resource efficiency, cost control, and performance SLAs

•Contribute to regulatory readiness by enforcing security, encryption, and audit logging standards

Required Skills & Experience:

Infrastructure Resilience & DR:

•Multi-AZ deployments, auto-scaling, load balancing, circuit breakers

•Disaster recovery design: backup/restore, cross-region replication, RTO/RPO

Monitoring & Observability:

•Experience with CloudWatch, Prometheus, log aggregators

•Alerting for incident response, covering latency, throughput, and error rates

Application Resilience & Security:

•Error handling, service degradation, exponential backoff

•Security best practices: IAM policies, encryption at rest and in transit

•Familiarity with FINRA/SIPC compliance standards (preferred)

Test Automation & Quality:

•Unit testing (e.g., pytest), integration testing, E2E automation

•Test data generation, synthetic data, environment provisioning

•Performance testing with JMeter or Gatling, including stress and capacity testing

•Code reviews, static analysis, data validation, anomaly detection

Scalability & Optimization:

•Horizontal scaling using Kubernetes, Docker, service discovery

•API Gateway, caching layers (Redis, Memcached), DB partitioning

•Connection pooling, capacity planning, cost-aware architecture

Data & Stream Processing:

•Spark cluster management, parallel processing, big data optimization

•Kafka-based messaging, windowing, and aggregation for real-time data

Preferred Qualifications:

•Experience in financial services or regulated environments

•Familiarity with LPL’s enterprise data and platform modernization initiatives

•AWS or Kubernetes certifications

•Strong communication skills and cross-functional collaboration experience

Original job Resilience Lead posted on GrabJobs ©.

Copyright © 2026 Grabjobs Pte.Ltd. All Rights Reserved.