In this post, Marcos Ortiz and Khubyar Behramsha from AWS explain how organizations can move from a single-region to a multi-region Amazon API Gateway architecture with a failover mechanism that does not depend on AWS control plane operations. The post emphasizes relying on the data plane for recovery and discusses the need to fail over each service behind a shared public API independently.
The transition to a multi-region architecture can be challenging for organizations, especially when each service must be able to fail over independently. The post presents an approach in which each service has its own subdomain, so one service can be failed over without moving the others. By deploying services in both the primary and secondary regions with the same custom domain configuration, organizations gain granular control over service failover.
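The per-service subdomain idea can be illustrated with a small model. The sketch below is not taken from the post: the `ServiceRoute` class, the `example.com` subdomains, and the region names are all hypothetical, and the routing logic is a deliberate simplification (it returns `None` when both regions are switched off and does not model real DNS behavior in that case).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceRoute:
    """Toy model of one service subdomain deployed in two regions (hypothetical)."""
    subdomain: str
    primary_region: str
    secondary_region: str
    primary_on: bool = True      # stands in for the primary routing switch
    secondary_on: bool = True    # stands in for the secondary routing switch

    def active_region(self) -> Optional[str]:
        # Prefer the primary region while its switch is on,
        # otherwise fall back to the secondary region.
        if self.primary_on:
            return self.primary_region
        if self.secondary_on:
            return self.secondary_region
        return None  # simplification: both switches are off


# Placeholder subdomains and regions, chosen only for illustration.
routes = {
    "service1": ServiceRoute("service1.example.com", "us-east-1", "us-west-2"),
    "service2": ServiceRoute("service2.example.com", "us-east-1", "us-west-2"),
}

# Fail over service1 only; service2 keeps serving from its primary region.
routes["service1"].primary_on = False

for name, route in routes.items():
    print(name, route.subdomain, route.active_region())
```

Because each service owns its subdomain and its own pair of switches, changing one entry leaves the others untouched, which is the granular control described above.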
The post also walks through an active-passive, manual failover example built on Amazon Route 53 Application Recovery Controller (ARC). Failover is triggered by changing routing control states through the ARC cluster endpoints, which removes the dependency on manual DNS record edits. With routing controls and DNS health checks in place, organizations can redirect traffic between regions without touching the DNS configuration itself.
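A routing control flip of this kind might be scripted roughly as follows. This is a minimal sketch, not the post's own tooling: the helper names, ARNs, and endpoint URLs are placeholders, and the client is passed in through a factory so the logic can be exercised without AWS access. With boto3, the factory could be `lambda region, url: boto3.client("route53-recovery-cluster", region_name=region, endpoint_url=url)`, whose `update_routing_control_states` call takes the entry list built here.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def build_failover_entries(primary_arn: str, secondary_arn: str) -> List[Dict[str, str]]:
    """Entries that turn the primary routing control Off and the secondary On."""
    return [
        {"RoutingControlArn": primary_arn, "RoutingControlState": "Off"},
        {"RoutingControlArn": secondary_arn, "RoutingControlState": "On"},
    ]


def fail_over(
    endpoints: Iterable[Tuple[str, str]],
    make_client: Callable[[str, str], Any],
    primary_arn: str,
    secondary_arn: str,
) -> str:
    """Try each (region, endpoint_url) pair until one accepts the update.

    Returns the endpoint URL that succeeded. Assumes the client exposes
    update_routing_control_states, as the boto3 route53-recovery-cluster
    client does.
    """
    entries = build_failover_entries(primary_arn, secondary_arn)
    last_error = None
    for region, url in endpoints:
        try:
            client = make_client(region, url)
            client.update_routing_control_states(
                UpdateRoutingControlStateEntries=entries
            )
            return url
        except Exception as exc:  # this endpoint is unavailable; try the next one
            last_error = exc
    raise RuntimeError("no cluster endpoint accepted the update") from last_error
```

Looping over several cluster endpoints reflects the data plane emphasis: if one endpoint is unreachable, another can still accept the state change.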
To implement this architecture, organizations need a public domain and AWS Certificate Manager certificates, and must follow the post's detailed instructions to deploy the multi-region stacks for the services and APIs. Testing the failover mechanism involves running a provided script that checks the responses coming from each region after failover actions are triggered through Amazon Route 53 ARC.
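A check of this kind could look roughly like the sketch below. It is not the script provided with the post: the function names are hypothetical, and it assumes each service returns a JSON body with a `region` field, which may differ from the real response shape. The fetch function is injectable so the tally logic can be tried without network access.

```python
import json
import urllib.request
from collections import Counter
from typing import Callable, Dict, Iterable


def fetch_region(url: str, timeout: float = 5.0) -> str:
    """Call a service URL and return the region named in its JSON body.

    Assumes a response like {"region": "us-east-1"}; that field name is
    an assumption made for this sketch.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body.get("region", "unknown")


def tally_regions(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_region,
    attempts: int = 3,
) -> Dict[str, Dict[str, int]]:
    """Call each URL several times and count which region answered."""
    results: Dict[str, Dict[str, int]] = {}
    for url in urls:
        counts: Counter = Counter()
        for _ in range(attempts):
            try:
                counts[fetch(url)] += 1
            except Exception:
                counts["error"] += 1  # request failed or body was not as assumed
        results[url] = dict(counts)
    return results
```

Running `tally_regions` against each service subdomain before and after a failover shows whether only the intended service moved to the secondary region.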
Overall, this solution gives organizations better control over critical workloads in a multi-region setup by letting them manage failover at the service level. The separation of frontend and backend services reduces consumer impact and simplifies the transition from single-region to multi-region architectures. For more on resilience and serverless topics, readers are encouraged to explore the AWS Architecture Blog and AWS resources on serverless technologies.
Article Source
https://aws.amazon.com/blogs/compute/implementing-multi-region-failover-for-amazon-api-gateway/