High availability for VPC reference architecture
The High availability (HA) overview describes why HA is important for all cloud-based applications. This article covers the specifics of HA in the VPC reference architecture.
Your workloads on VPC infrastructure
Workloads in multizone regions
Distribute workloads across zones
Recall that every VPC exists in a single region, which is the geographic area in which the VPC is deployed. As the first step in creating an HA solution, distribute your workloads across multiple zones within that region so that, if one zone goes down, the other zones can continue to support the workload. Use Application Load Balancer for VPC (ALB) to distribute your traffic across zones in a region.
Scale based on resource demands
Even with multiple zones, you should take additional steps to enhance resiliency by ensuring that your environment can scale out based on the resources that your workloads use.
Virtual servers
With Auto Scale for VPC, you can improve performance and control costs by dynamically creating virtual server instances to meet the demands of your environment. Auto Scale for VPC is highly recommended if you are using virtual servers. You set scaling policies that define your desired average utilization for metrics like CPU, memory, and network usage. The policies that you define determine when virtual server instances are added to or removed from your instance group.
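To make the idea of a target average utilization concrete, the following is a minimal sketch of proportional target-tracking scaling. It is not the algorithm that Auto Scale for VPC uses internally, which is not described here. The function name, its parameters, and the minimum and maximum group sizes are all assumptions chosen for illustration.

```python
import math


def desired_instance_count(current_count, observed_avg, target_avg,
                           min_count, max_count):
    """Return a suggested instance group size (hypothetical helper).

    Assumption: the group is resized in proportion to how far the observed
    average utilization is from the target, then clamped to the allowed
    range. This is a generic target-tracking rule, not IBM's documented
    behavior.
    """
    if current_count < 1:
        raise ValueError("current_count must be at least 1")
    if target_avg <= 0:
        raise ValueError("target_avg must be positive")
    if min_count > max_count:
        raise ValueError("min_count must not exceed max_count")

    # Scale the group by the ratio of observed to target utilization,
    # rounding up so the group is never undersized for the target.
    raw = math.ceil(current_count * observed_avg / target_avg)

    # Keep the result inside the configured bounds.
    return max(min_count, min(max_count, raw))


if __name__ == "__main__":
    # 4 instances at 90% average CPU with a 60% target suggests scaling out.
    print(desired_instance_count(4, 90.0, 60.0, min_count=2, max_count=10))
    # 4 instances at 20% average CPU with a 60% target suggests scaling in.
    print(desired_instance_count(4, 20.0, 60.0, min_count=2, max_count=10))
```

Under this rule, a group that runs hotter than its target grows and a group that runs cooler shrinks, while the bounds prevent the group from dropping below a safe minimum or growing without limit.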
Containers
For Red Hat OpenShift on IBM Cloud, you can use the cluster-autoscaler add-on to automatically scale the worker pools in your cluster, increasing or decreasing the number of worker nodes in a worker pool based on the sizing needs of your scheduled workloads. The cluster-autoscaler add-on is based on the Kubernetes Cluster-Autoscaler project. See Autoscaling clusters for more information.
In addition, it is possible to autoscale your pods. See Scaling apps for more information.
Workloads in multiple regions
Despite the steps you take to leverage multiple zones in a region and increase resiliency with auto scaling, it is still possible for an entire region to be taken out of service. For example, a natural disaster like a hurricane, tornado, or earthquake could knock out multiple zones in a region.
So, it is also recommended to establish and configure an alternate processing site in at least one geographically separate multizone region to ensure the continuation of secure system operation. Then, in addition to spreading your workloads across zones in a region, you can distribute your workloads across regions. However, your consumers must be able to opt out of having their data stored in the alternate region based on their data residency requirements.
The diagram below shows an expansion of the VPC reference architecture to use multiple regions. The key to making this work is the global load balancing functionality in DNS Services. The global load balancer offers HA and geographical distribution of your traffic, based on the health of your origin servers and the geographical region where the user request originates. So, if one region becomes unhealthy, traffic is routed to the next closest healthy region.
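The routing behavior described above, where traffic goes to the closest region that is still healthy, can be sketched as follows. This is an illustrative model only and does not call the DNS Services API. The RegionPool type, the pick_region function, and the distance values are hypothetical, and the region names are used only as sample labels.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RegionPool:
    """Hypothetical view of one region's origin pool."""
    name: str
    healthy: bool        # assumed result of a health check
    distance_km: float   # assumed proximity to the requesting user


def pick_region(pools: List[RegionPool]) -> Optional[RegionPool]:
    """Return the closest healthy region, or None if every region is down."""
    healthy = [pool for pool in pools if pool.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda pool: pool.distance_km)


if __name__ == "__main__":
    pools = [
        RegionPool("us-south", healthy=False, distance_km=300.0),
        RegionPool("us-east", healthy=True, distance_km=1900.0),
        RegionPool("eu-de", healthy=True, distance_km=8200.0),
    ]
    chosen = pick_region(pools)
    # The nearest region is unhealthy, so the next closest healthy one is used.
    print(chosen.name if chosen else "no healthy region")
```

The sketch filters on health first and proximity second, so an unhealthy region is never selected no matter how close it is to the user.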
For completeness, the diagram below shows the VPC architecture spread across multiple regions when Red Hat OpenShift on IBM Cloud is used.
See the following tutorials for more information:
HA for IBM Cloud services
Your high availability strategy needs to consider all of the IBM Cloud services in your deployment. See High availability for IBM Cloud services for more information.