Simple and cost-effective application high availability using Pubsub2Inbox and regional persistent disk
You might have read Build HA services using regional Persistent Disk, which neatly outlines the different options for providing high availability in case of a zonal outage. While you can always perform the failover manually with gcloud, you may want to automate it. At minimum, this involves moving a data disk between two instances by using the force-attach method.
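For reference, a manual failover boils down to roughly the following two gcloud commands (a sketch only; the disk, instance, zone and region names are placeholders):
# Optional but recommended: snapshot the regional disk before moving it
gcloud compute disks snapshot my-regional-disk \
  --region=us-central1 --snapshot-names=my-regional-disk-pre-failover

# Force-attach the disk to the standby instance, even if it is still
# attached to the (possibly unreachable) primary instance
gcloud compute instances attach-disk standby-instance \
  --zone=us-central1-b --disk=my-regional-disk \
  --disk-scope=regional --force-attach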
The architecture
In this article, I’ll provide an example of how to fail over an instance automatically based on a Pub/Sub message (health check logs are a good candidate for the source, since they are only written on state transitions, i.e. from healthy to unhealthy or vice versa). The setup will look like this:
- Two instances created in two different zones, with two zonal boot disks (the instances will be running Nginx just for demonstration purposes)
- A regional persistent disk that is swapped between two instances (and snapshotted just before the move for extra safety)
- Two unmanaged instance groups (UMIGs) containing the aforementioned instances
- One regional passthrough TCP load balancer (you can also use an HTTP load balancer) with an HTTP health check
The last piece of the puzzle is my Swiss Army knife for Pub/Sub, Pubsub2Inbox. We’ll be using its computeengine and loadbalancing processors to perform the failover and to update the load balancer backend service.
To decide when to do a failover, we’ll use the health check log messages as the signal to initiate actions. This is easy for one reason: the health check only logs messages during a transition from one status to another (e.g. from HEALTHY to UNHEALTHY and vice versa), so we don’t need to filter or de-duplicate the Pub/Sub messages.
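If you want to see what these entries look like in your own project, you can read them back with gcloud (replace your-project-id); the new state is carried in jsonPayload.healthCheckProbeResult.healthState, which is what the log sink filter later in this article keys on:
gcloud logging read \
  'logName="projects/your-project-id/logs/compute.googleapis.com%2Fhealthchecks"' \
  --limit=5 --format=json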
The full Terraform sample code is available at: https://github.com/GoogleCloudPlatform/pubsub2inbox/tree/main/examples/repd-failover
Code walkthrough
First, we create two instances (using Container-Optimized OS with an Nginx container and a startup script that initializes and mounts the disk):
module "primary-vm" {
source = "github.com/GoogleCloudPlatform/cloud-foundation-fabric//modules/compute-vm?ref=daily-2023.11.11"
project_id = module.project.project_id
zone = var.zones.primary
name = "repd-failover-primary"
boot_disk = {
initialize_params = {
image = "projects/cos-cloud/global/images/family/cos-stable"
type = "pd-ssd"
size = 10
}
}
attached_disks = [{
name = google_compute_region_disk.data-disk.name
size = google_compute_region_disk.data-disk.size
source_type = "attach"
source = google_compute_region_disk.data-disk.id
}]
network_interfaces = [{
network = module.vpc.self_link
subnetwork = module.vpc.subnet_self_links[format("%s/%s", var.region, var.vpc_config.subnetwork)]
}]
tags = ["repd-failover"]
metadata = {
user-data = module.cos-nginx.cloud_config
google-logging-enabled = true
}
service_account = {
auto_create = true
}
}
module "secondary-vm" {
source = "github.com/GoogleCloudPlatform/cloud-foundation-fabric//modules/compute-vm?ref=daily-2023.11.11"
project_id = module.project.project_id
zone = var.zones.secondary
name = "repd-failover-secondary"
boot_disk = {
initialize_params = {
image = "projects/cos-cloud/global/images/family/cos-stable"
type = "pd-ssd"
size = 10
}
}
attached_disks = []
network_interfaces = [{
network = module.vpc.self_link
subnetwork = module.vpc.subnet_self_links[format("%s/%s", var.region, var.vpc_config.subnetwork)]
}]
tags = ["repd-failover"]
metadata = {
user-data = module.cos-nginx.cloud_config
google-logging-enabled = true
}
service_account = {
email = module.primary-vm.service_account_email
}
}
Then we create two unmanaged instance groups (UMIGs) that each hold one of the instances. We can then plug these UMIGs into a regional passthrough Network Load Balancer on TCP port 80, with a health check that just probes the root document:
module "nlb" {
source = "github.com/GoogleCloudPlatform/cloud-foundation-fabric//modules/net-lb-ext?ref=daily-2023.11.11"
project_id = module.project.project_id
region = var.region
name = "repd-failover"
backend_service_config = {
protocol = "TCP"
port_name = "http"
}
forwarding_rules_config = {
"" = {
ports = [80]
}
}
group_configs = {
umig-primary = {
zone = module.primary-vm.instance.zone
instances = [
module.primary-vm.self_link
]
named_ports = { "http" = 80 }
}
umig-secondary = {
zone = module.secondary-vm.instance.zone
instances = [
module.secondary-vm.self_link
]
named_ports = { "http" = 80 }
}
}
backends = [{
group = module.nlb.groups.umig-primary.self_link
}]
health_check_config = {
enable_logging = true
check_interval_sec = 5
healthy_threshold = 2
timeout_sec = 5
unhealthy_threshold = 4
http = {
port = 80
}
}
}
Then it’s time to put together a log sink in the project that will capture the health check logs and send them over to a Pub/Sub topic we create:
module "project" {
source = "github.com/GoogleCloudPlatform/cloud-foundation-fabric//modules/project?ref=daily-2023.11.11"
name = var.project_id
project_create = false
services = [
"compute.googleapis.com",
"cloudfunctions.googleapis.com",
"run.googleapis.com",
]
logging_sinks = {
repd-failover-healthcheck = {
destination = module.pubsub.id
filter = <<-EOT
logName="projects/${var.project_id}/logs/compute.googleapis.com%2Fhealthchecks" AND
jsonPayload.healthCheckProbeResult.healthState="UNHEALTHY" AND
resource.type="gce_instance_group" AND
resource.labels.instance_group_name:"repd-failover"
EOT
type = "pubsub"
}
}
}
module "pubsub" {
source = "github.com/GoogleCloudPlatform/cloud-foundation-fabric//modules/pubsub?ref=daily-2023.11.11"
project_id = var.project_id # Use var to avoid circular dependency
name = "repd-failover"
iam = {
}
}
Finally, we’ll use a configuration template for Pubsub2Inbox, plug in the necessary values from Terraform, and deploy a Cloud Functions v2 function in the project:
module "pubsub2inbox" {
source = "../.."
project_id = module.project.project_id
region = var.region
function_name = "repd-failover"
function_roles = ["compute-engine"]
cloud_functions_v2 = true
service_account = "repd-failover-pubsub2inbox"
pubsub_topic = module.pubsub.id
config = templatefile("${path.module}/repd-failover.yaml.tpl", {
concurrency_bucket = module.lock-bucket.name
project = module.project.project_id
primary = {
instance = module.primary-vm.instance.name
zone = module.primary-vm.instance.zone
instance_group = module.nlb.group_self_links["umig-primary"]
}
secondary = {
instance = module.secondary-vm.instance.name
zone = module.secondary-vm.instance.zone
instance_group = module.nlb.group_self_links["umig-secondary"]
}
regional_disk = {
id = google_compute_region_disk.data-disk.id
region = var.region
device_name = google_compute_region_disk.data-disk.name
}
load_balancer = {
backend_service = module.nlb.backend_service.name
region = var.region
}
})
use_local_files = true
local_files_path = "../.."
bucket_name = format("repd-failover-source-%s", random_id.random.hex)
bucket_location = var.region
}
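After the apply, you can sanity-check the deployment with gcloud (the function name repd-failover comes from the configuration above; the region is whatever you set in var.region):
gcloud functions describe repd-failover \
  --gen2 --region=<region> --project=your-project-id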
If you cloned the example code, you can apply the Terraform in your project (it will create the VPC and all the necessary infrastructure components):
examples/repd-failover$ terraform apply -var project_id=your-project-id
Plan: 44 to add, 0 to change, 0 to destroy.
Changes to Outputs:
+ instructions = (known after apply)
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
Apply complete! Resources: 44 added, 0 changed, 0 destroyed.
Outputs:
instructions = <<EOT
To access the application, open: http://34.123.123.123/
EOT
Now the setup should be active and serving on the load balancer IP address.
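A quick way to confirm is to request the address from the Terraform output with curl (the IP below is just the illustrative one from the sample output above); it should return whatever the Nginx container serves:
curl -s http://34.123.123.123/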
We can now test the failover either by stopping the Nginx container or, more easily, by simply terminating the current instance.
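For example, you can stop the primary instance with gcloud, which will make the health check fail and the state transition get logged (the zone is whatever you configured as var.zones.primary):
gcloud compute instances stop repd-failover-primary \
  --zone=<primary-zone> --project=your-project-id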
In a minute or two, the disk should have failed over and the instance in the other zone should be running. We can check the load balancer status to see that the backends were updated and that the application is again serving traffic.
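One way to check the backend health from the command line is shown below (this assumes the backend service created by the module is named repd-failover; gcloud compute backend-services list shows the actual name):
gcloud compute backend-services get-health repd-failover \
  --region=<region> --project=your-project-id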
Conclusion
While there are many fine options for achieving high availability (and not all of them are cloud-specific), this method provides a relatively application-independent approach.
However, it may not be the best option in all cases, especially with more complex applications. You could also consider solutions like:
- Pacemaker (open source cluster manager)
- Using managed services (Cloud Functions, GKE, etc.)
- Running in regional managed instance groups and replicating data via Cloud Storage or a filer, application-level replication mechanisms, or database clustering services
One other thing to keep in mind is that force-attaching a disk to another instance is a disruptive operation from an I/O perspective and requires the filesystem to handle journaling correctly (there is always a non-zero chance of filesystem issues due to possibly lurking bugs). In the example, we’ve added automatic snapshotting of the disk to reduce the chance of data loss.
Also, if you run terraform apply again, Terraform will want to switch the disk back to the primary instance and change the backend to the primary UMIG. Before performing further Terraform configuration changes, you should make sure the application is being served from the primary VM (or augment the modules to have ignore_changes for certain fields).
Finally, zonal outages may not be as clear-cut as a complete outage. You can also fail over the application manually by posting a message containing just {} to the Pub/Sub topic.
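For example, with gcloud (the topic name comes from the pubsub module above):
gcloud pubsub topics publish repd-failover \
  --message='{}' --project=your-project-id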