Multi-Region Disaster Recovery: Streamlined with New Features

Former Employee

Nov 15, 2023

In our ongoing efforts to enhance data resilience and recovery capabilities, we are excited to announce the public preview of substantial enhancements to our multi-region disaster recovery (DR) strategy for Azure Database for PostgreSQL - Flexible Server. This upgrade is centered around two pivotal features: Virtual Endpoints and the Promote to primary server operation, each designed to simplify and strengthen DR capabilities.

Seamless Disaster Recovery with Virtual Endpoints and Promotion Options

A crucial aspect for any mission-critical application is maintaining operations during unforeseen regional disruptions. To address this, we've rolled out the Promote to primary server feature. This new addition allows for a swift and straightforward swap of roles between your primary database and its replica, ensuring continuous operations with minimal interruption.

In tandem with this, we introduce Virtual Endpoints: your database's steadfast links that remain unchanged, irrespective of role swaps. As a result, the writer endpoint will always connect to the current primary server, and the reader endpoint will point to the current replica. This feature ensures that, even after a promotion event, your application's connection strings remain valid, eliminating the need for manual updates and thereby streamlining your disaster recovery operations.
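As a rough illustration of why connection strings survive a promotion, the sketch below builds libpq-style connection strings against a virtual endpoint rather than a specific server. The hostname pattern, the endpoint name `contoso-app`, the database name, and the user are all assumptions made for the example; check the endpoint names shown for your own server in the Azure portal.

```python
# Assumption: virtual endpoint hostnames follow this pattern. Confirm the
# exact hostnames displayed for your server before relying on them.
WRITER_SUFFIX = "writer.postgres.database.azure.com"
READER_SUFFIX = "reader.postgres.database.azure.com"


def build_dsn(endpoint_name: str, role: str, dbname: str, user: str) -> str:
    """Return a libpq keyword/value connection string for a virtual endpoint.

    `role` is "writer" (current primary) or "reader" (current replica).
    """
    suffix = {"writer": WRITER_SUFFIX, "reader": READER_SUFFIX}[role]
    host = f"{endpoint_name}.{suffix}"
    return f"host={host} port=5432 dbname={dbname} user={user} sslmode=require"


# Hypothetical names, used only for illustration.
writer_dsn = build_dsn("contoso-app", "writer", "orders", "app_user")
reader_dsn = build_dsn("contoso-app", "reader", "orders", "app_user")
print(writer_dsn)
print(reader_dsn)
```

Because the application only ever references the endpoint name, nothing in these strings has to change when the servers behind the endpoints swap roles.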

To locate this feature, simply navigate to the replication blade in the Azure portal. Here, you can manage these settings and familiarize yourself with the seamless disaster recovery options available.

Screenshot of replication blade in Azure portal.

Disaster Management Simplified

These enhancements collectively simplify your disaster recovery operations. They enable a hassle-free, manual failover to a read replica in a different (or the same) region and an equally smooth return to the primary server, streamlining the failover and failback processes. This simplicity is a game-changer for running regular disaster recovery drills, allowing for a swift and efficient response with virtually no negative impact on your applications.

Below you'll find a diagram that illustrates the pre-promotion configuration with virtual endpoints set up to point to the current primary and replica servers. After executing the promote to primary server operation, the roles of the servers are exchanged, and the virtual endpoints automatically adjust to align with the new server roles.

Diagram of promote to primary server flow.
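The role swap described above can also be pictured as a small mapping from endpoint to server. This is only a toy model of the behavior, not how the service is implemented, and the server names `server-east` and `server-west` are hypothetical.

```python
def promote_to_primary(endpoints: dict[str, str]) -> dict[str, str]:
    """Model a promote to primary server operation.

    The endpoint names ("writer", "reader") never change; only the
    servers they point to are exchanged.
    """
    return {"writer": endpoints["reader"], "reader": endpoints["writer"]}


# Hypothetical server names for illustration.
before = {"writer": "server-east", "reader": "server-west"}

after = promote_to_primary(before)      # failover: roles are exchanged
failback = promote_to_primary(after)    # failback: original roles restored

print(after)
print(failback == before)
```

Running the same operation twice returns the mapping to its starting state, which mirrors the failover and failback cycle used in a disaster recovery drill.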

What about the Previous Promote Feature?

It's important to note that the existing functionality, which allows a read replica to become a standalone server, has been renamed to "Promote to independent server and remove from replication". This clarifies the different use cases between creating an independent server versus swapping server roles within a replication setup.

You have the option to use virtual endpoints with the "Promote to independent server and remove from replication" action, although it's not a requirement. The accompanying diagram shows the change in server roles and how the virtual endpoints behave after the promote operation.

Diagram of promote to independent server flow.

Will I Lose Any Data During the Promote?

The promotion of a read replica to a primary server in Azure Database for PostgreSQL - Flexible Server can be executed in two ways:

Planned: This method is the default and safest approach for promoting a replica. It ensures that data on the replica is fully synchronized with the primary by applying all pending logs. Only after the data is consistent and up to date will the replica accept client connections, thereby safeguarding against data loss.
Forced: This option is tailored for scenarios that require a quick recovery, such as a regional outage. It prioritizes getting the replica server operational over data consistency. The server is promoted after processing only the write-ahead log (WAL) files necessary to reach the nearest consistent state. Any transactions that have not been replicated to the replica when it is delinked from the primary are lost, so the amount of data loss corresponds to the replication lag at the time of promotion.

By choosing the correct promotion option for your situation, you can optimize for data consistency or recovery speed as required.
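One way to make that choice explicit in a runbook is a small decision helper like the sketch below. The policy it encodes (use planned whenever the primary is reachable, fall back to forced otherwise) and the idea of expressing replication lag in seconds are assumptions for illustration, not rules stated by the service; `plan_promotion` and `PromotePlan` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PromotePlan:
    option: str               # "planned" or "forced"
    possible_data_loss: bool  # True if unreplicated transactions may be lost
    note: str


def plan_promotion(primary_reachable: bool, replication_lag_seconds: float) -> PromotePlan:
    """Pick a promotion option under an assumed, simplified policy."""
    if primary_reachable:
        # Planned: the replica applies all pending logs before accepting clients.
        return PromotePlan("planned", False, "replica catches up before accepting connections")
    # Forced: anything not yet replicated when the replica is delinked is lost.
    return PromotePlan(
        "forced",
        replication_lag_seconds > 0,
        f"transactions from roughly the last {replication_lag_seconds:.0f}s may be lost",
    )


drill = plan_promotion(primary_reachable=True, replication_lag_seconds=3.0)
outage = plan_promotion(primary_reachable=False, replication_lag_seconds=12.0)
print(drill.option, drill.possible_data_loss)
print(outage.option, outage.possible_data_loss)
```

In a drill the helper favors consistency, while in an outage it surfaces the expected loss window so the trade-off is visible before anyone triggers the promotion.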

Where Can I Find More Information on the New Read Replica Capabilities in Azure Database for PostgreSQL - Flexible Server?

Further information about the new read replica features in Azure Database for PostgreSQL - Flexible Server is available in the read replica documentation. For practical guidance, you can also follow the how-to tutorial.

Updated Nov 14, 2023
