Azure Database for PostgreSQL provides a seamless Major Version Upgrade (MVU) experience for your servers, which is important for security, performance, and feature enhancements. For production workloads, minimizing downtime during this upgrade is essential to maintain business continuity. This blog explores a practical approach to performing a Major Version Upgrade (MVU) with minimal downtime and maximum reliability using logical replication and virtual endpoints. Upgrading without disrupting your applications is critical. With this method, you can:
- Approach 1: Configure two servers where the publisher runs on the lower version and the subscriber on the higher version, perform MVU and then switch over using virtual endpoints. The time taken to restore the server is specific to your workloads on the primary server.
- Approach 2: Maintain two servers on different versions, use pg_dump and pg_restore to restore with data for production server, and perform a seamless switchover using virtual endpoints.
To enable logical replication on a table, it must have one of the following:
- A Primary Key, or
- A Unique Index
| Approach 1 | Approach 2 |
| Restores the instance using the same version. | Creates and restores the instance on a higher version. |
| Faster restore with PITR (Point-in-Time Recovery) but requires a few additional steps. | Takes longer to restore because it uses pg_dump and pg_restore commands but enables version upgrade during restore. |
| Best suited when speed is the priority to restore the server. | Best suited when you want to restore directly to a higher version, and it does not downtime for the MVU operation. |
Approach 1
Setup: Two servers, one for testing, and one for production. Here are the steps to follow:
- Create a virtual endpoint on the production server.
- Perform a Point-in-time-restore (PITR) from the first server (Production) and create your test server.
- Add a virtual endpoint for the test server.
- Establish logical replication between the two servers.
- Perform the Major Version Upgrade (MVU) on the test server.
- Validate data on the test server.
- Update virtual endpoints: Remove the endpoints from both servers, then assign the original production endpoint to the test server.
Step By Step Guide
- Environment Setup
- Two servers are involved:
Server 1: Current production server (Publisher)
Server 2: Restored server for MVU (Subscriber) - Create a virtual endpoint for the production server.
- Two servers are involved:
- Configure Logical Replication & Grant Permissions
- Enable replication parameters on the publisher:
wal_level = logical max_worker_processes = 16 max_replication_slots = 10 max_wal_senders = 10 track_commit_timestamp = on -
Grant replication role to the user
ALTER ROLE <user> WITH REPLICATION; GRANT azure_pg_admin TO <user>;
- Enable replication parameters on the publisher:
- Create tables and insert data
CREATE TABLE basic (id INTEGER NOT NULL PRIMARY KEY, a TEXT); INSERT INTO basic VALUES (1, 'apple'), (2, 'banana'); -
Set Up Logical Replication on the Production Server
-
Create a publication slot on the Production Server:
create publication <publisher-name>; alter publication <publisher-name> add table<table>; SELECT pg_create_logical_replication_slot(‘<publisher-name>’, ‘pgoutput’);
-
- Choose Restore Point:
- Determine the latest possible point in time (PIT) to restore the data from the source server. This Point in Time must be, necessarily, after you created the replication slot in Step 3.
- Provision Target Server via PITR:
- Use Azure Portal or Azure CLI to trigger a Point-in-Time Restore.
- This creates the test server based on the production backup.
- You are provisioning the test server based on the production server's backup capabilities. This test server will initially be a copy of the production server’s data state at a specific point in time.
- Configure Server Parameters on Test Server
wal_level = logical max_worker_processes = 16 max_replication_slots = 10 max_wal_senders = 10 track_commit_timestamp = on - Create the target server as subscriber & Advance Replication Origin:
- This is the crucial step that connects the test server (subscriber) to the production (publisher) server and manually tells the target where in the WAL log stream to begin reading changes, skipping the data already restored.
- Create Subscription: Creates a logical replication subscription on the test server, linking it to the source and specifying connection details, publication, and replication slot without copying existing data.
CREATE SUBSCRIPTION <subscriber-name>;CONNECTION 'host=<host-name>.postgres.database.azure.com port=5432 dbname=postgres user=<username> password=<password>' PUBLICATION <publisher-name> WITH ( copy_data = false, create_slot = false, enabled = false, slot_name = <publisher-name> ); - Retrieves the replication origin identifier and name on the test server, which is needed to advance the replication position.
SELECT roident, roname FROM pg_replication_origin; - Execute this query on the Production server: Fetches the replication slot name and the restart LSN from the source server, indicating where replication should resume.
SELECT slot_name, restart_lsn FROM pg_replication_slots WHERE slot_name = <publisher-name>; - On the test server execute this command: Manually advances the replication origin on the target server to skip already restored data and start replication from the correct WAL position.
SELECT pg_replication_origin_advance(roident, restart_lsn);
- Enable the target server as a subscriber of the source server
- With the target server populated and the replication origin advanced, you can start the synchronization.
ALTER SUBSCRIPTION <publisher-name> ENABLE; - The target server now starts consuming the WAL entries from the source, rapidly closing the gap on all transactions that occurred between the slot creation and the completion of the PITR.
- With the target server populated and the replication origin advanced, you can start the synchronization.
- Test Replication works
- Create a virtual endpoint for the test server, and validate the data on the test server
- Confirm that the synchronization is working by inserting a record on the production server and immediately verifying its presence on the test server.
- Perform Major Version Upgrade (MVU)
- Upgrade your test server, and validate all the new extensions and features by using the virtual endpoint for the test server
- Manage virtual endpoints
-
Once the data and all the new extensions are validated, drop the virtual endpoint on production server and recreate the same virtual endpoint on test server.
-
Key Considerations:
- Test server initially handles read traffic; writes remain on production server to avoid conflicts.
- Virtual endpoint creation: ~1–2 minutes per endpoint.
- Time taken for Point-in-time-restore depends on the workload that you have on the production server
Approach 2:
This approach enables a Major Version Upgrade (MVU) by combining logical replication with an initial dump and restore process. It minimizes downtime while ensuring data consistency. Create a new Azure Database for PostgreSQL Flexible Server instance using your desired target major version (e.g., PostgreSQL 17). Ensure the new server's configuration (SKU, storage size, and location) is suitable for your eventual production load. This approach enables the core benefit of a side-by-side migration, running two distinct database versions concurrently. The existing application remains connected to the source environment, minimizing risk and allowing the new target to be fully configured offline.
- Configure Role Privileges on Source and Target Servers
ALTER ROLE <replication_user> WITH REPLICATION; GRANT azure_pg_admin TO <replication_user>; - Check Prerequisites for Logical Replication Set these parameters on both source and target servers:
- Set these server parameters to at least the minimum recommended values shown below to enable and support the features required for logical replication.
wal_level=logical max_worker_processes=16 max_replication_slots=10 max_wal_senders=10 track_commit_timestamp=on
- Set these server parameters to at least the minimum recommended values shown below to enable and support the features required for logical replication.
- Ensure tables are ready:
- Each table to be replicated must have a primary key or unique identifier
- Create Publication and Replication Slot on Source
This slot tracks all changes from this point onward.create publication <publisher-name>; alter publication <publisher-name> add table<table>; SELECT pg_create_logical_replication_slot(‘<publisher-name>’, ‘pgoutput’); - Generate Schema and Initial Data Dump Run pg_dump after creating the replication slot:
- Perform the dump after creating the replication slot to capture a static starting point. Using an Azure VM is recommended for optimal network performance.
pg_dump -U demo -W -h <hostname>.postgres.database.azure.com -p 5432 -Fc -v -f dump.bak postgres -N pg_catalog -N cron -N information_schema
- Perform the dump after creating the replication slot to capture a static starting point. Using an Azure VM is recommended for optimal network performance.
- Restore Data into Target (recommended: Azure VM):
- This populates the target server with the initial dataset.
pg_restore -U demo -W -h <hostname>.postgres.database.azure.com -p 5432 --no-owner -Fc -v -d postgres dump.bak --no-aclCatch-Up Mechanism: While the restoration is ongoing, new transactions on the source are safely recorded by the replication slot. It is critical to have sufficient storage on the source to hold the WAL files during this initial period until replication is fully active.
- This populates the target server with the initial dataset.
- Create Subscription and Advance Replication Origin on Target:
- This step connects the test server (subscriber) to the production server (source) and manually tells the target where in the WAL log stream to begin reading changes, skipping the data already restored.
- Create subscription: Creates a logical replication subscription on the target server, linking it to the source and specifying connection details, publication, and replication slot without copying existing data.
CREATE SUBSCRIPTION <subscription-name> CONNECTION 'host=<hostname>.postgres.database.azure.com port=5432 dbname=postgres user=<username> password=<password>' PUBLICATION <publisher-name> WITH ( copy_data = false, create_slot = false, enabled = false, slot_name = '<publisher-name>); - Retrieves the replication origin identifier and name on the target server, which is needed to advance the replication position.
SELECT roident, roname FROM pg_replication_origin; - Fetches the replication slot name and the restart LSN from the source server, indicating where replication should resume.
SELECT slot_name, restart_lsn FROM pg_replication_slots WHERE slot_name = '<publisher-name>;
-
- Manually advances the replication origin on the target server to skip already restored data and start replication from the correct WAL position.
SELECT pg_replication_origin_advance('<roname>', '<restart_lsn>');
- Manually advances the replication origin on the target server to skip already restored data and start replication from the correct WAL position.
- Enable Subscription:
- With the target server populated and the replication origin advanced, you can start the synchronization.
ALTER SUBSCRIPTION <subscription-name> ENABLE;Result: The target server now starts consuming the WAL entries from the source, rapidly closing the gap on all transactions that occurred during the dump and restore process.
- With the target server populated and the replication origin advanced, you can start the synchronization.
- Validate Replication: Insert a record on the source and confirm it appears on the target:
- Perform Cutover
-
- Stop application traffic to the production database.
- Wait for the target database to confirm zero replication lag.
- Disable the subscription (ALTER SUBSCRIPTION logical_sub01 DISABLE;).
- Connect the application to the new Azure Database for PostgreSQL instance.
Utilize Virtual Endpoints or a CNAME DNS record for your database connection string. By simply pointing the endpoint/CNAME to the new server, you can switch your application stack without changing hundreds of individual configuration files, making the final cutover near-instantaneous.
Conclusion
This MVU strategy using logical replication and virtual endpoints provides a safe, efficient way to upgrade PostgreSQL servers without disrupting workloads. By combining replication, endpoint management, and automation, you can achieve a smooth transition to newer versions while maintaining high availability.
For an alternative approach, check out our blog on using the Migration Service for MVU:
Hacking the migration service in Azure Database for PostgreSQL