Blog Post

Microsoft Blog for PostgreSQL

Bidirectional Replication with pglogical on Azure Database for PostgreSQL - a VNET guide

pberenguel
Microsoft
Mar 30, 2026

Editor’s Note: This article was written by Raunak Jhawar, a Chief Architect. Paula Berenguel and Guy Bowerman assisted with the final review, formatting and publication.

Overview

Bidirectional replication is one of the most requested topologies for workloads that need writes in multiple locations, selective sync, or geo-distributed active-active operation, and that can accept eventual consistency.

This is a deep technical walkthrough for implementing bidirectional (active-active) replication on private Azure Database for PostgreSQL servers using pglogical, with a strong emphasis on VNET-injected architectures. It explains the underlying networking and execution model, covering replication worker placement, DNS resolution paths, outbound connectivity, and conflict resolution mechanics, to show why true private, server-to-server replication is only achievable with VNET injection and not with Private Endpoints. It also analyzes the operational and architectural trade-offs needed to safely run geo-distributed, multi-write PostgreSQL workloads in production.

This blog post focuses on pglogical. If you are looking for steps to implement bidirectional replication with native logical replication, or for the pros and cons of each approach, please refer to my definitive guide to bi-directional replication in Azure Database for PostgreSQL blog post.

 

Why is this important?

This understanding prevents fundamental architectural mistakes (such as assuming Private Endpoints provide private outbound replication), reduces deployment failures caused by hidden networking constraints, and enables teams to design secure, compliant, low‑RPO active/active or migration architectures that behave predictably under real production conditions. It turns a commonly misunderstood problem into a repeatable, supportable design pattern rather than a trial‑and‑error exercise.

Active-Active bidirectional replication between instances 

Architecture context

This scenario targets a multi-region active-active write topology where both nodes are VNET-injected, either into the same Azure VNET or into connected networks (for example, peered VNETs on Azure or even peered on-premises networks), and both accept writes.

Common use case: Geo distributed OLTP with regional write affinity.

 

 

Step 1: Azure Infrastructure Prerequisites

Both server instances must be deployed with VNET injection. This is a deploy-time decision: you cannot migrate a publicly accessible instance (with or without a private endpoint) to VNET injection after creation without rebuilding it.

Each instance must live in a subnet delegated to Microsoft.DBforPostgreSQL/flexibleServers. The subnet delegation is non-negotiable and prevents you from placing other resource types in the same subnet, so plan your address space accordingly.

If the nodes are in different VNETs, configure VNET peering and private DNS integration before continuing. Ensure there are no overlapping address spaces among the peered networks.

NSG rules must allow port 5432 between the two delegated subnets, both inbound and outbound. To meet your organization's requirements and policies, you may narrow the NSG rules down to an allow or deny list of specific source/target combinations.

Step 2: Server Parameter Configuration

On both nodes, configure the following server parameters via the Azure Portal (Server Parameters blade) or Azure CLI. These cannot be set via ALTER SYSTEM SET commands.

wal_level = logical -- This setting enables logical replication, which is required for pglogical to function.

max_worker_processes = 16 -- This setting allows for more worker processes, which can help with replication performance.

max_replication_slots = 10 -- This setting allows for more replication slots, which are needed for pglogical to manage replication connections.

max_wal_senders = 10 -- This setting allows for more WAL sender processes, which are responsible for sending replication data to subscribers.

track_commit_timestamp = on -- This setting allows pglogical to track commit timestamps, which can be useful for conflict resolution and monitoring replication lag.

shared_preload_libraries = pglogical -- This setting loads the pglogical extension at server startup, which is necessary for it to function properly.

azure.extensions = pglogical -- This setting allows the pglogical extension to be used in the Azure Postgres PaaS environment.

 

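Both nodes require a restart for these changes to take effect: wal_level, shared_preload_libraries, max_worker_processes, max_replication_slots, max_wal_senders, and track_commit_timestamp are all static parameters.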

Note that max_worker_processes is shared across all background workers in the instance. Each pglogical subscription consumes workers. If you are running other extensions, account for their worker consumption here or you will hit startup failures for pglogical workers.
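After the restart, it can help to confirm that the values actually took effect before moving on. The following is a minimal sketch, assuming a PostgreSQL version of 10 or later (for the backend_type column) and a session connected to the target database on each node. The exact worker labels reported in backend_type vary by extension and version.

```sql
-- Confirm the replication-related parameters on this node.
-- pending_restart = true means a changed value is still waiting for a restart.
SELECT name, setting, pending_restart
FROM pg_settings
WHERE name IN (
    'wal_level',
    'max_worker_processes',
    'max_replication_slots',
    'max_wal_senders',
    'track_commit_timestamp',
    'shared_preload_libraries'
)
ORDER BY name;

-- Confirm that pglogical is on the extension allow list.
SHOW azure.extensions;

-- Rough view of how many backends of each type are currently running,
-- to compare background worker usage against max_worker_processes.
SELECT backend_type, count(*) AS processes
FROM pg_stat_activity
GROUP BY backend_type
ORDER BY processes DESC;
```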

Step 3: Extension and Node Initialization

Create a dedicated replication user on both nodes. Do not use the admin account for replication.

CREATE ROLE replication_user WITH LOGIN REPLICATION PASSWORD 'your_password';

GRANT USAGE ON SCHEMA public TO replication_user;

GRANT SELECT ON ALL TABLES IN SCHEMA public TO replication_user;

ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO replication_user;

Log into Server A, either via a VM in the specified VNET or via an Azure Bastion host, and run the following, which creates the extension, registers the node, and defines a replication set.

CREATE EXTENSION IF NOT EXISTS pglogical;

SELECT pglogical.create_node(node_name := 'node_a', dsn := 'host=fqdn-for-server-a port=5432 dbname=preferred-database user=replication_user password=<strong_password>');

 

-- Define the replication set for Server A, specifying which tables to replicate and the types of operations to include (inserts, updates, deletes).

SELECT pglogical.create_replication_set(set_name := 'node_a_set',  replicate_insert := true,  replicate_update := true,  replicate_delete := true,  replicate_truncate := false);

 

-- Add sales_aus_central table explicitly

SELECT pglogical.replication_set_add_table(set_name := 'node_a_set',  relation := 'public.sales_aus_central',  synchronize_data := true);

 

-- Add purchase_aus_central table explicitly

SELECT pglogical.replication_set_add_table(set_name := 'node_a_set',  relation := 'public.purchase_aus_central',  synchronize_data := true);

 

-- OR add all tables in the public schema

SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']); -- This command adds all tables in the public schema to the default replication set.

 

-- Now, repeat this on Server B using the same method above i.e. via a VM in the specified VNET or Azure Bastion Host

CREATE EXTENSION IF NOT EXISTS pglogical;

-- Register Server B as a pglogical node

SELECT pglogical.create_node(node_name := 'node_b', dsn := 'host=fqdn-for-server-b port=5432 dbname=preferred-database user=replication_user password=<strong_password>');

 

-- Define the replication set for Server B, specifying which tables to replicate and the types of operations to include (inserts, updates, deletes)

SELECT pglogical.create_replication_set(set_name := 'node_b_set',  replicate_insert := true,  replicate_update := true,  replicate_delete := true,  replicate_truncate := false);

 

-- Add sales_aus_east table explicitly

SELECT pglogical.replication_set_add_table(  set_name := 'node_b_set',  relation := 'public.sales_aus_east',  synchronize_data := true);

 

-- Add purchase_aus_east table explicitly

SELECT pglogical.replication_set_add_table(  set_name := 'node_b_set',  relation := 'public.purchase_aus_east',  synchronize_data := true);

 

-- OR add all tables in the public schema

SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']); -- This command adds all tables in the public schema to the default replication set.
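Before creating subscriptions, you may want to check what each node actually registered. The queries below are a sketch that reads the pglogical catalog tables directly; the table and column names are those used by pglogical 2.x and are an assumption for other versions. Run them on each node in the database where the extension was created.

```sql
-- Nodes known to this database (the local node, plus providers once subscriptions exist).
SELECT node_id, node_name
FROM pglogical.node;

-- Replication sets defined on this node and the operations each one replicates.
SELECT set_name, replicate_insert, replicate_update, replicate_delete, replicate_truncate
FROM pglogical.replication_set
ORDER BY set_name;

-- Tables assigned to each replication set.
SELECT rs.set_name, rst.set_reloid::text AS table_name
FROM pglogical.replication_set_table AS rst
JOIN pglogical.replication_set AS rs ON rs.set_id = rst.set_id
ORDER BY rs.set_name, table_name;
```

If a table you expected is missing from the last result, check that it was added to the same set name that the subscriber will reference in Step 4.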

It is recommended that you confirm DNS resolution on all servers involved in the replication process. In a VNET-injected scenario, the lookup must return the private IP.

As a sanity check, you can run nslookup on the target server's FQDN from a VM in the VNET, or use the \conninfo command in psql to see the connection details.

 

 

Step 4: Configuring the subscribers

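-- Run this on Server A to subscribe to changes from Server B

SELECT pglogical.create_subscription ( -- Create a subscription on Server A to receive changes from Server B

subscription_name := 'node_a_to_node_b',

replication_sets := array['default'], -- Use array['node_b_set'] instead if you added tables to the named set in Step 3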

synchronize_data := true,

forward_origins := '{}',

provider_dsn := 'host=fqdn-for-server-b port=5432 dbname=preferred-database user=replication_user password=<strong_password>');

 

-- Run this on Server B to subscribe to changes from Server A

SELECT pglogical.create_subscription ( -- Create a subscription on Server B to receive changes from Server A

subscription_name := 'node_b_to_node_a',

replication_sets := array['default'], -- Use array['node_a_set'] instead if you added tables to the named set in Step 3

synchronize_data := true,

forward_origins := '{}',

provider_dsn := 'host=fqdn-for-server-a port=5432 dbname=preferred-database user=replication_user password=<strong_password>');
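Once both subscriptions exist, a quick status check on each node shows whether the apply workers connected successfully. This is a minimal sketch that uses the subscription names from the statements above and the pglogical 2.x status functions.

```sql
-- Run on each node: one row per subscription owned by this node.
-- A status of 'replicating' indicates the apply worker is connected and streaming.
SELECT subscription_name, status, provider_node, replication_sets, forward_origins
FROM pglogical.show_subscription_status();

-- Optional, on Server A: block until the initial table copy has finished.
SELECT pglogical.wait_for_subscription_sync_complete('node_a_to_node_b');
```

If the status does not reach replicating, the usual suspects in this topology are NSG rules, DNS resolution of the provider FQDN, and an exhausted max_worker_processes budget.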

 

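pglogical resolves conflicting writes according to its pglogical.conflict_resolution setting. For most OLTP workloads, last_update_wins, which compares commit timestamps, is the most practical choice. It requires track_commit_timestamp = on, which you must set as a server parameter (see Step 2).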
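To confirm which conflict policy is in force, you can inspect the relevant settings from any session on each node. This is a small sketch; it assumes pglogical is loaded through shared_preload_libraries so that its settings are visible to SHOW.

```sql
-- Conflict policy currently in effect (for example: last_update_wins).
SHOW pglogical.conflict_resolution;

-- Must report 'on' for timestamp-based policies to work.
SHOW track_commit_timestamp;

-- Returns the most recently committed transaction and its commit timestamp.
-- This raises an error if track_commit_timestamp is off, which makes it a useful smoke test.
SELECT * FROM pg_last_committed_xact();
```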

Always use the server's FQDN in the DSN rather than its private IP address.

 

Bidirectional replication between server instances with private endpoints: does this work, and will it weaken your server's security posture?

Where do pglogical workers run?

With VNET injection, the server's network interface lives inside your delegated subnet, which is why the delegation is mandatory. The PostgreSQL process, including all pglogical background workers, initiates connections from within your VNET (the delegated subnet). Route tables, NSGs, and peering therefore apply to both inbound and outbound traffic from the server.

With Private Endpoint, the architecture is fundamentally different:

 

 

A private endpoint is a one-way private channel for your clients or applications to reach the server securely. It does not give any of the server's internal processes access to your VNET for outbound connectivity.

pglogical subscription workers trying to connect to another server are starting those connections from Microsoft's managed infrastructure and not from your VNET.

What works?

Scenario A: Client connectivity via private endpoint

Here you have application servers or VMs in your VNET connecting to a server configured with a private endpoint. Your app VM connects to 10.0.0.15 (the private endpoint NIC), traffic flows over Private Link to the server, and everything stays private. This is not server-to-server replication.

Scenario B: Two servers, both with private endpoints

Here both servers are in Microsoft's managed network. They can reach each other's public endpoints, but not each other's private endpoints (which are in customer VNETs). The only path for bidirectional replication worker connections is to enable public network access on both servers, with firewall rules locked down to Azure service IPs.

 

 

Here you have private endpoints deployed alongside public access. Inside your VNET, Server A resolves to the private endpoint IP via the privatelink.postgres.database.azure.com private DNS zone. But the pglogical worker running in Microsoft's network does not have access to your private DNS zone, so it resolves via public DNS, which returns the public IP.

This means that if you are using the public FQDN for replication, the resolution path is consistent from the server's perspective: always public DNS, always the public IP, permitted by the allow access to Azure services firewall flag. Your application clients in the VNET will still resolve to the private endpoint.

If your requirement is genuinely private replication with no public endpoint exposure, VNET injection is the correct answer, and private endpoint cannot replicate that capability for pglogical.

Conclusion

The most compelling benefit of the VNET-injected topology is network isolation without sacrificing replication capability. You get the security posture of private connectivity (no public endpoints, NSG-controlled traffic, private DNS resolution) while keeping a live bidirectional data pipeline. This satisfies most enterprise compliance requirements around data transit encryption and network boundary control.

Hub/spoke migration scenarios (specifically, on-premises or external cloud to Azure) are where this approach shines. The ability to run both systems in production simultaneously, with live bidirectional sync during the cutover window, reduces migration risk compared to a hard cutover.

From a DR perspective, bidirectional pglogical gives you an RPO measured in seconds (dependent on replication lag) without the cost of synchronous replication. For workloads that can tolerate eventual consistency and have well-designed conflict avoidance, this is a compelling alternative to streaming replication via read replicas, which are strictly unidirectional.
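Because the achievable RPO depends on replication lag, it is worth measuring that lag on each provider. The sketch below uses standard PostgreSQL views (version 10 or later) and is an illustration rather than a complete monitoring setup; alert thresholds are left to you.

```sql
-- Run on each provider: WAL generated but not yet confirmed by each logical slot.
-- An inactive slot with a growing backlog retains WAL and can fill storage.
SELECT slot_name,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS unconfirmed_wal
FROM pg_replication_slots
WHERE slot_type = 'logical';

-- Time-based view of the same lag, one row per connected subscriber.
SELECT application_name, state, replay_lag
FROM pg_stat_replication;
```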

Updated Mar 27, 2026
Version 1.0