Azure Database for PostgreSQL Blog

The definitive guide to Bi-directional Replication in Azure Database for PostgreSQL

pberenguel
Microsoft
Sep 03, 2025

Building a highly available, active-active database solution is a critical requirement for modern applications. For teams leveraging Azure Database for PostgreSQL, bi-directional replication is a powerful strategy to achieve this. This guide explores the two main approaches—the pglogical extension and native logical replication—providing a comprehensive comparison and step-by-step instructions to help you implement the right solution for your needs.

What is bi-directional replication?

Bi-directional replication allows two systems to exchange changes with each other, ensuring that both systems have identical data records. This is achieved by each system acting as both a publisher (source of data) and a subscriber (receiver of changes) to the other system. Two replication tasks are typically required: one for replicating changes from system 1 to system 2, and another for replicating changes from system 2 to system 1.

 

 

Benefits of bi-directional replication:

  1. It enables an active-active setup, significantly reducing downtime and improving availability.
  2. It enables writes on both servers and keeps the data synchronized between them.
  3. Servers can be deployed across different regions (or tenants), providing robust disaster recovery and business continuity. In a regional outage, the other active server keeps your application available.
  4. Minimal data loss in the event of a server failure, since this is an asynchronous solution.

 

Options for Bi-directional Replication in Azure Database for PostgreSQL

When configuring bi-directional replication in Azure Database for PostgreSQL, you have two primary options: using pglogical or native logical replication.

This comparison outlines the key differences to help you choose the best approach for your use case:

 

 

DDL replication

  • pglogical: Supports DDL replication through dedicated extension APIs, allowing you to run DDL commands from a single node. The pglogical.replicate_ddl_command() API simplifies schema changes (see the example after this comparison).
  • Native logical replication: DDL must be applied manually on both nodes. There is no built-in API to automatically replicate schema changes.

Replication level

  • pglogical: Database level.
  • Native logical replication: Instance level.

PostgreSQL version support

  • pglogical: The extension must be updated to support each new PostgreSQL community release, which can introduce a delay between the new version's release and the extension's availability.
  • Native logical replication: As a core PostgreSQL feature, it is expected to work with new PostgreSQL versions as soon as they're released, without needing a separate update.

Conflict detection and resolution

  • pglogical: Advanced conflict handling. Provides various conflict resolution options, including using the remote row, keeping the local row, or stopping replication to allow for manual intervention. It can also convert an INSERT into an UPDATE to handle conflicting keys.
  • Native logical replication: Limited conflict handling. An UPDATE is always applied, but an INSERT fails if a row with the same primary key or unique constraint already exists. It lacks the built-in conflict resolution capabilities of pglogical.

Sequence replication

  • pglogical: Supports sequence replication. You can add sequences to a replication set and synchronize their state using APIs such as pglogical.replication_set_add_sequence() and pglogical.replication_set_add_all_sequences().
  • Native logical replication: Does not support sequence replication. Sequences must be managed individually and manually on each node to prevent conflicts.

Truncate support

  • pglogical: Supports TRUNCATE. TRUNCATE commands are replicated. However, if the table has foreign-key dependencies, you must manually ensure that all dependent tables on the subscriber can be truncated to prevent replication failure.
  • Native logical replication: Supports TRUNCATE, but with limitations. The TRUNCATE command is replicated, but it fails if the table being truncated has foreign-key dependencies on a table not included in the same subscription.

Selective column support

  • pglogical: Supports selective column replication. You can choose to replicate specific columns of a table on the publisher side.
  • Native logical replication: Supports selective column replication. Starting with PostgreSQL 15, you can specify a column list in the CREATE PUBLICATION command to replicate a subset of columns on the publisher.

Selective row support

  • pglogical: Supports row filtering. Allows selective replication of rows on either the publisher or the subscriber side using a row_filter option.
  • Native logical replication: Supports row filtering on the publisher side, but only for PostgreSQL versions 15 and later.

Schema copy

  • pglogical: Supports schema synchronization. The extension provides an API to copy the schema to the subscriber node using the synchronize_structure option of pglogical.create_subscription().
  • Native logical replication: No built-in API to automatically copy the schema. You must manually create or apply the schema on each subscriber node before replication can begin.

Re-synchronize tables

  • pglogical: Supports table resynchronization via a dedicated API, pglogical.alter_subscription_resynchronize_table(). This process truncates the table and re-loads all data, which may be a destructive operation.
  • Native logical replication: Requires manual resynchronization. To re-synchronize a table, you must manually drop and then re-add the table to the publication.

Learning curve

  • pglogical: Steeper learning curve. Requires knowledge of extension-specific APIs.
  • Native logical replication: Lower learning curve. Logical replication is a built-in PostgreSQL feature, so DBAs already familiar with core PostgreSQL concepts can manage it with minimal additional learning.

Roadmap and future enhancements

  • pglogical: The extension is no longer as actively developed as it was before the acquisition of its creator.
  • Native logical replication: As a core component of PostgreSQL, it is a priority for ongoing development. It evolves with PostgreSQL's core capabilities, with new features and enhancements added in each major release.
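
To illustrate the pglogical approach, here is a minimal sketch of replicating a schema change from a single node. It assumes both nodes are subscribed to the ddl_sql replication set and uses the dummy_x table from the walkthrough later in this guide; adjust the object names and DDL to your environment.

-- Run on one node only; pglogical re-executes the statement on the subscriber
-- through the ddl_sql replication set. Object names must be schema-qualified.
SELECT pglogical.replicate_ddl_command(
    'ALTER TABLE public.dummy_x ADD COLUMN note text;',
    ARRAY['ddl_sql']
);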

 

Understanding the challenges of bi-directional replication

1. Replication Lag

When there's a high volume of write operations, a delay can occur before changes are fully replicated between servers. This can lead to temporary data inconsistency across your nodes.

Recommendation: Set up alerts for when lag exceeds an acceptable threshold. Ensure that your servers have sufficient network bandwidth, Storage and I/O capacity to handle peak write loads.
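
As a starting point, a lag check along these lines can be scheduled on each publishing server (a sketch based on the standard pg_stat_replication view; thresholds and alerting are up to you):

-- Approximate replication lag in bytes for each connected subscriber
SELECT application_name, state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;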

2. Data Conflicts

Even when unique ID conflicts are prevented, simultaneous updates to the same record on both servers can lead to replication conflicts. Native logical replication does not have built-in conflict resolution, so a separate strategy is required to handle these cases.

Recommendation: Use a conflict resolution strategy. If you're using pglogical, take advantage of its built-in conflict handlers. For native logical replication, your application logic must handle potential conflicts, for example, by implementing "last writer wins" or other business-specific logic at the application layer.
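
As an illustration, a "last writer wins" rule can be approximated at the application layer with a timestamp-guarded upsert. The statement below is only a sketch using the dummy_x table from the walkthrough later in this guide; your key columns and tie-breaking rule will differ.

-- Apply the incoming change only if it is newer than the stored row
INSERT INTO dummy_x (id, name, event_date)
VALUES (1, 'updated name', now())
ON CONFLICT (id) DO UPDATE
SET name = EXCLUDED.name,
    event_date = EXCLUDED.event_date
WHERE dummy_x.event_date < EXCLUDED.event_date;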

3. Data Consistency

Maintaining data consistency can be challenging, particularly during network latency or outages. It's crucial to ensure that all changes on one server are accurately and promptly reflected on the other, as a failure to do so can lead to a state of data divergence between the two nodes.

Recommendation: Since temporary data divergence is a reality of asynchronous bi-directional replication, you must prepare for it. Design your application logic to tolerate periods where data is out of sync, often by ensuring operations are idempotent (can be re-run without issue). Implement a proactive monitoring solution that periodically compares data across nodes to detect any divergence. Finally, have a predefined reconciliation process in place, which may involve a custom script or tool that applies specific business logic to determine which record "wins" in a conflict.
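
One lightweight way to detect divergence is to compute a per-table checksum on each node and compare the results. The query below is a sketch against the dummy_x table used later in this guide; for large tables consider hashing per key range instead of the whole table.

-- Run on both servers and compare the output
SELECT count(*) AS row_count,
       md5(string_agg(id::text || '|' || name, ',' ORDER BY id)) AS content_hash
FROM dummy_x;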

4. Increased application and database complexity

Handling a bi-directional setup requires adding extra logic and tools to both your application and database layers. Without careful management and monitoring, this added complexity can lead to potential bugs and data inconsistencies.

Recommendation: Mitigate complexity with a well-planned architecture. Design your application in such a way that specific services are responsible for a distinct set of data. This allows you to direct writes for specific data to a single "authoritative" node, minimizing the need for complex, bi-directional write logic.

5. Performance Overhead

Bi-directional replication can introduce performance overhead. Every write operation on one server creates replication work for the other, which can increase latency and response times, especially during periods of high write activity on both servers.

Recommendation: Ensure your server resources (CPU, memory, and I/O) are adequately provisioned to handle the replication overhead. Monitor the performance metrics of both your publisher and subscriber to identify and address bottlenecks.

6. Managing Schema Changes

Changes to the database schema (like adding columns or changing data types) must be carefully managed across all nodes. You must ensure that these changes are applied to all nodes consistently and without causing replication to fail.

Recommendation: Implement a disciplined, phased approach for schema changes. Write schema unit tests for validation and follow these schema change best practices to avoid breaking replication.

7. Monitoring and Maintenance

A bi-directional replication setup adds significant complexity to monitoring and maintenance. It requires you to track the health, performance, and replication states of multiple nodes.

Recommendation: Leverage the Azure Database for PostgreSQL built-in metrics and supplement them with the metrics specific to a replication setup. Use Azure Monitor for high-level resource tracking (CPU, I/O, storage) and configure alerts for these. For detailed replication status, you must query PostgreSQL's system views directly to monitor key indicators like replication lag, replication slot status, and WAL activity.
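
The queries below are a sketch of the kind of checks worth scheduling; exact column names (for example wal_status) vary by PostgreSQL version.

-- Replication slot health and retained WAL (run on each publisher)
SELECT slot_name, active, wal_status,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;

-- Apply progress of each subscription (run on each subscriber)
SELECT subname, received_lsn, latest_end_lsn, last_msg_receipt_time
FROM pg_stat_subscription;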

8. Failure Scenarios

Understanding how to manage failures is crucial. You must have a clear strategy for how the system handles writes when one server is down. You also need a plan for reconciling any data changes that occurred during the outage when the server comes back online.

Recommendation: Develop a clear disaster recovery plan. This plan should include detailed steps for how to direct traffic to the remaining healthy server and how to re-synchronize the failed server when it comes back online.

9. Testing and Debugging

Finding and fixing issues in a bi-directional setup is significantly more complex than in a single-node environment. The interaction between two active servers creates numerous potential failure points and race conditions. This makes thorough testing in non-production environments that accurately mirror your production setup an absolute necessity.

Recommendation: Create a comprehensive test suite that includes scenarios for concurrent writes, network latency, and server failures.

Conclusion

While bi-directional replication can significantly boost availability and performance, its complexities require careful planning. The key to a successful implementation is to have robust strategies for monitoring, conflict resolution, and maintenance. Thorough testing and a deep understanding of your application's architecture will be crucial to the success of your solution.

 

Now let’s dive deep into how to configure bi-directional replication in Azure Database for PostgreSQL.

Prerequisites for Bi-directional replication in Azure Database for PostgreSQL

Before you begin configuring bi-directional replication, ensure your servers meet the following requirements.

1. Server Parameters configuration

The following parameters must be configured on all nodes (publishers and subscribers) with the following minimum values:

wal_level=logical
max_worker_processes=16
max_replication_slots=10
max_wal_senders=10
track_commit_timestamp=on
shared_preload_libraries=pglogical (only needed when configuring with the pglogical extension)
azure.extensions=pglogical (only needed when configuring with the pglogical extension)
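
After applying the parameter changes (some require a server restart), you can confirm the effective values on each server with a query such as:

SELECT name, setting
FROM pg_settings
WHERE name IN ('wal_level', 'max_worker_processes', 'max_replication_slots',
               'max_wal_senders', 'track_commit_timestamp', 'shared_preload_libraries');
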
2. User Privileges

 The user account used for replication on both servers must have the necessary privileges.

GRANT azure_pg_admin to demo;
ALTER ROLE demo REPLICATION LOGIN;

The demo user is the role that was configured at server creation for this example. If you need to create a dedicated replication user, follow the documented steps for creating additional users.

3. Object level requirements

The tables that you want to replicate must have a way to uniquely identify rows, so that changes can be applied correctly on the other server.

  • Tables must have either a PRIMARY KEY, a UNIQUE KEY, or be configured with REPLICA IDENTITY FULL. This ensures that the logical replication system can accurately identify and update the correct row on the subscriber side.
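
If a table has no primary key or unique key, one fallback is REPLICA IDENTITY FULL, shown here on a hypothetical legacy_events table; note that it logs the entire old row for every UPDATE and DELETE and increases WAL volume.

-- Fallback for tables without a usable key; prefer a primary key where possible
ALTER TABLE public.legacy_events REPLICA IDENTITY FULL;
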
Step by Step configuration using native Logical Replication

Step 1) Create Table, Publication & Replication Slot

-- on Server 1
create table dummy_x(
id numeric primary key not null,
name text not null,
event_date timestamp default current_timestamp
);

create sequence sq_dummy_x
start with 1
increment by 2;

SELECT pg_create_logical_replication_slot('slot_pub_z1', 'pgoutput');
CREATE PUBLICATION pub_z1 FOR TABLE dummy_x;

--on Server 2
create table dummy_x(
id numeric primary key not null,
name text not null,
event_date timestamp default current_timestamp
);

create sequence sq_dummy_x
start with 2
increment by 2;

SELECT pg_create_logical_replication_slot('slot_pub_z2', 'pgoutput');
CREATE PUBLICATION pub_z2 FOR TABLE dummy_x;
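
Optionally, confirm on each server that the publication and the replication slot were created:

-- Run on each server after Step 1
SELECT pubname FROM pg_publication;
SELECT slot_name, plugin, slot_type, active FROM pg_replication_slots;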


Step 2) Create Subscriptions with Origin Filtering & No Initial Data Copy

The origin = 'none' option makes each subscription apply only changes that originated directly on the other server (not changes that server itself received through replication), which prevents replication loops; this option requires PostgreSQL 16 or later. In this example, copy_data = false skips the initial table copy because both tables start out empty.

--on Server 1
CREATE SUBSCRIPTION sub_z2
CONNECTION 'host=<server2>.postgres.database.azure.com port=5432 user=demo dbname=postgres password=XXXXXXX'
PUBLICATION pub_z2
WITH (create_slot = false, slot_name='slot_pub_z2', origin = 'none', copy_data = false);

--on Server 2
CREATE SUBSCRIPTION sub_z1
CONNECTION 'host=<server1>.postgres.database.azure.com port=5432 user=demo dbname=postgres password=XXXXXX'
PUBLICATION pub_z1
WITH (create_slot = false, slot_name='slot_pub_z1', origin = 'none', copy_data = false);
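
A quick check that both subscriptions exist and are enabled (detailed replication status is covered in the monitoring section above):

-- Run on each server after Step 2
SELECT subname, subenabled FROM pg_subscription;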

 

Step 3) Test the replication

--on Server 1
INSERT INTO dummy_x (id, name, event_date)
VALUES (nextval('sq_dummy_x'), 'Sample Name x 1', DEFAULT),
       (nextval('sq_dummy_x'), 'Sample Name x 3', DEFAULT),
       (nextval('sq_dummy_x'), 'Sample Name x 5', DEFAULT);

--on Server 2
INSERT INTO dummy_x (id, name, event_date)
VALUES (nextval('sq_dummy_x'), 'Sample Name x 2', DEFAULT),
       (nextval('sq_dummy_x'), 'Sample Name x 4', DEFAULT),
       (nextval('sq_dummy_x'), 'Sample Name x 6', DEFAULT);

 

Step 4) Validate the bi-directional set up

In both servers, run:

SELECT * FROM dummy_x ORDER BY id ASC;

 

Step by Step configuration using pglogical

Step 1) Create a table on Server 1 and Server 2

Please note that the schema must be identical on both servers before replication begins.

--on Server 1
create table dummy_x(
id numeric primary key not null,
name text not null,
event_date timestamp default current_timestamp
);

--on Server 2
create table dummy_x(
id numeric primary key not null,
name text not null,
event_date timestamp default current_timestamp
);

 

Step 2) Create the provider nodes:

--on Server 1
select pglogical.create_node(node_name := 'primary1', dsn := 'host=<server1>.postgres.database.azure.com port=5432 dbname=postgres user=demo password=<pwd>');

--on Server 2
select pglogical.create_node(node_name := 'primary2', dsn := 'host=<server2>.postgres.database.azure.com port=5432 dbname=postgres user=demo password=<pwd>' );

 

Step 3) Add table(s) to the replication set in both servers

--on Server 1
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);

--on Server 2
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);

This adds all tables in the public schema to the default replication set, which replicates INSERT, UPDATE, and DELETE changes. Make sure every table added to the replication set has a primary key; tables without one will not be added.

There are three preexisting replication sets:
  1. default: replicates all changes to a table.
  2. default_insert_only: replicates only INSERTs; it is meant for tables without a primary key.
  3. ddl_sql: replicates the schema changes issued through pglogical.replicate_ddl_command().
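
If you want to confirm which replication sets exist and which actions they replicate, the pglogical catalog can be queried directly; the query below is a sketch, so verify the table and column names against your installed pglogical version.

-- List the replication sets and the operations they replicate
SELECT set_name, replicate_insert, replicate_update, replicate_delete, replicate_truncate
FROM pglogical.replication_set;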

 

Step 4) Create the subscriber on Server 2

--on Server 2
select pglogical.create_subscription(
    subscription_name := 'primary2-sub',
    replication_sets := array['default'],
    synchronize_data := true,
    forward_origins := '{}',
    provider_dsn := 'host=<server1>.postgres.database.azure.com port=5432 dbname=postgres user=demo password=<pwd>');

 

Step 5) Verify the subscription status on Server 2

--on Server 2
SELECT subscription_name, status FROM pglogical.show_subscription_status();

 

Step 6) Test the replication from Server 1 to Server 2

--on Server 1
INSERT INTO dummy_x (id, name, event_date)
VALUES (1, 'Sample Name x 1', DEFAULT),
       (3, 'Sample Name x 3', DEFAULT),
       (5, 'Sample Name x 5', DEFAULT);

--On Server 2
SELECT * FROM dummy_x;
--3 rows selected

By performing the steps above, you have established uni-directional replication from Server 1 to Server 2.

Let's now add the configuration on Server 1 that enables bi-directional replication:

 

Step 7) Configure Server 1 as a subscriber of Server 2

--on Server 1
select pglogical.create_subscription(
    subscription_name := 'primary1-sub',
    replication_sets := array['default'],
    synchronize_data := false,
    forward_origins := '{}',
    provider_dsn := 'host=<server2>.postgres.database.azure.com port=5432 dbname=postgres user=demo password=<pwd>');

In the statement above, synchronize_data is set to false so that the rows already present on Server 1 (including those inserted in Step 6) are not copied back during the initial synchronization, and forward_origins := '{}' ensures that only changes originating on Server 2 are forwarded, preventing changes from looping between the servers.

 

Step 8) Test the replication from Server 2 to Server 1

--on Server 2
INSERT INTO dummy_x (id, name, event_date)
VALUES (2, 'Sample Name x 2', DEFAULT),
       (4, 'Sample Name x 4', DEFAULT),
       (6, 'Sample Name x 6', DEFAULT);


--on Server 2 and Server 1
SELECT * FROM dummy_x ORDER BY id ASC;
-- 6 rows selected

 

You've now completed the setup for an active-active solution. The next step is to test your environment and begin implementing the best practices outlined in this guide.

Useful resources:

Tuning logical replication on Azure Database for PostgreSQL – Flexible Server

 

Updated Sep 04, 2025
Version 3.0