Ignite 2022 - Continued innovation with Azure Database for PostgreSQL flexible server
We are committed to making Azure Database for PostgreSQL flexible server the most advanced Postgres database service for migrating your open-source and commercial database workloads or building cloud-native applications in Azure. Azure Database for PostgreSQL combines a fully managed community edition of Postgres with the broadest Postgres extension support for existing or new application development. Flexible server is the top destination for Postgres workloads migrating and modernizing to Azure because of its cost optimization benefits, maximum control over your databases, and a simplified deployment experience.

Announcing Data API Builder (Preview) for PostgreSQL
Data API Builder eliminates much of the complexity of building GraphQL or REST APIs and provides secure access to your data, whereas traditional approaches require significant custom code and extended development time. Today we are thrilled to announce the public preview of Data API Builder.

Bidirectional Replication with pglogical on Azure Database for PostgreSQL - a VNET guide
Editor's Note: This article was written by Raunak Jhawar, a Chief Architect. Paula Berenguel and Guy Bowerman assisted with the final review, formatting, and publication.

Overview
Bidirectional replication is one of the most requested topologies for workloads that need writes in multiple locations, selective sync, or geo-distributed active-active deployments, and that can accept eventual consistency. This is a deep technical walkthrough for implementing bidirectional (active-active) replication on Azure Database for PostgreSQL flexible server using pglogical, with a strong emphasis on VNET-injected architectures. It explains the underlying networking and execution model, covering replication worker placement, DNS resolution paths, outbound connectivity, and conflict resolution mechanics, to show why true private, server-to-server replication is only achievable with VNET injection and not with Private Endpoints. It also analyzes the operational and architectural trade-offs needed to safely run geo-distributed, multi-write PostgreSQL workloads in production.

This blog post focuses on pglogical. If you are looking for steps to implement this with native logical replication, or for the pros and cons of each approach, please refer to my definitive guide to bidirectional replication in Azure Database for PostgreSQL blog post.

Why is this important?
This understanding prevents fundamental architectural mistakes (such as assuming Private Endpoints provide private outbound replication), reduces deployment failures caused by hidden networking constraints, and enables teams to design secure, compliant, low-RPO active-active or migration architectures that behave predictably under real production conditions. It turns a commonly misunderstood problem into a repeatable, supportable design pattern rather than a trial-and-error exercise.
Active-active bidirectional replication between instances

Architecture context
This scenario targets a multi-region, active-active write topology in which both nodes are VNET injected (into the same Azure VNET, into peered VNETs, or into VNETs peered with on-premises networks) and both accept writes. Common use case: geo-distributed OLTP with regional write affinity.

Step 1: Azure Infrastructure Prerequisites
- Both server instances must be deployed with VNET injection. This is a deploy-time decision; you cannot migrate a publicly accessible instance (with or without a private endpoint) to VNET injection after creation without rebuilding it.
- Each instance must live in a subnet delegated to Microsoft.DBforPostgreSQL/flexibleServers. The subnet delegation is non-negotiable and prevents you from placing other resource types in the same subnet, so plan your address space accordingly.
- If the nodes are in different VNETs, configure VNET peering, along with private DNS integration, before continuing. Ensure there are no overlapping address spaces among the peered networks.
- NSG rules must allow port 5432 between the two delegated subnets, both inbound and outbound. You may narrow the NSG rules to a specific source/target allow or deny list to meet your organization's requirements and policies.

Step 2: Server Parameter Configuration
On both nodes, configure the following server parameters via the Azure portal (Server Parameters blade) or the Azure CLI. These cannot be set via ALTER SYSTEM commands.

wal_level = logical            -- enables logical replication, which is required for pglogical to function
max_worker_processes = 16      -- allows more worker processes, which can help with replication performance
max_replication_slots = 10     -- allows more replication slots, which pglogical needs to manage replication connections
max_wal_senders = 10           -- allows more WAL sender processes, which send replication data to subscribers
track_commit_timestamp = on    -- lets pglogical track commit timestamps, used for conflict resolution and monitoring replication lag
shared_preload_libraries = pglogical   -- loads the pglogical extension at server startup, which it requires to function
azure.extensions = pglogical   -- allow-lists the pglogical extension in the Azure Postgres PaaS environment

Both nodes require a restart after the shared_preload_libraries and wal_level changes. Note that max_worker_processes is shared across all background workers in the instance, and each pglogical subscription consumes workers. If you are running other extensions, account for their worker consumption here or pglogical workers will fail to start.

Step 3: Extension and Node Initialization
Create a dedicated replication user on both nodes. Do not use the admin account for replication.

CREATE ROLE replication_user WITH LOGIN REPLICATION PASSWORD 'your_password';
GRANT USAGE ON SCHEMA public TO replication_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO replication_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO replication_user;

Log in to Server A (via a VM in the same VNET or an Azure Bastion host) and run the following, which creates the extension, the node, and a replication set:

CREATE EXTENSION IF NOT EXISTS pglogical;

SELECT pglogical.create_node(
    node_name := 'node_a',
    dsn := 'host=fqdn-for-server-a port=5432 dbname=preferred-database user=replication_user password=<strong_password>');

-- Define the replication set for Server A, specifying which tables to replicate
-- and the types of operations to include (inserts, updates, deletes)
SELECT pglogical.create_replication_set(
    set_name := 'node_a_set',
    replicate_insert := true,
    replicate_update := true,
    replicate_delete := true,
    replicate_truncate := false);

-- Add the sales_aus_central table explicitly
SELECT pglogical.replication_set_add_table(
    set_name := 'node_a_set',
    relation := 'public.sales_aus_central',
    synchronize_data := true);

-- Add the purchase_aus_central table explicitly
SELECT pglogical.replication_set_add_table(
    set_name := 'node_a_set',
    relation := 'public.purchase_aus_central',
    synchronize_data := true);

-- OR add all tables in the public schema to the default replication set
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);

-- Now repeat this on Server B, again via a VM in the VNET or an Azure Bastion host
CREATE EXTENSION IF NOT EXISTS pglogical;

SELECT pglogical.create_node(
    node_name := 'node_b',
    dsn := 'host=fqdn-for-server-b port=5432 dbname=preferred-database user=replication_user password=<strong_password>');

-- Define the replication set for Server B, specifying which tables to replicate
-- and the types of operations to include (inserts, updates, deletes)
SELECT pglogical.create_replication_set(
    set_name := 'node_b_set',
    replicate_insert := true,
    replicate_update := true,
    replicate_delete := true,
    replicate_truncate := false);

-- Add the sales_aus_east table explicitly
SELECT pglogical.replication_set_add_table(
    set_name := 'node_b_set',
    relation := 'public.sales_aus_east',
    synchronize_data := true);

-- Add the purchase_aus_east table explicitly
SELECT pglogical.replication_set_add_table(
    set_name := 'node_b_set',
    relation := 'public.purchase_aus_east',
    synchronize_data := true);

-- OR add all tables in the public schema to the default replication set
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);
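Before moving on to subscriptions, it can help to confirm that both nodes and their replication sets were registered. A minimal sanity-check sketch, assuming the pglogical catalog tables that the extension creates in the pglogical schema:

```sql
-- List the local and known remote nodes registered with pglogical
SELECT * FROM pglogical.node;

-- List the replication sets defined on this node
SELECT set_name FROM pglogical.replication_set;

-- List the tables assigned to replication sets
SELECT * FROM pglogical.replication_set_table;
```

Run these on both Server A and Server B; each node should show its own node entry and the replication set created above.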
It is recommended that you confirm DNS resolution on all servers involved in the replication. For VNET-injected scenarios, the FQDN must resolve to the private IP. As a sanity check, run nslookup against the target server's FQDN, or use the \conninfo command in psql to see the connection details.

Step 4: Configuring the subscribers

-- Run this on Server A to create a subscription that receives changes from Server B
SELECT pglogical.create_subscription(
    subscription_name := 'node_a_to_node_b',
    replication_sets := ARRAY['default'],
    synchronize_data := true,
    forward_origins := '{}',
    provider_dsn := 'host=fqdn-for-server-b port=5432 dbname=preferred-database user=replication_user password=<strong_password>');

-- Run this on Server B to create a subscription that receives changes from Server A
SELECT pglogical.create_subscription(
    subscription_name := 'node_b_to_node_a',
    replication_sets := ARRAY['default'],
    synchronize_data := true,
    forward_origins := '{}',
    provider_dsn := 'host=fqdn-for-server-a port=5432 dbname=preferred-database user=replication_user password=<strong_password>');

For most OLTP workloads, last_update_wins based on the commit timestamp is the most practical conflict-resolution choice. It requires track_commit_timestamp = on, which must be set as a server parameter. Use the FQDN in the provider DSN rather than the server's private IP.

Bidirectional replication between server instances with private endpoints - does this work, and does it weaken your server's security posture?

Where do pglogical workers run?
With VNET injection, the server's network interface lives inside your delegated subnet, which is mandatory for this topology. The PostgreSQL process, including all pglogical background workers, initiates connections from within your VNET (the delegated subnet). Routing tables, NSGs, and peering apply to both inbound and outbound traffic from the server.
With Private Endpoint, the architecture is fundamentally different: a private endpoint is a one-way private channel for your clients or applications to reach the server securely. It does not give any of the server's internal processes access to your VNET for outbound connectivity. pglogical subscription workers connecting to another server initiate those connections from Microsoft's managed infrastructure, not from your VNET.

What works?

Scenario A: Client connectivity via private endpoint
Application servers or VMs in your VNET connect to a server configured with a private endpoint: your app VM connects to 10.0.0.15 (the private endpoint NIC), traffic flows over Private Link to the server, and everything stays private. This is client-to-server connectivity, not server-to-server replication.

Scenario B: Two servers, both with private endpoints
Both servers run in Microsoft's managed network. They can reach each other's public endpoints, but not each other's private endpoints (which live in customer VNETs). The only path for bidirectional replication worker connections is to enable public network access on both servers, with firewall rules locked down to Azure service IPs, while keeping the private endpoints deployed alongside public access. Inside your VNET, Server A resolves to the private endpoint IP via the privatelink.postgres.database.azure.com private DNS zone. But the pglogical worker running in Microsoft's network has no access to your private DNS zone; it resolves via public DNS, which returns the public IP. This means that if you use the public FQDN for replication, the resolution path is consistent from the server's perspective (always public DNS, always the public IP, with the "allow access to Azure services" firewall flag enabled). Your application clients in the VNET will still resolve to the private endpoint.
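Whichever connectivity model you land on, once the subscriptions from Step 4 exist it is worth confirming that they are replicating and keeping an eye on lag. A minimal monitoring sketch; pglogical.show_subscription_status() ships with the extension, and the lag estimate uses the standard pg_replication_slots catalog:

```sql
-- Run on each subscriber: status should report 'replicating'
SELECT subscription_name, status, provider_node
FROM pglogical.show_subscription_status();

-- Run on each provider: approximate replication lag, in bytes, per logical slot
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

Because the topology is active-active, each node is both a provider and a subscriber, so both queries apply to both servers.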
If your requirement is genuinely private replication with no public endpoint exposure, VNET injection is the correct answer; private endpoints cannot replicate that capability for pglogical.

Conclusion
The most compelling benefit of the VNET-injected topology is network isolation without sacrificing replication capability. You get the security posture of private connectivity (no public endpoints, NSG-controlled traffic, private DNS resolution) while keeping a live bidirectional data pipeline. This satisfies most enterprise compliance requirements around data-in-transit encryption and network boundary control.

Hub/spoke migration scenarios (specifically, on-premises or external cloud to Azure) are where this approach shines. The ability to run both systems in production simultaneously, with live bidirectional sync during the cutover window, reduces migration risk compared with a hard cutover.

From a DR perspective, bidirectional pglogical gives you an RPO measured in seconds (depending on replication lag) without the cost of synchronous replication. For workloads that can tolerate eventual consistency and have well-designed conflict avoidance, this is a compelling alternative to streaming replication via read replicas, which is strictly unidirectional.

Managing bloat in PostgreSQL using pgstattuple on Azure Database for PostgreSQL flexible server
Bloat refers to the unused space within database objects such as tables and indexes, caused by accumulated dead tuples that have not been reclaimed by the storage engine. It typically results from frequent updates, deletions, and insertions, leading to inefficient storage and performance problems. Addressing bloat is crucial for maintaining optimal database performance: it impacts storage efficiency, increases I/O operations, reduces cache efficiency, prolongs vacuum times, and slows down index scans.

In this blog post, I will walk you through how to use the pgstattuple extension in PostgreSQL to analyze and understand the physical storage of your database objects. By leveraging pgstattuple, you can identify and quantify the unused space within your tables and indexes. We will walk through analyzing bloat, interpreting the results, and addressing the bloat to optimize your database and improve its performance.

I will be using the pg_repack extension as an alternative to VACUUM FULL and REINDEX. pg_repack is a PostgreSQL extension that removes bloat from tables and indexes and reorganizes them more efficiently. It works by creating a new copy of the target table or index, applying any changes that occurred during the process, and then swapping the old and new versions atomically. pg_repack does not require downtime or exclusive access locks on the processed table or index, except for a brief period at the beginning and end of the operation. Performing a full table repack requires free disk space about twice the size of the target table(s) and their indexes; for example, if the total size of the tables and indexes to be reorganized is 1 GB, an additional 2 GB of disk space is required. For pg_repack to run successfully, the table must have either a PRIMARY KEY or a UNIQUE index on a NOT NULL column. Let us dive in and see how you can make the most of this powerful tool.
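As a sketch of that sizing rule: for the demo table used later in this post (test_table), the extra disk space pg_repack needs can be approximated as twice the current size of the table plus its indexes. pg_total_relation_size includes indexes and TOAST data, so it is a convenient single input for the estimate:

```sql
-- Roughly 2x the table-plus-index size must be free for pg_repack to run
SELECT pg_size_pretty(2 * pg_total_relation_size('public.test_table')) AS approx_free_space_needed;
```

Treat the result as a planning estimate rather than an exact requirement.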
To get more details on the bloat on an Azure Database for PostgreSQL flexible server, follow these steps:

1. Installing pgstattuple
Add pgstattuple to the azure.extensions server parameter. You must install the extension in each database you want to analyze. Connect to the database of interest and run:

CREATE EXTENSION pgstattuple;

2. Analyze the table/index
Once the extension is installed, you can use the pgstattuple function to gather the detailed statistics needed to analyze bloat on a test table. The function reports information such as the number of live tuples, the number of dead tuples, the percentage of bloat, and the free space within the pages.

To showcase the pgstattuple features, I used a 4 vCore SKU with PostgreSQL version 16, created a test table with an index, loaded it with 24 GB of data, and generated bloat by running UPDATE/DELETE commands against the table.

Create the test table:

CREATE TABLE test_table (
    id bigserial PRIMARY KEY,
    column1 text, column2 text, column3 text, column4 text, column5 text,
    column6 text, column7 text, column8 text, column9 text, column10 text);

Load the table with 24 GB of data:

INSERT INTO test_table (column1, column2, column3, column4, column5,
                        column6, column7, column8, column9, column10)
SELECT md5(random()::text), md5(random()::text), md5(random()::text),
       md5(random()::text), md5(random()::text), md5(random()::text),
       md5(random()::text), md5(random()::text), md5(random()::text),
       md5(random()::text)
FROM generate_series(1, 22000000);

Create an index on the table to demonstrate the pgstatindex function for analyzing index bloat:

CREATE INDEX idx_column1 ON test_table (column1);

Run pgstattuple on the table and pgstatindex on the index before generating any bloat.
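With the demo objects in place, the baseline (pre-bloat) checks amount to two queries against the table and index created above:

```sql
-- Heap statistics for the freshly loaded, unbloated table
SELECT * FROM pgstattuple('public.test_table');

-- B-tree statistics for the unbloated index
SELECT * FROM pgstatindex('public.idx_column1');
```

The outputs of these two queries are what the following sections interpret.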
pgstattuple on the unbloated table: observe that dead_tuple_percent is 0 and free_percent is 4.3, reflecting a healthy table.

pgstatindex on the unbloated index: observe that avg_leaf_density is 89.07% and leaf_fragmentation is 8.1, reflecting a healthy index.

Using the command below, I generate bloat on the table:

UPDATE test_table
SET column1 = md5(random()::text),
    column2 = md5(random()::text),
    column3 = md5(random()::text),
    column4 = md5(random()::text),
    column5 = md5(random()::text),
    column6 = md5(random()::text)
WHERE id % 5 = 0;

You can analyze a table using:

SELECT * FROM pgstattuple('your_table_name');

What it does: performs a full scan of the relation to gather detailed statistics.
Performance: slower on large tables; it can take seconds to minutes depending on table size because of the full scan.
Use case: diagnosing bloat or planning vacuuming strategies (VACUUM/VACUUM FULL).

For faster estimates you can use:

SELECT * FROM pgstattuple_approx('your_table_name');

What it does: uses sampling to estimate the statistics.
Accuracy: results are close but not exact.
Performance: faster on large tables because only a sample (a subset of pages) is read.
Use case: quick insights.

The function provides the following information about the table.
table_len: Table length in bytes (tuple_len + dead_tuple_len + free_space; the remaining overhead accounts for padding for tuple alignment and the page header with per-page tuple pointers)
tuple_count: Number of live tuples
tuple_len: Length of live tuples in bytes
tuple_percent: Percentage of live tuples
dead_tuple_count: Number of dead tuples
dead_tuple_len: Length of dead tuples in bytes
dead_tuple_percent: Percentage of dead tuples
free_space: Total free space in bytes within the pages
free_percent: Percentage of free space within the pages

Concentrate on the following three columns to understand table bloat and unused space.

dead_tuple_percent is the percentage of dead tuples in the table, calculated as:

dead_tuple_percent = dead_tuple_len / table_len * 100

It can be reduced by running VACUUM on the table. However, VACUUM does not return the space to the operating system; free_space and free_percent therefore increase after a VACUUM.

free_space and free_percent depict the unused/wasted space within the pages. That space can be reclaimed only by running VACUUM FULL (or, instead, pg_repack) on the table. A high free_percent indicates the table needs a VACUUM FULL (or pg_repack) to reclaim the space.

As a rule of thumb: if dead_tuple_percent is above 20%, run VACUUM; if free_percent is above 50%, run VACUUM FULL (or pg_repack).

The following three snips show pgstattuple run on the bloated table, after VACUUM, and after pg_repack.

pgstattuple on the bloated table.

pgstattuple after vacuuming the table. Note: you will also see a difference in tuple_count because I ran some DELETE statements that are not captured in this document.

pgstattuple after running pg_repack on the table.
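The VACUUM versus VACUUM FULL thresholds above can be folded into a quick triage query. This is a sketch; the 20% and 50% cutoffs are the rules of thumb from this post, applied here to the demo table:

```sql
-- Flag the recommended maintenance action for a table from its pgstattuple output
SELECT dead_tuple_percent,
       free_percent,
       CASE
           WHEN free_percent > 50 THEN 'VACUUM FULL / pg_repack'
           WHEN dead_tuple_percent > 20 THEN 'VACUUM'
           ELSE 'OK'
       END AS recommended_action
FROM pgstattuple('public.test_table');
```

Adjust the thresholds to match your own workload's tolerance for wasted space.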
Summary of changes after running VACUUM and pg_repack on the bloated table:
- dead_tuple_count: reduced to 0 after VACUUM
- dead_tuple_len: reduced to 0 after VACUUM
- dead_tuple_percent: reduced from 21% to 0 after VACUUM
- free_space: increased after VACUUM but significantly reduced after pg_repack
- free_percent: increased after VACUUM but drastically reduced after pg_repack

Similarly, to analyze an index, you can run:

SELECT * FROM pgstatindex('your_index_name');

The function provides the following information about the index:

version: B-tree version number
tree_level: Tree level of the root page
index_size: Total index size in bytes
root_block_no: Location of the root block
internal_pages: Number of "internal" (upper-level) pages
leaf_pages: Number of leaf pages
empty_pages: Number of empty pages
deleted_pages: Number of deleted pages
avg_leaf_density: Average density of leaf pages
leaf_fragmentation: Leaf page fragmentation

A low avg_leaf_density implies underutilized pages; it denotes the percentage of good data in index pages. After a VACUUM this value drops further, because cleaning dead tuples out of the index reduces leaf density, pointing to an increase in unused/wasted space. To reclaim that space, a REINDEX is needed for the index to remain performant. A high leaf_fragmentation implies poor data locality within the pages; again, a REINDEX helps. As a rule of thumb, if avg_leaf_density drops below 20%, perform a REINDEX.

The following three snips show pgstatindex run on the bloated index, after vacuuming the table, and after pg_repack.
pgstatindex on the bloated index

pgstatindex after vacuuming the table

pgstatindex after running pg_repack

Summary of changes after running VACUUM and pg_repack on the bloated index:
- index_size: remained high after VACUUM but significantly reduced after pg_repack, as the unused/wasted space is reclaimed
- avg_leaf_density: reduced significantly after VACUUM, from 80% to 14%, depicting less good data on the leaf pages, and increased to 89% after pg_repack
- leaf_fragmentation: unchanged after VACUUM and reduced to 0 after pg_repack, indicating no remaining page fragmentation

Note: the index size accounts for the internal, leaf, empty, and deleted pages.

pgstattuple acquires only a read lock on the object (table or index), so the results do not reflect an instantaneous snapshot; concurrent updates will affect them.

For more information on the pgstatginindex and pgstathashindex functions, refer to the PostgreSQL documentation here. For more insights on pgstattuple with respect to TOAST tables, please refer to the relevant documentation here.

3. Analyze bloat on partition tables
The pgstattuple extension is a powerful tool for analyzing table and index bloat, providing detailed statistics such as dead tuple percentage and free space. However, the function cannot be executed directly on a partitioned table; it must be run individually on each partition to gather meaningful insights. To streamline this process, especially when dealing with a large number of partitions, you can use a PL/pgSQL script that iterates through all partitions of a parent table, executes pgstattuple on each partition, and stores the resulting statistics, such as dead tuple percentage and free space, in a summary table for easy review and analysis.
This approach not only simplifies the task of identifying bloat across partitions but also enables proactive monitoring and optimization of storage efficiency in partitioned PostgreSQL environments.

-- Create a temporary table to store bloat statistics
DROP TABLE IF EXISTS bloat_summary;
CREATE TEMP TABLE bloat_summary (
    partition_name TEXT,
    table_len BIGINT,
    tuple_count BIGINT,
    tuple_len BIGINT,
    tuple_percent NUMERIC,
    dead_tuple_count BIGINT,
    dead_tuple_len BIGINT,
    dead_tuple_percent NUMERIC,
    free_space BIGINT,
    free_percent NUMERIC
);

-- DO block to iterate over all partitions and collect statistics
DO $$
DECLARE
    part RECORD;
    stats RECORD;
BEGIN
    FOR part IN
        SELECT inhrelid::regclass AS partition_name
        FROM pg_inherits
        WHERE inhparent = 'your_parent_table'::regclass
    LOOP
        EXECUTE format('SELECT * FROM pgstattuple(%L)', part.partition_name) INTO stats;
        INSERT INTO bloat_summary VALUES (
            part.partition_name,
            stats.table_len,
            stats.tuple_count,
            stats.tuple_len,
            stats.tuple_percent,
            stats.dead_tuple_count,
            stats.dead_tuple_len,
            stats.dead_tuple_percent,
            stats.free_space,
            stats.free_percent);
    END LOOP;
END $$;

-- Output the summary
SELECT * FROM bloat_summary ORDER BY dead_tuple_percent DESC;

4. Addressing the bloat
Once you have identified bloat, you can address it with the following common approaches:
- VACUUM: clears dead tuples but does not reclaim the space.
- pg_repack: performs the equivalent of VACUUM FULL and REINDEX online, efficiently reorganizing the data.

Note: other unused space, such as space left in heap or index pages because of a configured fill factor below 100, or because the remaining free space on a page cannot accommodate a row of its minimum size, is not considered bloat even though it is also unused space.

pgstattuple will help you address bloat! This blog post guided you through understanding and managing bloat in PostgreSQL using the pgstattuple extension.
By leveraging this tool, you can gain detailed insights into the extent of bloat within your tables and indexes. These insights prove valuable in maintaining efficient storage and ensuring optimal server performance.
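For completeness, the manual alternatives to pg_repack listed in step 4 look like this on the demo objects from earlier. Note that VACUUM FULL takes an ACCESS EXCLUSIVE lock for its full duration, while REINDEX with the CONCURRENTLY option (PostgreSQL 12 and later) avoids blocking concurrent writes:

```sql
-- Reclaim table space by rewriting the table (blocks reads and writes while it runs)
VACUUM FULL test_table;

-- Rebuild the bloated index without blocking concurrent writes
REINDEX INDEX CONCURRENTLY idx_column1;
```

On a busy production system, pg_repack remains the less disruptive choice for the table rewrite, since it only needs brief locks at the start and end of the operation.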