postgresql
202 Topics

January 2026 Recap: Azure Database for PostgreSQL
We just dropped the January 2026 recap for Azure Database for PostgreSQL, and this one's all about developer velocity, resiliency, and production-ready upgrades.

January 2026 Recap: Azure Database for PostgreSQL
• PostgreSQL 18 support via Terraform (create + upgrade)
• Premium SSD v2 (Preview) with HA, replicas, Geo-DR & MVU
• Latest PostgreSQL minor version releases
• Ansible module GA with latest REST API features
• Zone-redundant HA now configurable via Azure CLI
• SDKs GA (Go, Java, JS, .NET, Python) on stable APIs

Read the full January 2026 recap and see what's new (and what's coming): January 2026 Recap: Azure Database for PostgreSQL

Supporting ChatGPT on PostgreSQL in Azure
Affan Dar, Vice President of Engineering, PostgreSQL at Microsoft
Adam Prout, Partner Architect, PostgreSQL at Microsoft
Panagiotis Antonopoulos, Distinguished Engineer, PostgreSQL at Microsoft

The OpenAI engineering team recently published a blog post describing how they scaled their databases by 10x over the past year to support 800 million monthly users. To do so, OpenAI relied on Azure Database for PostgreSQL to support important services like ChatGPT and the Developer API. Collaborating with a customer experiencing rapid user growth has been a remarkable journey.

One key observation is that PostgreSQL works out of the box even at very large scale. As many in the public domain have noted, ChatGPT grew to 800M+ users before OpenAI started moving new and shardable workloads to Azure Cosmos DB.

Nevertheless, supporting the growth of one of the largest Postgres deployments was a great learning experience for both of our teams. Our OpenAI friends did an incredible job of reacting fast and adjusting their systems to handle the growth. Similarly, the Postgres team at Azure worked to further tune the service to support the increasing OpenAI workload. The changes we made were not limited to OpenAI, so all Azure Database for PostgreSQL customers with demanding workloads have benefited. A few of the enhancements, and the work that led to them, are listed below.

Changing the network congestion protocol to reduce replication lag

Azure Database for PostgreSQL used the default CUBIC congestion control algorithm for replication traffic to replicas both within and outside the region. Leading up to one of the OpenAI launch events, we observed that several geo-distributed read replicas occasionally experienced replication lag. Replication from the primary server to the read replicas would typically operate without issues; however, at times, the replicas would unexpectedly begin falling behind the primary for reasons that were not immediately clear.
This lag would not recover on its own and would grow to the point where automation would eventually restart the read replica. Once restarted, the read replica would catch up, only to repeat the cycle within a day or less.

After an extensive debugging effort, we traced the root cause to how the TCP congestion control algorithm handled a higher rate of packet drops. These drops were largely a result of high point-to-point traffic between the primary server and its replicas, compounded by the existing TCP window settings. Packet drops across regions are not unexpected; however, the default congestion control algorithm (CUBIC) treats packet loss as a sign of congestion and backs off aggressively. In comparison, the Bottleneck Bandwidth and Round-trip propagation time (BBR) congestion control algorithm is less sensitive to packet drops. Switching to BBR, adding SKU-specific TCP window settings, and switching to the fair queuing network discipline (which can control the pacing of outgoing packets at the hardware level) resolved this issue.

We'll also note that one of our seasoned PostgreSQL committers provided invaluable insights during this process, helping us pinpoint the issue more effectively.

Scaling out with Read replicas

PostgreSQL primaries, if configured properly, work amazingly well in supporting a large number of read replicas. In fact, as noted in the OpenAI engineering blog, a single primary has been able to power 50+ replicas across multiple regions. However, going beyond this increases the chance of impacting the primary. For this reason, we added cascading replica support to scale out reads even further. But this brings a number of additional failure modes that need to be handled. The system must carefully orchestrate repairs around lagging and failing intermediary nodes, safely repointing replicas to new intermediary nodes while performing catch-up or rewind in a mission-critical setup.
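
To make the repointing problem above concrete, here is a minimal sketch of one possible policy. It is not how Azure Database for PostgreSQL implements repair: the `Node` type, the `repoint_orphans` function, and the "least-lagged healthy sibling" rule are all hypothetical, and the catch-up or rewind step that a real system needs is deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """One server in a hypothetical cascading-replica topology."""
    name: str
    upstream: Optional[str]  # None marks the primary
    healthy: bool = True
    lag_bytes: int = 0


def repoint_orphans(nodes: Dict[str, Node], failed: str) -> Dict[str, str]:
    """Repoint replicas fed by a failed intermediary node.

    Returns a mapping {replica name: new upstream name}.
    Assumed policy (illustrative only): prefer the least-lagged healthy
    sibling of the failed node; if none exists, attach to the failed
    node's own upstream.
    """
    old_upstream = nodes[failed].upstream
    if old_upstream is None:
        raise ValueError("a failed primary needs failover, not repointing")
    nodes[failed].healthy = False

    # Healthy nodes that share the failed node's upstream are candidates.
    siblings = [
        n for n in nodes.values()
        if n.upstream == old_upstream and n.healthy and n.name != failed
    ]
    target = min(siblings, key=lambda n: n.lag_bytes).name if siblings else old_upstream

    moves: Dict[str, str] = {}
    for node in nodes.values():
        if node.upstream == failed:
            node.upstream = target  # a real system would catch up or rewind here
            moves[node.name] = target
    return moves
```

Even this toy version shows why the orchestration is delicate: the choice of target depends on health and lag readings that can change while the repair is in flight.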
Furthermore, disaster recovery (DR) scenarios can require a fast rebuild of a replica. Because data movement across regions is a costly and time-consuming operation, we developed the ability to create a geo replica from a snapshot of another replica in the same region. This feature avoids the traditional full data copy, which may take hours or even days depending on the size of the data, by leveraging data for that cluster that already exists in that region. This feature will soon be available to all our customers as well.

Scaling out Writes

These improvements solved the read replica lag problems and read scale but did not address the growing write scale for OpenAI. At some point, the balance tipped and it was obvious that the IOPS limits of a single PostgreSQL primary instance would no longer be enough. As a result, OpenAI decided to move new and shardable workloads to Azure Cosmos DB, which is our default recommended NoSQL store for fully elastic workloads. However, some workloads, as noted in the OpenAI blog, are much harder to shard.

While OpenAI is using Azure Database for PostgreSQL flexible server, several of the write scaling requirements that came up have been baked into our new Azure HorizonDB offering, which entered private preview in November 2025. Some of the architectural innovations are described in the following sections.

Azure HorizonDB scalability design

To better support more demanding workloads, Azure HorizonDB introduces a new storage layer for Postgres that delivers significant performance and reliability enhancements:
• More efficient read scale out. Postgres read replicas no longer need to maintain their own copy of the data. They can read pages from the single copy maintained by the storage layer.
• Lower latency Write-Ahead Logging (WAL) writes and higher throughput page reads via two purpose-built storage services designed for WAL storage and Page storage.
• Durability and high availability responsibilities are shifted from the Postgres primary to the storage layer, allowing Postgres to dedicate more resources to executing transactions and queries.
• Postgres failovers are faster and more reliable.

To understand how Azure HorizonDB delivers these capabilities, let's look at its high-level architecture as shown in Figure 1. It follows a log-centric storage model, where the PostgreSQL write-ahead log (WAL) is the sole mechanism used to durably persist changes to storage. PostgreSQL compute nodes never write data pages to storage directly in Azure HorizonDB. Instead, pages and other on-disk structures are treated as derived state and are reconstructed and updated from WAL records by the data storage fleet.

Azure HorizonDB storage uses two separate storage services for WAL and data pages. This separation allows each to be designed and optimized for the very different patterns of reads and writes PostgreSQL does against WAL files in contrast to data pages. The WAL server is optimized for very low latency writes to the tail of a sequential WAL stream, and the Page server is designed for random reads and writes across potentially many terabytes of pages.

These two separate services work together to enable Postgres to handle IO-intensive OLTP workloads like OpenAI's. The WAL server can durably write a transaction across 3 availability zones using a single network hop. The typical PostgreSQL replication setup with a hot standby (Figure 2) requires 4 hops to do the same work. Each hop is a component that can potentially fail or slow down and delay a commit. The Azure HorizonDB page service can scale out page reads to many hundreds of thousands of IOPS for each Postgres instance. It does this by sharding the data in Postgres data files across a fleet of page servers. This spreads the reads across many high-performance NVMe disks on each page server.
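
The post does not describe how HorizonDB actually assigns pages to page servers, so the following is only an illustrative sketch of the general idea of sharding data-file pages across a fleet. The function name `page_server_for`, the chunk size, and the hash-based placement are assumptions made for the example.

```python
import hashlib


def page_server_for(relfile: str, block_no: int, num_servers: int,
                    blocks_per_chunk: int = 1024) -> int:
    """Pick a page server index for a given page (hypothetical scheme).

    Pages are grouped into fixed-size chunks of consecutive blocks, and
    each chunk is placed by hashing (relation file, chunk number). With
    PostgreSQL's default 8 kB block size, 1024 blocks is an 8 MiB chunk.
    """
    if num_servers <= 0:
        raise ValueError("num_servers must be positive")
    chunk = block_no // blocks_per_chunk
    key = f"{relfile}:{chunk}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_servers
```

Under this scheme, sequential blocks within one chunk land on the same server, while a random-read workload touching many chunks is spread across the whole fleet, which is the property the paragraph above relies on.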
Figure 2 - WAL Writes in HorizonDB

Another key design principle for Azure HorizonDB was to move durability and high availability related work off PostgreSQL compute, allowing it to operate as a stateless compute engine for queries and transactions. This approach gives Postgres more CPU, disk and network to run your application's business logic. Table 1 summarizes the different tasks that community PostgreSQL has to do, which Azure HorizonDB moves to its storage layer. Work like dirty page writing and checkpointing is no longer done by a Postgres primary. The work of sending WAL files to read replicas is also moved off the primary and into the storage layer; having many read replicas puts no load on the Postgres primary in Azure HorizonDB. Backups are handled by Azure Storage via snapshots; Postgres isn't involved.

| Task | Resource Savings | Postgres Process Moved |
|---|---|---|
| WAL sending to Postgres replicas | Disk IO, Network IO | Walsender |
| WAL archiving to blob storage | Disk IO, Network IO | Archiver |
| WAL filtering | CPU, Network IO | Shared Storage Specific (*) |
| Dirty Page Writing | Disk IO | background writer |
| Checkpointing | Disk IO | checkpointer |
| PostgreSQL WAL recovery | Disk IO, CPU | startup recovering |
| PostgreSQL read replica redo | Disk IO, CPU | startup recovering |
| PostgreSQL read replica shared storage | Disk IO | background, checkpointer |
| Backups | Disk IO | pg_dump, pg_basebackup, pg_backup_start, pg_backup_stop |
| Full page writes | Disk IO | Backends doing WAL writing |
| Hot standby feedback | Vacuum accuracy | walreceiver |

Table 1 - Summary of work that the Azure HorizonDB storage layer takes over from PostgreSQL

The shared storage architecture of Azure HorizonDB is the fundamental building block for delivering exceptional read scalability and elasticity, which are critical for many workloads. Users can spin up read replicas instantly without requiring any data copies. Page Servers are able to scale and serve requests from all replicas without any additional storage costs.
Since WAL replication is entirely handled by the storage service, the primary's performance is not impacted as the number of replicas changes. Each read replica can scale independently to serve different workloads, allowing for workload isolation.

Finally, this architecture allows Azure HorizonDB to substantially improve the overall experience around high availability (HA). HA replicas can now be added without any data copying or storage costs. Since the data is shared between the replicas and continuously updated by Page Servers, secondary replicas only replay a portion of the WAL and can easily keep up with the primary, reducing failover times. The shared storage also guarantees that there is a single source of truth and the old primary never diverges after a failover. This prevents the need for expensive reconciliation, using pg_rewind, or other techniques and further improves availability.

Azure HorizonDB was designed from the ground up with learnings from large-scale customers, to meet the requirements of the most demanding workloads. The improved performance, scalability and availability of the Azure HorizonDB architecture make Azure a great destination for Postgres workloads.

Azure PostgreSQL Lesson Learned #14: Hitting the Max Storage Limits Blocking Further Scale-Up
Co-authored with HaiderZ-MSFT

Case Overview

A customer attempted to increase storage for their Azure Database for PostgreSQL Flexible Server but was unable to exceed 32 TiB. Investigation confirmed that the server was deployed in a region where the maximum supported storage is capped at 32 TiB, meaning no additional scale-up was possible. This limitation required exploring alternative storage options and potential redeployment strategies to support growing workload demands.

Symptoms: How the Problem Appears
• Storage size maxes out at 32 TiB
• Portal does not allow selecting any higher value
This behavior is expected because the platform enforces the maximum storage supported by the region and tier.

Root Cause: Regional Storage Limit of 32 TiB Reached

Azure Database for PostgreSQL Flexible Server supports up to 32 TiB of storage on Premium SSD within the customer's region. Once this limit is reached:
• Further storage growth is not possible
• Storage cannot be downgraded or changed in-place
• The customer must migrate to a tier that supports larger disks (e.g., Premium SSD v2, where available)

Premium SSD v2 supports:
• Up to 64 TiB
• 1 GiB granular sizing
• Higher throughput and IOPS
However, its availability and supported capabilities can vary by region.

Step-By-Step Troubleshooting & Migration Guidance

STEP 1 - Validate Current Storage
Azure Portal → Server → Compute + Storage
Confirms storage is already at the maximum the region can offer

STEP 2 - Confirm Tier Limitations
Check documentation for storage caps by tier and region from Storage options | Microsoft Learn

STEP 3 - Attempt a PITR-Based Redeployment
Migration from Premium SSD to Premium SSD v2 is possible by following these steps: https://learn.microsoft.com/en-us/azure/postgresql/compute-storage/concepts-storage-migrate-ssd-to-ssd-v2?tabs=portal-restore-custom-point

Important: The PITR wizard always deploys the restored server in the same region as the source server.
There is no option to change regions during PITR. Therefore, customers must check whether Premium SSD v2 becomes selectable during PITR within that same region. If SSD v2 is not listed in the dropdown, it means:
• The region does not support SSD v2 for PostgreSQL Flexible Server
• The customer cannot exceed 32 TiB on Flexible Server in that region

Final Outcome

The server reached the maximum supported storage (32 TiB) for its region and storage could not be increased further.
• To exceed this limit, the customer needs to move to Premium SSD v2 (if supported)
• Migration must be done by following these steps: https://learn.microsoft.com/en-us/azure/postgresql/compute-storage/concepts-storage-migrate-ssd-to-ssd-v2?tabs=portal-restore-custom-point

Tip: Alternative Path - Use Dump & Restore to a New Server With Larger Storage

If Premium SSD v2 does not appear as an available storage option during the Premium SSD to Premium SSD v2 migration workflow, the customer still has another viable path to exceed the 32 TiB limit.

Option: Perform a pg_dump / pg_restore to a New Server with Higher Storage Capacity
Data can be migrated by creating a brand-new Flexible Server in a region that supports higher storage tiers (including Premium SSD v2), then migrating the data using standard PostgreSQL backup tools.

Best Practices
• Add proactive storage alerts (70%, 80%, 90%).
• Validate regional storage limits before provisioning servers.
• Architect for growth by selecting a region/tier that aligns with future capacity needs.
• Request Premium SSD v2 quota increases early when planning large workloads.
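
As a small illustration of the first best practice, the tiered alert thresholds can be expressed as a simple check. This is a sketch only: the function name `storage_alert_level` is hypothetical, and in practice the used and provisioned figures would come from whatever monitoring source you already have rather than being passed in by hand.

```python
from typing import Optional, Sequence


def storage_alert_level(used_gib: float, provisioned_gib: float,
                        thresholds: Sequence[int] = (70, 80, 90)) -> Optional[int]:
    """Return the highest threshold (in percent) that usage has reached.

    Returns None while usage is below every threshold. The default
    thresholds mirror the 70% / 80% / 90% alerts suggested above.
    """
    if provisioned_gib <= 0:
        raise ValueError("provisioned_gib must be positive")
    used_pct = 100.0 * used_gib / provisioned_gib
    crossed = [t for t in sorted(thresholds) if used_pct >= t]
    return crossed[-1] if crossed else None
```

For a server provisioned at the 32 TiB cap (32768 GiB), the 70% alert corresponds to roughly 22.4 TiB used, which leaves time to plan a migration before growth is blocked.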
Helpful References
• Premium SSD v2 for Azure PostgreSQL Flexible Server: https://learn.microsoft.com/azure/postgresql/compute-storage/concepts-storage-premium-ssd-v2
• How to Migrate from Premium SSD to Premium SSD v2 (PITR): https://learn.microsoft.com/en-us/azure/postgresql/compute-storage/concepts-storage-migrate-ssd-to-ssd-v2?tabs=portal-restore-custom-point

From Oracle to Azure: How Quadrant Technologies accelerates migrations
This blog was authored by Manikyam Thukkapuram, Director, Alliances & Engineering at Quadrant Technologies; and Thiwagar Bhalaji, Migration Engineer and DevOps Architect at Quadrant Technologies

Over the past 20+ years, Quadrant Technologies has accelerated database modernization for hundreds of organizations. As momentum to the cloud continues to grow, a major focus for our business has been migrating on-premises Oracle databases to Azure. We've found that landing customers in Azure Database for PostgreSQL has been the best option in terms of both cost savings and efficiency. Azure Migrate is by far the best way to get them there. With Azure Migrate, we're able to shorten migrations that traditionally took months to weeks.

As a Microsoft solutions partner, we help customers migrate to Azure and develop Azure-based solutions. We're known as "the great modernization specialists" because many of our customers come to us with complex legacy footprints, outdated infrastructure, and monolithic applications that can be challenging to move to the cloud. But we excel at untangling these complex environments. And with our Q-Migrator tool, which is a wrapper around Azure Migrate, we're able to automate and accelerate these kinds of migrations.

Manual steps slowed down timelines

In general, each migration we lead includes a discovery phase, a compatibility assessment, and the migration execution. In discovery, we identify every server, database, and application in a customer's environment and map their interactions. Next, we assess each asset's readiness for Azure and plan for optimal cloud configurations. Finally, we bring the plan to life, integrating applications, moving workloads, and validating performance.

Before adopting Azure Migrate, each of these phases involved manual tasks for our team. During our discovery process we manually collected inventory and wrote custom scripts to track server relationships and database dependencies.
Our engineers also had to dig through configuration files and use third-party assessment tools for aspects like VM utilization and Oracle schema. When we mapped compatibility, we worked from static data to predict cost estimates and sizing, as opposed to operating from real-time telemetry.

By the time we reached the migration phase, fragmented tooling and inconsistent assessments made it difficult to maintain accuracy and efficiency. Hidden dependencies sometimes surfaced late in the process, causing unexpected rework and delays.

Streamlining migrations with Azure Migrate

To automate and streamline these manual tasks, we developed Q-Migrator, which is our in-house framework built around Azure Migrate. Now we can offer clients an efficient, agentless approach to discovery, assessment, and migration. As part of our on-premises database migration initiatives, we rely on Azure Migrate to seamlessly migrate a wide range of structured databases (including MySQL, Microsoft SQL Server, PostgreSQL, and Oracle) from on-premises environments to Azure IaaS and PaaS.

For instance, for an on-premises PostgreSQL migration, we begin by setting up an Azure Migrate appliance in the client's environment to automatically discover servers, databases, and applications. That generates a complete inventory and dependency map that identifies every relationship between servers and databases. From there, we run an assessment through Azure Migrate to check compatibility, identify blockers, and right-size target environments for Azure Database for PostgreSQL. By integrating Azure Database Migration Service (DMS), we can replicate data continuously until cutover, ensuring near-zero downtime. In addition, Azure DMS provides robust telemetry and analytics for deep visibility into every stage of the process.

This unified and automated workflow not only replaces manual steps but also increases reliability and accelerates delivery.
Teams benefit from a consolidated dashboard for planning, execution, and performance tracking, driving efficiency throughout the migration lifecycle.

75% faster deployment, 60% cost savings

Since implementing Azure Migrate, which now facilitates discovery and assessment for on-premises PostgreSQL workloads, we've accelerated deployment by 75% compared to traditional migration methods. We've also reduced costs for our clients by up to 60%. Automated discovery alone shortens that phase by nearly 40%, and dependency mapping now takes a fraction of the effort.

With the integrated dashboard in Azure Migrate we can also track progress across discovery, assessment, and migration in one place. This eliminates the need for multiple third-party tools. These efficiencies allow us to deliver complex migrations on tighter timelines without sacrificing quality or reliability.

Rounding out the modernization journey with AKS

As "the great modernization specialists," we're often asked which is the best database for landing Oracle workloads in the cloud. From our experience, Azure Database for PostgreSQL is ideal for enterprises seeking cost-efficient and secure PostgreSQL deployments. Its managed services reduce operational overhead while maintaining high availability, compliance, and scalability. Plus, seamless integration with Azure AI services allows us to innovate for clients and keep them ahead of the curve.

We also recognize that database migration is only the first step for many clients: modernizing the application layer delivers even greater scalability, security, and manageability. When clients come to Quadrant for a broader modernization strategy, we often use Azure Kubernetes Service (AKS) to containerize their applications and break monoliths into microservices. AKS delivers a cloud-native architecture alongside database modernization.
This integration supports DevOps practices, simplifies deployments, and allows customers to take full advantage of elastic cloud infrastructure.

More innovation to come

Overall, Azure Migrate and Azure Database for PostgreSQL, Azure Database for MySQL, and Azure SQL Database have redefined how we deliver database modernization, and our close collaboration with Microsoft has made it possible. By engaging early with Microsoft, we can validate migration architectures and gain insights into best practices for high-performance and secure cloud deployments. Access to Microsoft experts helps us fine-tune our designs, optimize performance, and resolve complex issues quickly.

We're also investing in AI-driven automation using Azure OpenAI in Foundry Models to analyze migration data, optimize queries, and predict performance outcomes. These innovations allow us to deliver more intelligent, adaptive solutions tailored to each customer's unique environment.

PostgreSQL for the enterprise: scale, secure, simplify
This week at Microsoft Ignite, along with unveiling the new Azure HorizonDB cloud native database service, we're announcing multiple improvements to our fully managed open-source Azure Database for PostgreSQL service, delivering significant advances in performance, analytics, security, and AI-assisted migration. Let's walk through nine of the top Azure Database for PostgreSQL features and improvements we're announcing at Microsoft Ignite 2025.

Feature Highlights
• New Intel and AMD v6-series SKUs (Preview)
• Scale to multiple nodes with Elastic Clusters (GA)
• PostgreSQL 18 (GA)
• Real-time analytics with Fabric Mirroring (GA)
• Analytical queries inside PostgreSQL with the pg_duckdb extension (Preview)
• Adding Parquet to the azure_storage extension (GA)
• Meet compliance requirements with the credcheck, anon & ip4r extensions (GA)
• Integrated identity with Entra token-refresh libraries for Python
• AI-Assisted Oracle to PostgreSQL Migration Tool (Preview)

Performance and scale

New Intel and AMD v6 series SKUs (Preview)

You can run your most demanding Postgres workloads on new Intel and AMD v6 General Purpose and Memory Optimized hardware SKUs, now available in preview. These SKUs deliver massive scale for high-performance OLTP, analytics and complex queries, with improved price performance and higher memory ceilings. AMD Confidential Compute v6 SKUs are also in Public Preview, enabling enhanced security for sensitive workloads while leveraging AMD's advanced hardware capabilities.

Here's what you need to know:
• Processors: Powered by 5th Gen Intel® Xeon® processors (code-named Emerald Rapids) and AMD's fourth-generation EPYC™ 9004 processors
• Scale: VM size options scale up to 192 vCores and 1.8 TiB
• IO: Using the NVMe protocol for data disk access, IO is parallelized to the number of CPU cores and processed more efficiently, offering significant IO improvements
• Compute tier: Available in our General Purpose and Memory Optimized tiers.
You can scale up to these new compute SKUs as needed with minimal downtime.

Learn more: Here's a quick summary of the v6 SKUs we're launching, with links to more information:

| Processor | SKU | Max vCores | Max Mem |
|---|---|---|---|
| Intel | Ddsv6 | 192 | 768 GiB |
| Intel | Edsv6 | 192 | 1.8 TiB |
| AMD | Dadsv6 | 96 | 384 GiB |
| AMD | Eadsv6 | 96 | 672 GiB |
| AMD | DCadsv6 | 96 | 386 GiB |
| AMD | ECadsv6 | 96 | 672 GiB |

Scale to multiple nodes with Elastic clusters (GA)

Elastic clusters are now generally available in Azure Database for PostgreSQL. Built on Citus open-source technology, elastic clusters bring the horizontal scaling of a distributed database to the enterprise features of Azure Database for PostgreSQL.

Elastic clusters enable horizontal scaling of databases running across multiple server nodes in a "shared nothing" architecture. This is ideal for workloads with high-throughput and storage-intensive demands such as multi-tenant SaaS and IoT-based workloads. Elastic clusters come with all the enterprise-level capabilities that organizations rely upon in Azure Database for PostgreSQL, including high availability, read replicas, private networking, integrated security and connection pooling. Built-in sharding support at both row and schema level enables you to distribute your data across a cluster of compute resources and run queries in parallel, dramatically increasing throughput and capacity.

Learn more: Elastic clusters in Azure Database for PostgreSQL

PostgreSQL 18 (GA)

When PostgreSQL 18 was released in September, we made a preview available on Azure on the same day. Now we're announcing that PostgreSQL 18 is generally available on Azure Database for PostgreSQL, with full Major Version Upgrade (MVU) support, marking our fastest-ever turnaround from open-source release to managed service general availability.
This release reinforces our commitment to delivering the latest PostgreSQL community innovations to Azure customers, so you can adopt the latest features, performance improvements, and security enhancements on a fully managed, production-ready platform without delay.

Note: MVU to PG18 is currently available in the NorthCentralUS and WestCentralUS regions, with additional regions being enabled over the next few weeks.

Now you can:
• Deploy PostgreSQL 18 in all public Azure regions.
• Perform in-place major version upgrades to PG18 with no endpoint or connection string changes.
• Use Microsoft Entra ID authentication for secure, centralized identity management in all PG versions.
• Enable Query Store and Index Tuning for built-in performance insights and automated optimization.
• Leverage the 90+ Postgres extensions supported by Azure Database for PostgreSQL.

PostgreSQL 18 also delivers major improvements under the hood, ranging from asynchronous I/O and enhanced vacuuming to improved indexing and partitioning, ensuring Azure continues to lead as the most performant, secure, and developer-friendly PostgreSQL managed service in the cloud.

Learn more:
• PostgreSQL 18 open-source release announcement
• Supported versions of PostgreSQL in Azure Database for PostgreSQL

Analytics

Real-time analytics with Fabric Mirroring (GA)

With Fabric mirroring in Azure Database for PostgreSQL, now generally available, you can run your Microsoft Fabric analytical workloads and capabilities on near-real-time replicated data, without impacting the performance of your production PostgreSQL databases, and at no extra cost.

Mirroring in Fabric connects your operational and analytical platforms with continuous data replication from PostgreSQL to Fabric. Transactions are mirrored to Fabric in near real-time, enabling advanced analytics, machine learning, and reporting on live data sets without waiting for traditional batch ETL processes to complete.
This approach eliminates the overhead of custom integrations or data pipelines. Production PostgreSQL servers can run mission-critical transactional workloads without being affected by surges in analytical queries and reporting.

With our GA announcement, Fabric mirroring is ready for production workloads, with secure networking (VNET integration and Private Endpoints supported), Entra ID authentication for centralized identity management, and support for high availability enabled servers, ensuring business continuity for mirroring sessions.

Learn more: Mirroring Azure Database for PostgreSQL flexible server

Adding Parquet support to the azure_storage extension (GA)

In addition to mirroring data directly to Microsoft Fabric, there are many other scenarios that require moving operational data into data lakes for analytics or archival. Building and maintaining ETL pipelines for this can be expensive and time-consuming. Azure Database for PostgreSQL now natively supports Parquet via the azure_storage extension, enabling direct SQL-based read/write to Parquet files in Azure Storage. This makes it easy to import and export data in Postgres without external tools or scripts.

Parquet is a popular columnar storage format often used in big data and analytics environments (like Spark and Azure Data Lake) because of its efficient compression and query performance for large datasets. Now you can use the azure_storage extension to skip an entire step: just issue a SQL command to write to and query from a Parquet file in Azure Blob Storage.

Learn more: Azure storage extension in Azure Database for PostgreSQL

Analytical queries inside PostgreSQL with the pg_duckdb extension (Preview)

DuckDB's columnar engine excels at high-performance scans, aggregations and joins over large tables, making it particularly well-suited for analytical queries.
The pg_duckdb extension, now available in preview for Azure Database for PostgreSQL, combines PostgreSQL's transactional performance and reliability with DuckDB's analytical speed for large datasets. Together, pg_duckdb and PostgreSQL are an ideal combination for hybrid OLTP + OLAP environments where you need to run analytical queries directly in PostgreSQL without sacrificing performance.

To see the pg_duckdb extension in action, check out this demo video: https://aka.ms/pg_duckdb

Learn more: pg_duckdb - PostgreSQL extension for DuckDB

Security

Meet compliance requirements with the credcheck, anon & ip4r extensions (GA)

Operating in a regulated industry such as finance, healthcare or government means meeting compliance requirements like HIPAA, PCI-DSS, and GDPR that include protection for personal data as well as rules for password complexity, expiration and reuse.

This week the anon extension, previously in preview, becomes generally available for Azure Database for PostgreSQL, adding support for dynamic and static masking, anonymized exports, randomization and many other advanced masking techniques.

We've also added GA support for the credcheck extension, which provides credential checks for usernames and password complexity, including during user creation, password change and user renaming. This is particularly useful if your application is not using Entra ID and needs to rely on native PostgreSQL users and passwords.

If you need to store and query IP ranges for scenarios like auditing, compliance, access control lists, intrusion detection and threat intelligence, another useful extension announced this week is ip4r, which provides a set of data types for IPv4 and IPv6 network addresses.
Learn more:
• PostgreSQL Anonymizer
• credcheck - PostgreSQL username/password checks
• IP4R - IPv4/v6 and IPv4/v6 range index type for PostgreSQL

The Azure team maintains an active pipeline of new PostgreSQL extensions to onboard and upgrade in Azure Database for PostgreSQL. For example, another important extension upgraded this week is pg_squeeze, which removes unused space from a table. The updated 1.9.1 version adds important stability improvements.

Learn more: List of extensions and modules by name

Integrated identity with Entra token-refresh libraries for Python

In a modern cloud-connected enterprise, identity becomes the most important security perimeter. Azure Database for PostgreSQL is the only managed PostgreSQL service with full Entra integration, but coding applications to take care of Entra token refresh can be complex. This week we're announcing a new Python library to simplify Entra token refresh. The library automatically refreshes authentication tokens before they expire, eliminating manual token handling and reducing connection failures.

The new python_azure_pg_auth library provides seamless Microsoft Entra ID authentication and supports the latest psycopg and SQLAlchemy drivers with automatic token acquisition, validation, and refresh. Built-in connection pooling is available for both synchronous and asynchronous workloads. Designed for cross-platform use (Windows, Linux, macOS), the package features clean architecture and flexible installation options for different driver combinations.

This is our first milestone in a roadmap to add token refresh for additional programming languages and frameworks.

Learn more, with code samples to get started, here: https://aka.ms/python-azure-pg-auth

Migration

AI-Assisted Oracle to PostgreSQL Migration Tool (Preview)

Database migration is a challenging and time-consuming process, with multiple manual steps requiring schema- and application-specific information.
The growing popularity, maturity, and low cost of PostgreSQL have led to healthy demand for migration tooling to simplify these steps. The new AI-assisted Oracle Migration Tool preview announced this week greatly simplifies moving from Oracle databases to Azure Database for PostgreSQL. Available in the VS Code PostgreSQL extension, the new migration tool combines GitHub Copilot, Azure OpenAI, and custom Language Model Tools to convert Oracle schema, database code, and client applications into PostgreSQL-compatible formats. Unlike traditional migration tools that rely on static rules, Azure's approach leverages Large Language Models (LLMs) and validates every change against a running Azure Database for PostgreSQL instance. This system not only translates syntax but also detects and fixes errors through iterative re-compilation, flagging any items that require human review. Application codebases built on Spring Boot and other popular frameworks are refactored and converted. The system also understands context by querying the target Postgres instance for its version and installed extensions. It can even invoke capabilities from other VS Code extensions to validate the converted code. The new AI-assisted workflow reduces risk, eliminates significant manual effort, and enables faster modernization while lowering costs.
Learn more: https://aka.ms/pg-migration-tooling
Be sure to follow the Microsoft Blog for PostgreSQL for regular updates from the Postgres on Azure team at Microsoft. We publish monthly recaps about new features in Azure Database for PostgreSQL, as well as an annual blog about what's new in Postgres at Microsoft.
Performance Tuning for CDC: Managing Replication Lag in Azure Database for PostgreSQL with Debezium
Written By: Shashikant Shakya, Ashutosh Bapat, and Guangnan Shi
The Problem
Picture this: your CDC pipeline is running smoothly, streaming changes from PostgreSQL to Kafka. Then, a bulk update hits millions of rows. Suddenly, Kafka queues pile up, downstream systems lag, and dashboards go stale. Why does replication lag spike during heavy operations? And what can you do about it?
Why This Matters
Change Data Capture (CDC) powers real-time integrations, pushing row-level changes from OLTP systems into event streams, data lakes, caches, and microservices. Debezium is a leading open-source CDC engine for PostgreSQL, and many teams successfully run Debezium against Azure Database for PostgreSQL to keep downstream systems synchronized. However, during large DML operations (bulk updates, deletes) or schema changes (DDL), replication lag can occur because:
• Debezium consumes WAL more slowly than the database produces it
• Kafka throughput dips
• Consumers fall behind
This article explains why lag happens, grounded in logical decoding internals, and shows how to diagnose it quickly and what to tune across the database, Azure, and connector layers to keep pipelines healthy under heavy load.
CDC Basics
CDC streams incremental changes (INSERT/UPDATE/DELETE) from your source database to downstream systems in near real-time. In PostgreSQL, CDC is typically implemented using logical decoding and logical replication:
• PostgreSQL records every change in the Write-Ahead Log (WAL)
• The walsender process reads WAL and decodes it into change events
• The pgoutput output plugin formats those changes, while Debezium subscribes and publishes them to Kafka topics
Benefits of CDC:
• Low latency
• Lower source overhead than periodic full extracts
• Preserves transactional ordering for consumers
The Internals: Why Lag Happens
Replication lag during heavy operations isn't random; it's rooted in how PostgreSQL handles logical decoding.
To understand why, let's look at the components that process changes and what happens when they hit resource limits.
Logical Decoding & ReorderBuffer
Logical decoding reconstructs transaction-level changes so they can be delivered in commit order. The core component enabling this is the ReorderBuffer.
What ReorderBuffer does:
• Reads WAL and groups changes per transaction, keeping them in memory until commit
• If memory exceeds logical_decoding_work_mem, PostgreSQL spills decoded changes to disk in per-slot spill files
• On commit, it reads back spilled data and emits changes to the client (via pgoutput → Debezium)
Disk Spill Mechanics (Deep Dive)
When a transaction is too large for memory:
• PostgreSQL writes decoded changes to spill files under pg_replslot/<slot_name>/
• Wait events like ReorderBufferWrite and ReorderBufferRead dominate during heavy load
• Spills to disk increase latency because disk I/O is far slower than memory access
Analogy: Think of ReorderBuffer as a warehouse staging floor:
• Small shipments move quickly in memory
• A huge shipment forces workers to move boxes offsite (spill-to-disk), then bring them back later, slowing everything down
Why Multiple Slots Amplify the Impact
• The WAL is shared by all slots
• Each slot decodes the entire WAL stream because filtering happens after decoding
• Result: A single large transaction affects every slot, multiplying replication lag
Recommendation:
• Minimize the number of slots/connectors
• Remember: logical_decoding_work_mem applies per slot, not globally
Impact Snapshot:
Scenario | Spill Size | I/O Impact
1 Slot | 1 GB | 1× I/O
5 Slots | 1 GB × 5 | 5× I/O
Lifecycle: WAL → ReorderBuffer → Memory → Spill to Disk → Read Back → Send to Client
How to Detect Spills and Lag
Detection should be quick and repeatable.
Start by confirming slot activity and LSN distance (how far producers are ahead of consumers), then check walsender wait events to see if decoding is stalling, and finally inspect per-slot spill metrics to quantify memory overflow to disk.
1. Active slots and lag
Use this to measure how far each logical slot is behind the current WAL. A large lsn_distance indicates backlog. If restart_lsn is far behind, the server must retain more WAL on disk, increasing storage pressure.
    -- Distance between the current WAL position and what each slot has confirmed
    SELECT slot_name,
           active_pid,
           confirmed_flush_lsn,
           restart_lsn,
           pg_current_wal_lsn(),
           pg_size_pretty(pg_current_wal_lsn() - confirmed_flush_lsn) AS lsn_distance
    FROM pg_replication_slots;
Interpretation: Focus on slots with the largest lsn_distance. If active_pid is NULL, the slot isn't currently consuming; investigate connector health or connectivity.
2. Wait events for walsender
Check whether the WAL sender backends are stalled on decoding or I/O. ReorderBuffer-related waits typically point to spill-to-disk conditions or slow downstream consumption.
    -- What each walsender backend is currently waiting on
    SELECT pid,
           backend_type,
           application_name,
           wait_event
    FROM pg_stat_activity
    WHERE backend_type = 'walsender'
    ORDER BY backend_start;
Interpretation: Frequent ReorderBufferWrite / ReorderBufferRead waits suggest large transactions are spilling.
3. Spill stats
Quantify how often and how much each slot spills from memory to disk. Rising spill_bytes and spill_count during heavy DML are strong signals to increase logical_decoding_work_mem, reduce transaction size, or tune connector throughput.
    -- Per-slot spill counters since the last stats reset
    SELECT slot_name,
           spill_txns,
           spill_count,
           pg_size_pretty(spill_bytes) AS spill_bytes,
           total_txns,
           pg_size_pretty(total_bytes) AS total_bytes,
           stats_reset
    FROM pg_stat_replication_slots;
Interpretation: Compare spill_bytes across slots; if many slots spill simultaneously, aggregate I/O multiplies. Consider reducing the number of active slots or batching large DML.
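The lsn_distance column in the first query comes from subtracting two LSNs. As a minimal sketch of that arithmetic outside the database, the Python below converts the textual pg_lsn form (two hexadecimal halves separated by a slash) into a byte position and flags slots that are either far behind or inactive. The function names, the threshold, and the sample rows in the usage note are hypothetical and only illustrate the idea; it also assumes every row has a non-NULL confirmed_flush_lsn, which holds for logical slots.

```python
from typing import Iterable, List, Optional, Tuple


def lsn_to_int(lsn: str) -> int:
    """Convert a textual pg_lsn such as '16/B374D848' to a 64-bit byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the high 32 bits, the part after is the low 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(current_wal_lsn: str, confirmed_flush_lsn: str) -> int:
    """Bytes of WAL the consumer has not yet confirmed."""
    return lsn_to_int(current_wal_lsn) - lsn_to_int(confirmed_flush_lsn)


def slots_needing_attention(
    rows: Iterable[Tuple[str, Optional[int], str]],
    current_wal_lsn: str,
    threshold_bytes: int,
) -> List[Tuple[str, int, bool]]:
    """Return (slot_name, lag_in_bytes, is_inactive) for slots worth a look.

    rows holds (slot_name, active_pid, confirmed_flush_lsn) tuples, for example
    as fetched from pg_replication_slots. A slot is flagged when its lag reaches
    the (hypothetical) threshold or when active_pid is None, meaning nothing is
    consuming from it right now.
    """
    flagged = []
    for slot_name, active_pid, confirmed_flush_lsn in rows:
        lag = lag_bytes(current_wal_lsn, confirmed_flush_lsn)
        inactive = active_pid is None
        if lag >= threshold_bytes or inactive:
            flagged.append((slot_name, lag, inactive))
    # Largest backlog first, matching the "focus on the largest lsn_distance" advice.
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

For instance, with made-up rows `[("a", 10, "0/0"), ("b", None, "0/F0"), ("c", 11, "0/FF")]`, a current LSN of `"0/100"`, and a 32-byte threshold, slot `a` is flagged for lag, slot `b` is flagged because it is inactive, and slot `c` is left alone.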
Fixing the Lag: Practical Strategies
Once you've identified replication lag and its root causes, the next step is mitigation. Solutions span the database configuration, Azure infrastructure, and the Debezium connector layer. These strategies aim to reduce I/O overhead, optimize memory usage, and ensure smooth data flow under heavy workloads.
Database & Azure Layer
At the database and infrastructure level, focus on reducing unnecessary overhead and ensuring resources are scaled for peak demand. Here's what you can do:
• Avoid REPLICA IDENTITY FULL: prefer the primary key, or add a unique index and set REPLICA IDENTITY USING INDEX
• Use appropriately scaled, I/O-capable storage and the right SKU for higher IOPS
• Right-size logical_decoding_work_mem, considering multiple slots
• Break up large DML: batch updates/deletes (10k to 50k rows per commit)
• Schedule/throttle maintenance: stagger VACUUM/REINDEX/DDL
• Network placement: use Private Endpoint and co-locate Debezium/Kafka within the same region/VNet
Debezium Connector Layer
Connector-level tuning ensures that Debezium can keep pace with PostgreSQL WAL generation and Kafka throughput. Key adjustments include:
• Tune throughput & buffering: increase max.batch.size and max.queue.size; reduce poll.interval.ms
• Offset flush tuning: reduce offset.flush.interval.ms
• Heartbeats: introduce heartbeat events to detect staleness and prevent WAL buildup
Conclusion
Managing replication lag in Azure Database for PostgreSQL with Debezium isn't just about tweaking parameters; it's about understanding logical decoding internals, anticipating workload patterns, and applying proactive strategies across the entire solution.
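To make the "break up large DML" recommendation from the Database & Azure Layer list concrete, here is a minimal sketch of one way to slice a bulk update into key ranges that are committed one at a time, so each decoded transaction stays small and is less likely to exceed logical_decoding_work_mem. Everything named here is an assumption for illustration: the orders table, its integer id key, the archived column, and the execute_and_commit callback, which stands in for whatever database driver you use to run one statement and commit it.

```python
from typing import Callable, Iterator, Tuple

# Hypothetical statement: table, column, and key names are made up for this sketch.
UPDATE_SQL = "UPDATE orders SET archived = true WHERE id BETWEEN %s AND %s"


def id_batches(min_id: int, max_id: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (low, high) key ranges covering min_id..max_id.

    If the key has gaps, a range may touch fewer than batch_size rows,
    but never more.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield start, end
        start = end + 1


def run_in_batches(
    execute_and_commit: Callable[[str, Tuple[int, int]], None],
    min_id: int,
    max_id: int,
    batch_size: int = 10_000,
) -> int:
    """Run UPDATE_SQL once per key range and return the number of batches.

    execute_and_commit must execute the statement with the given parameters
    and commit before returning, so every batch is its own transaction.
    """
    batches = 0
    for low, high in id_batches(min_id, max_id, batch_size):
        execute_and_commit(UPDATE_SQL, (low, high))
        batches += 1
    return batches
```

The default of 10,000 sits at the low end of the 10k to 50k range suggested above; the right value depends on row width and on how much memory each slot has for decoding. A short pause between batches, not shown here, is one way to throttle further if consumers still fall behind.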
Key Takeaways:
• Monitor early, act fast: Use diagnostic queries to track lag, wait events, and spill activity
• Minimize complexity: Fewer replication slots and well-tuned connectors reduce redundant work
• Plan for scale: Batch large DML operations, right-size memory settings
• Leverage Azure capabilities: Optimize IOPS tiers and network placement for predictable performance
By combining these best practices with continuous monitoring and operational discipline, you can keep your CDC pipelines healthy, even under heavy load, while ensuring downstream systems stay in sync with minimal latency.
Further Reading
• Azure Database for PostgreSQL Flexible Server Overview
• PostgreSQL Logical Replication
• Debezium PostgreSQL Connector
November 2025 Recap: PostgreSQL on Azure
Hello Azure Community, November was an exciting month for PostgreSQL on Azure, packed with announcements at Microsoft Ignite 2025. In this recap, we'll walk you through the highlights, from feature recaps to deep-dive sessions, so you can catch up on everything you might have missed. If you couldn't join us live, here are some of the key sessions now available on demand:
• Modern data modern apps: Innovation with Microsoft Databases
• AI-assisted migration: The path to powerful performance on PostgreSQL
• Azure HorizonDB: Deep Dive into a New Enterprise-Scale PostgreSQL
• The blueprint for intelligent AI agents backed by PostgreSQL
• Nasdaq Boardvantage: AI-driven governance on PostgreSQL and Microsoft Foundry
We also introduced major updates, including the Azure HorizonDB preview with AI capabilities and new features for Azure Database for PostgreSQL that make migrations faster, deployments smarter, and performance more predictable. The blog is organized into the following sections:
• Azure HorizonDB (Preview)
• Azure Database for PostgreSQL feature announcements
• Azure HorizonDB: AI features & developer tools
• Photo Gallery from Ignite
Azure HorizonDB (Preview)
If it's not obvious, the introduction of Azure HorizonDB is a big deal. This brand-new, fully managed PostgreSQL service is built for mission-critical workloads and modern AI applications, bringing cloud-native scale, ultra-low latency, and deep Azure integration in one powerful offering. Here are some of the features that we offer with Azure HorizonDB:
• Scale-out compute architecture supporting up to 3,072 vCores across primary and replica nodes.
• Auto-scaling shared storage that handles databases up to 128 TB, while achieving sub-millisecond multi-zone commit latencies.
• Breakthrough throughput up to 3× higher than open-source PostgreSQL for transactional workloads, powered by our storage innovations.
Learn more about Azure HorizonDB in our detailed blog.
Azure Database for PostgreSQL feature announcements
We introduced a wave of new capabilities focusing on performance, analytics, security, and AI-assisted migration for Azure Database for PostgreSQL. Among the key general availability announcements were PostgreSQL 18, Fabric mirroring, Elastic clusters, and support for Parquet in the azure_storage extension. We also unveiled several exciting preview features, including Intel and AMD v6-series SKUs, the pg_duckdb extension, and enhanced tooling for Oracle-to-PostgreSQL migrations. All these updates are captured in our blog post; explore the full list to learn more.
Azure HorizonDB: AI features & developer tools
Azure HorizonDB isn't just built for enterprise-scale workloads; it's also designed to power next-generation AI applications. At Ignite, we introduced advanced AI capabilities including DiskANN with Advanced Filtering, built-in AI model management, and Microsoft Foundry integration.
• DiskANN Advanced Filtering reduces query latency by up to 3×, depending on filter complexity.
• AI Model Management enables developers to set up semantic operators directly within the PostgreSQL environment, simplifying AI workflows.
• Microsoft Foundry Integration adds a PostgreSQL connector, allowing Foundry agents to interact with HorizonDB securely using natural language instead of SQL.
General Availability of PostgreSQL extension for VS Code
We announced the general availability of the PostgreSQL extension for VS Code, making development faster and more intuitive. The PostgreSQL extension for VS Code now has over 300K downloads from the Visual Studio Marketplace! This extension makes it easier for developers to interact with any PostgreSQL database. To learn more about these AI features in Azure HorizonDB, check out our blog post.
Photo Gallery from Microsoft Ignite
Ignite 2025 brought a lot of great sessions, announcements, and hands-on demos.
Here's a quick photo recap of some key moments, from technical deep dives to product launches to hearing about real-world impact from our amazing customer speakers.
POSETTE CFP Now Open
We are excited to announce that the Call for Proposals (CFP) for POSETTE: An Event for Postgres 2026 is now open! We're inviting speakers, practitioners, educators, and community contributors to share their knowledge through talks and demos. If you're passionate about PostgreSQL, open-source innovation, or building resilient data systems, we'd love to see your submission.
CFP Link: https://posetteconf.com/2026/cfp/