August 2025 Recap: Azure Database for PostgreSQL
Hello Azure Community,

August was an exciting month for Azure Database for PostgreSQL! We introduced updates that make your experience smarter and more secure. From simplified Microsoft Entra ID group login to integrations with LangChain and LangGraph, these updates improve access control and enable seamless integration for your AI agents and applications. Stay tuned as we dive deeper into each of these feature updates.

Feature Highlights
- Enhanced Performance recommendations for Azure Advisor - Generally Available
- Entra ID group login using user credentials - Public Preview
- New Region Buildout: Austria East
- LangChain and LangGraph connector
- Active-Active Replication Guide

Enhanced Performance recommendations for Azure Advisor - Generally Available

Azure Advisor now offers enhanced recommendations to further optimize PostgreSQL server performance, security, and resource management. The key updates are:
- Index scan insights: detection of disabled index and index-only scans, with recommendations to improve query efficiency.
- Audit logging review: identification of excessive logging via the pgaudit.log parameter, with guidance to reduce overhead.
- Statistics monitoring: alerts on server statistics resets and suggestions to restore accurate performance tracking.
- Storage optimization: analysis of storage usage with recommendations to enable the Storage Autogrow feature for seamless scaling.
- Connection management: evaluation of workloads for short-lived connections and frequent connectivity errors, with recommendations to implement PgBouncer for efficient connection pooling.

These enhancements provide deeper operational insights and support proactive performance tuning for PostgreSQL workloads. For more details, read the Performance recommendations documentation.

Entra ID group login using user credentials - Public Preview

The public preview of Microsoft Entra ID group login using user credentials is now available. This feature simplifies user management and improves security in Azure Database for PostgreSQL. Administrators and users benefit from a more streamlined process:
- Changes in Entra ID group memberships are synchronized every 30 minutes. This scheduled sync keeps access controls up to date, simplifying user management and maintaining current permissions.
- Users can log in with their own credentials, streamlining authentication and improving auditing and access management for PostgreSQL environments.

As organizations continue to adopt cloud-native identity solutions, this update represents a major improvement in operational efficiency and security for PostgreSQL database environments.

New Region Buildout: Austria East

New region rollout! Azure Database for PostgreSQL flexible server is now available in Austria East, giving customers in and around the region lower latency and data residency options. This continues our mission to bring Azure PostgreSQL closer to where you build and run your apps. For the full list of regions, visit: Azure Database for PostgreSQL Regions.

LangChain and LangGraph connector

We are excited to announce that native LangChain and LangGraph support is now available for Azure Database for PostgreSQL! This integration brings native support for Azure Database for PostgreSQL into LangChain and LangGraph workflows, enabling developers to use Azure PostgreSQL as a secure, high-performance vector store and memory store for their AI agents and applications.
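For context on what such a vector store looks like at the database level, here is a minimal SQL sketch. The table and column names are hypothetical, and it assumes the vector (pgvector), pg_diskann, and azure_ai extensions are allow-listed and configured on your flexible server:

CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_diskann;

-- hypothetical table holding document chunks and their embeddings
CREATE TABLE chunks (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    content   text NOT NULL,
    embedding vector(1536)   -- dimension depends on your embedding model
);

-- DiskANN index for approximate nearest-neighbor search over the embeddings
CREATE INDEX chunks_embedding_diskann ON chunks
    USING diskann (embedding vector_cosine_ops);

-- top-5 chunks most similar to a natural-language question
SELECT id, content
FROM chunks
ORDER BY embedding <=> azure_openai.create_embeddings(
    'text-embedding-3-small', 'how do I pair the headphones?'
)::vector
LIMIT 5;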
Specifically, this package adds support for:
- Microsoft Entra ID (formerly Azure AD) authentication when connecting to your Azure Database for PostgreSQL instances, and
- the DiskANN indexing algorithm when indexing your (semantic) vectors.

This package makes it easy to connect LangChain to your Azure-hosted PostgreSQL instances, whether you're building intelligent agents, semantic search, or retrieval-augmented generation (RAG) systems. Read more at https://aka.ms/azpg-agent-frameworks

Active-Active Replication Guide

We have published a new blog article that guides you through setting up active-active replication in Azure Database for PostgreSQL using the pglogical extension. The walkthrough covers the fundamentals of active-active replication, key prerequisites for enabling bi-directional replication, and step-by-step demo scripts for the setup. It also compares the native and pglogical approaches, helping you choose the right strategy for high availability and multi-region resilience in production environments. Read more about the active-active replication guide on this blog.
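To give a rough idea of what the guide walks through, here is a hedged sketch of one direction of a pglogical pair. Server names, database, and credentials are placeholders; it assumes pglogical is allow-listed and logical replication is enabled on both servers, and active-active additionally requires the reverse subscription plus conflict handling, which the guide covers:

-- on the provider (server A)
CREATE EXTENSION IF NOT EXISTS pglogical;

SELECT pglogical.create_node(
    node_name := 'node_a',
    dsn := 'host=server-a.postgres.database.azure.com port=5432 dbname=appdb user=replicator password=***'
);

-- replicate every table in the public schema
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);

-- on the subscriber (server B), after creating its own node the same way
SELECT pglogical.create_subscription(
    subscription_name := 'sub_a_to_b',
    provider_dsn := 'host=server-a.postgres.database.azure.com port=5432 dbname=appdb user=replicator password=***'
);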
Azure Postgres Learning Bytes 🎓

Enabling Zone-Redundant High Availability for Azure Database for PostgreSQL Flexible Server Using APIs

High availability (HA) is essential for ensuring business continuity and minimizing downtime in production workloads. With zone-redundant HA, Azure Database for PostgreSQL flexible server automatically provisions a standby replica in a different availability zone, providing stronger fault tolerance against zone-level failures. This section shows how to enable zone-redundant HA using the REST APIs. Working with the REST APIs gives you clear visibility into the exact requests and responses, making it easier to debug issues and validate configurations as you go. You can use any REST API client of your choice (Postman, the Thunder Client VS Code extension, curl, etc.) to send the requests and inspect the results directly.

Before enabling zone-redundant HA, make sure your server is on the General Purpose or Memory Optimized tier and is deployed in a region that supports it. If your server is currently using same-zone HA, you must first disable it before switching to zone-redundant HA.

Steps to enable zone-redundant HA:

1. Get an ARM bearer token. Run this in a terminal where the Azure CLI is signed in (or use Azure Cloud Shell):

   az account get-access-token --resource https://management.azure.com --query accessToken -o tsv

2. Paste the token into your API client's Authorization header: `Bearer <token>`

3. Inspect the server (GET) using the following URL:

   https://management.azure.com/subscriptions/{{subscriptionId}}/resourceGroups/{{resourceGroup}}/providers/Microsoft.DBforPostgreSQL/flexibleServers/{{serverName}}?api-version={{apiVersion}}

   In the JSON response, note:
   - sku.tier → must be 'GeneralPurpose' or 'MemoryOptimized'
   - properties.availabilityZone → '1', '2', or '3' (the availability zone specified when the primary server was created; the system selects one if none was specified)
   - properties.highAvailability.mode → 'Disabled', 'SameZone', or 'ZoneRedundant'
   - properties.highAvailability.state → e.g. 'NotEnabled', 'CreatingStandby', 'Healthy'

4. If HA is currently SameZone, disable it first (PATCH). Use the same URL as in step 3, with this request body:

   { "properties": { "highAvailability": { "mode": "Disabled" } } }

5. Enable zone-redundant HA (PATCH). Use the same URL as in step 3, with this request body:

   { "properties": { "highAvailability": { "mode": "ZoneRedundant" } } }

6. Monitor until HA is healthy. Re-run the GET from step 3 every 30-60 seconds until you see:

   "highAvailability": { "mode": "ZoneRedundant", "state": "Healthy" }

Conclusion

That’s all for our August 2025 feature updates! We’re committed to making Azure Database for PostgreSQL better with every release, and your feedback plays a key role in shaping what’s next.

💬 Have ideas, questions, or suggestions? Share them with us: https://aka.ms/pgfeedback

📢 Want to stay informed about the latest features and best practices? Follow us here for the latest announcements, feature releases, and best practices: Azure Database for PostgreSQL Blog

More exciting improvements are on the way—stay tuned for what’s coming next!

July 2025 Recap: Azure Database for PostgreSQL
Hello Azure Community,

July delivered a wave of exciting updates to Azure Database for PostgreSQL! From Fabric mirroring support for private networking to cascading read replicas, these new features are all about scaling smarter, performing faster, and building better. This blog covers what’s new, why it matters, and how to get started.

Catch Up on POSETTE 2025

In case you missed POSETTE: An Event for Postgres 2025 or couldn't watch all of the sessions live, here's a playlist with the 11 talks all about Azure Database for PostgreSQL. And if you'd like to dive even deeper, the Ultimate Guide will help you navigate the full catalog of 42 recorded talks published on YouTube.

Feature Highlights
- Upsert and Script activity in ADF and Azure Synapse – Generally Available
- Power BI Entra authentication support – Generally Available
- New Regions: Malaysia West & Chile Central
- Latest Postgres minor versions: 17.5, 16.9, 15.13, 14.18 and 13.21
- Cascading Read Replica – Public Preview
- Private Endpoint and VNet support for Fabric Mirroring - Public Preview
- Agentic Web with NLWeb and PostgreSQL
- PostgreSQL for VS Code extension enhancements
- Improved Maintenance Workflow for Stopped Instances

Upsert and Script activity in ADF and Azure Synapse – Generally Available

We’re excited to announce the general availability of the Upsert method and Script activity in Azure Data Factory and Azure Synapse Analytics for Azure Database for PostgreSQL. These new capabilities bring greater flexibility and performance to your data pipelines:
- Upsert method: easily merge incoming data into existing PostgreSQL tables without writing complex logic, reducing overhead and improving efficiency.
- Script activity: run custom SQL scripts as part of your workflows, enabling advanced transformations, procedural logic, and fine-grained control over data operations.

Together, these features streamline ETL and ELT processes, making it easier to build scalable, declarative, and robust data integration solutions using PostgreSQL as either a source or a sink. Visit our documentation guide for the Upsert method and Script activity to learn more.

Power BI Entra authentication support – Generally Available

You can now use Microsoft Entra ID authentication to connect to Azure Database for PostgreSQL from Power BI Desktop. This update simplifies access management, enhances security, and helps you support your organization’s broader Entra-based authentication strategy. To learn more, please refer to our documentation.

New Regions: Malaysia West & Chile Central

Azure Database for PostgreSQL has now launched in Malaysia West and Chile Central. This expanded regional presence brings lower latency, enhanced performance, and data residency support, making it easier to build fast, reliable, and compliant applications right where your users are. This continues our mission to bring Azure Database for PostgreSQL closer to where you build and run your apps. For the full list of regions, visit: Azure Database for PostgreSQL Regions.

Latest Postgres minor versions: 17.5, 16.9, 15.13, 14.18 and 13.21

The latest PostgreSQL minor versions 17.5, 16.9, 15.13, 14.18 and 13.21 are now supported by Azure Database for PostgreSQL flexible server. These minor version upgrades are performed automatically as part of the monthly planned maintenance in Azure Database for PostgreSQL. This upgrade automation ensures that your databases are always running the most secure and optimized versions without requiring manual intervention.
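If you want to confirm which minor version a particular server is running after its maintenance window, a quick check from any SQL client is enough, for example:

-- full version string, including the minor version
SELECT version();

-- or just the version number
SHOW server_version;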
This release fixes two security vulnerabilities and includes over 40 bug fixes and improvements. To learn more, please refer to the PostgreSQL community announcement for more details about the release.

Cascading Read Replica – Public Preview

Azure Database for PostgreSQL now supports cascading read replicas in public preview. This feature allows you to scale read-intensive workloads more effectively by creating replicas not only from the primary database but also from existing read replicas, enabling two-level replication chains. With cascading read replicas, you can:
- Improve performance for read-heavy applications.
- Distribute read traffic more efficiently.
- Support complex deployment topologies.

Data replication is asynchronous, and each replica can serve as a source for additional replicas. This setup enhances scalability and flexibility for your PostgreSQL deployments. For more details, read the cascading read replicas documentation.

Private Endpoint and VNet Support for Fabric Mirroring - Public Preview

Microsoft Fabric now supports mirroring for Azure Database for PostgreSQL flexible server instances deployed with virtual network (VNet) integration or Private Endpoints. This enhancement broadens the scope of Fabric’s real-time data replication capabilities, enabling secure and seamless analytics on transactional data, even within network-isolated environments. Previously, mirroring was only available for flexible server instances with public endpoint access. With this update, organizations can now replicate data from Azure Database for PostgreSQL hosted in secure, private networks without compromising on data security, compliance, or performance. This is particularly valuable for enterprise customers who rely on VNets and Private Endpoints for database connectivity from isolated networks. For more details, visit the Fabric mirroring with private networking support blog.

Agentic Web with NLWeb and PostgreSQL

We’re excited to announce that NLWeb (Natural Language Web), Microsoft’s open project for natural language interfaces on websites, now supports PostgreSQL. With this enhancement, developers can use PostgreSQL and NLWeb to transform any website into an AI-powered application or Model Context Protocol (MCP) server. This integration allows organizations to use a familiar, robust database as the foundation for conversational AI experiences, streamlining deployment and maximizing data security and scalability. For more details, read the Agentic web with NLWeb and PostgreSQL blog.

PostgreSQL for VS Code extension enhancements

The PostgreSQL for VS Code extension is rolling out updates that improve connections, authentication, and usability. Here’s what we improved:
- SSH connections: you can now set up SSH tunneling directly in the Advanced Connection options, making it easier to securely connect to private networks without leaving VS Code.
- Clearer authentication setup: a new “No Password” option eliminates guesswork when setting up connections that don’t require credentials.
- Entra ID fixes: improved default username handling, token refresh, and clearer error feedback for failed connections.
- Array and character rendering: Unicode and PostgreSQL arrays now display more reliably and consistently.
- Azure Portal flow: reuses existing connection profiles to avoid duplicates when launching from the portal.
Don’t forget to update to the latest version in the Marketplace to take advantage of these enhancements, and visit our GitHub to learn more about this month’s release.

Improved Maintenance Workflow for Stopped Instances

We’ve improved how scheduled maintenance is handled for stopped or disabled PostgreSQL servers. Maintenance is now applied only when the server is restarted - either manually or through the 7-day auto-restart - rather than forcing a restart during the scheduled maintenance window. This change reduces unnecessary disruptions and gives you more control over when updates are applied. You may notice a slightly longer restart time (5–8 minutes) if maintenance is pending. For more information, refer to Applying Maintenance on Stopped/Disabled Instances.

Azure Postgres Learning Bytes 🎓

Set Up HA Health Status Monitoring Alerts

This section covers setting up HA health status monitoring alerts using the Azure portal. These alerts let you effectively monitor the HA health states of your server. To monitor the health of your high availability (HA) setup:

1. Navigate to the Azure portal and select your Azure Database for PostgreSQL flexible server instance.
2. Create an alert rule: go to Monitoring > Alerts > Create Alert Rule.
   - Scope: select your PostgreSQL flexible server.
   - Condition: choose the signal from the drop-down (CPU percentage, storage percentage, etc.).
   - Logic: define when the alert should trigger.
   - Action group: specify where the alert should be sent (email, webhook, etc.).
   - Add tags, then click “Review + Create”.
3. Verify the alert: check the Alerts tab in Azure Monitor to confirm the alert has been triggered.

For deeper insight into resource health:
1. Go to the Azure portal, search for Service Health, and select Resource Health.
2. Choose Azure Database for PostgreSQL Flexible Server from the dropdown.
3. Review the health status of your server.

For more information, check out the HA health status monitoring documentation guide.

Conclusion

That’s a wrap for our July 2025 feature updates! Thanks for being part of our journey to make Azure Database for PostgreSQL better with every release. We’re always working to improve, and your feedback helps us do that.

💬 Got ideas, questions, or suggestions? We’d love to hear from you: https://aka.ms/pgfeedback

📢 Want to stay on top of Azure Database for PostgreSQL updates? Follow us here for the latest announcements, feature releases, and best practices: Azure Database for PostgreSQL Blog

Stay tuned for more updates in our next blog!

Bringing Generative AI to Your Data: Semantic Operators in Azure Database for PostgreSQL
by: Arda Aytekin, PhD, Principal Software Engineer; Binnur Gorer, PhD, Senior Software Engineer; Serdar Mumcu, Principal Software Engineer

While vector search has been a crucial component of Generative AI (GenAI) applications and agents, it only scratches the surface of what is possible. Vector search often fails to capture the intricate semantic relationships within enterprise data, leading to a significant loss of valuable knowledge at the foundational level. This limitation restricts an agent’s ability to fully understand and reason about the data, as the semantic connections remain hidden and this loss propagates upward. However, with the increasing power, affordability, and accessibility of Large Language Models (LLMs), we now have the tools to uncover these semantic relationships and deliver more accurate results. Although Azure AI Foundry provides powerful APIs for working with these advanced AI models, integrating them into data-centric applications often requires orchestration layers and external services, which can add operational overhead.

To eliminate this friction, we are introducing Semantic Operators—is_true, extract, generate, and rank—within the azure_ai extension for Azure Database for PostgreSQL. These operators leverage the capabilities of LLMs directly from SQL workflows, enabling truth evaluation, high-level data extraction, intelligent text generation, and document reranking, respectively. This innovation empowers you to unlock deeper insights and elevate the performance of your AI applications.

Meet the Operators

We introduce four core semantic operators, each designed to address a distinct category of semantic analysis challenges. Their real power comes from the flexible input options and the variety of supported LLMs, enabling our users to explore the potential of LLMs for their specific use cases and challenges.

Truth Evaluation

azure_ai.is_true evaluates whether a natural language statement is likely true. It is ideal for filtering noisy or off-topic content. Below is a simple example of how to use the operator:

WITH statements(stmt) AS (
    VALUES
        ('Surface Pro is made with recycled content, including 100% recycled aluminum alloy in the enclosure of 12-inch and 13-inch devices.'),
        ('Surface Pro is a product of an EU-based startup.'),
        ('My laptop has a great battery life.')
)
SELECT stmt, azure_ai.is_true(stmt) AS truthy
FROM statements;

which gives the following result set (subject to the variability of LLM responses):

stmt | truthy
Surface Pro is made with recycled content, including 100% recycled aluminum alloy in the enclosure of 12-inch and 13-inch devices. | True
Surface Pro is a product of an EU-based startup. | False
My laptop has a great battery life. | NULL

As we can observe from the result set, azure_ai.is_true outputs NULL when the truthiness of the statement cannot be determined and issues a warning with a detailed message (intentionally left out for brevity).
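Because the operator can return NULL for undecidable statements, a minimal sketch (not from the original post) of using it defensively as a row filter is to coalesce the result so that undecided rows are treated as false:

WITH statements(stmt) AS (
    VALUES
        ('Surface Pro is made with recycled content, including 100% recycled aluminum alloy in the enclosure of 12-inch and 13-inch devices.'),
        ('My laptop has a great battery life.')
)
SELECT stmt
FROM statements
WHERE COALESCE(azure_ai.is_true(stmt), false);  -- rows whose truthiness comes back NULL are dropped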
High-Level Data Extraction

azure_ai.extract extracts structured high-level data (e.g., features, entities) from unstructured text. It is particularly useful for identifying named entities, actions, or other key elements within a text, making it easier to organize and analyze information. Below is a simple example from a review processing task:

WITH reviews(review) AS (
    VALUES
        ('The product has a great battery life.'),
        ('Noise cancellation does not work as advertised. Avoid this product.'),
        ('The product has a good design, but it is a bit heavy. Not recommended for travel.'),
        ('Music quality is good but call quality could have been better.')
)
SELECT review,
       azure_ai.extract(
           'Concisely extract features from the review, leave n/a or 0 when unsure: ' || review,
           ARRAY[
               'battery: string - battery life of the product',
               'design: string - design features of the product',
               'sound: string - sound quality (e.g., music, call, noise cancellation) of the product',
               'sentiment: number - sentiment score of the review; 1 (lowest) to 5 (highest)'
           ]
       ) AS data
FROM reviews;

This query yields the following result set (again, subject to the variability of LLM responses):

review | data
The product has a great battery life. | {"sound": "n/a", "design": "n/a", "battery": "great battery life", "sentiment": 5}
Noise cancellation does not work as advertised. Avoid this product. | {"sound": "Noise cancellation does not work as advertised.", "design": "n/a", "battery": "n/a", "sentiment": 1}
The product has a good design, but it is a bit heavy. Not recommended for travel. | {"sound": "n/a", "design": "good design, but a bit heavy", "battery": "n/a", "sentiment": 3}
Music quality is good but call quality could have been better. | {"sound": "Music quality is good, call quality could have been better.", "design": "n/a", "battery": "n/a", "sentiment": 3}

Intelligent Text Generation

azure_ai.generate generates freeform text using an LLM, based on the input and system prompts. It can be used to summarize, translate, rewrite, or respond to content, letting users create coherent and contextually appropriate text effortlessly. Its capabilities extend beyond these functions: users can craft different prompts and provide a JSON schema to instruct the LLM to generate structured information for their specific use cases. Revisiting the previous review processing task, we can get a one-sentence summary of the reviews by employing the freeform version of this operator:

WITH reviews(review) AS (
    VALUES
        ('The product has a great battery life.'),
        ('Noise cancellation does not work as advertised. Avoid this product.'),
        ('The product has a good design, but it is a bit heavy. Not recommended for travel.'),
        ('Music quality is good but call quality could have been better.')
)
SELECT azure_ai.generate(
           string_agg('Review: ' || review, '\n'),
           system_prompt => 'Summarize the reviews in a single sentence concisely.'
       ) AS summary
FROM reviews;

We get the following response from the LLM:

summary
The product has good battery life, design, and music quality, but falls short on noise cancellation, call quality, and portability.
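Because the operator accepts an arbitrary system prompt, the same call shape covers the other uses mentioned above. As a minimal sketch (using only the input text and system_prompt arguments shown in this post, with made-up review text), a per-row translation could look like this:

WITH reviews(review) AS (
    VALUES
        ('Die Akkulaufzeit ist hervorragend.'),
        ('La cancelación de ruido no funciona como se anuncia.')
)
SELECT review,
       azure_ai.generate(
           review,
           system_prompt => 'Translate the review into English, preserving its tone.'
       ) AS review_en
FROM reviews;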
Document Reranking

azure_ai.rank re-ranks a list of documents based on their relevance to a query. It empowers businesses to substantially improve the accuracy and relevance of information retrieved in search and retrieval-augmented generation (RAG) systems. In addition to supporting LLMs for reranking, the operator also accommodates cross-encoder models specifically designed for this function. To understand this operator’s power, let us revisit the previous review processing task and ask: “Which review best answers the customer query ‘clear calling capability that blocks out background noise’?” Here is an example SQL query with azure_ai.rank:

WITH reviews(id, review) AS (
    VALUES
        (1, 'The product has a great battery life.'),
        (2, 'Noise cancellation does not work as advertised. Avoid this product.'),
        (3, 'The product has a good design, but it is a bit heavy. Not recommended for travel.'),
        (4, 'Music quality is good but call quality could have been better.')
)
SELECT rank, id, review
FROM azure_ai.rank(
         'clear calling capability that blocks out background noise',
         ARRAY(SELECT review FROM reviews ORDER BY id ASC),
         ARRAY(SELECT id FROM reviews ORDER BY id ASC)
     ) rr
LEFT JOIN reviews r USING (id)
ORDER BY rank ASC;

The above query yields the following reranked reviews (a lower rank means higher relevance):

rank | id | review
1 | 2 | Noise cancellation does not work as advertised. Avoid this product.
2 | 4 | Music quality is good but call quality could have been better.
3 | 3 | The product has a good design, but it is a bit heavy. Not recommended for travel.
4 | 1 | The product has a great battery life.

Let us explore how these operators work in a real-world scenario.

Use Case: Processing Product Descriptions and Reviews

Imagine you're managing a platform that collects thousands of user reviews for various products daily. With such a vast amount of freeform, user-generated content, efficiently analyzing and surfacing the most meaningful insights becomes crucial. By leveraging automation and AI-driven techniques, you can streamline the review-processing workflow to improve searchability, relevance, and moderation efficiency.

For this use case, we will work with the Agentic Shop dataset, which contains product descriptions and the corresponding user reviews for three categories of products: headphones, smartwatches, and tablets. The product table has IDs, names, categories, and freeform text descriptions of the products, as well as the descriptions’ embeddings. Similarly, the review table has review IDs, product IDs, and freeform review text, as well as the reviews’ embeddings. Both tables have indexes on the embedding columns to support vector (similarity) search.

Step by step, we will perform the following operations:
1. Do semantic similarity search on product descriptions and reviews,
2. Improve the semantic search using reranking,
3. Filter out the false positives using azure_ai.is_true,
4. Extract high-level data to build a graph on top of products and reviews, and
5. Answer complex questions using the graph and semantic operators.

With the help of semantic operators, you can do all of this in SQL!
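For reference, the queries that follow assume a schema roughly like the one sketched below. This is an illustrative reconstruction based on the column names used in the post; the embedding dimension and the choice of index method are assumptions, not part of the original dataset definition:

-- assumes the vector (pgvector) and pg_diskann extensions are enabled
CREATE TABLE product (
    id              bigint PRIMARY KEY,
    name            text NOT NULL,
    category        text NOT NULL,
    description     text NOT NULL,
    description_emb vector(1536)          -- embedding of the description
);

CREATE TABLE review (
    id              bigint PRIMARY KEY,
    product_id      bigint REFERENCES product (id),
    review_text     text NOT NULL,
    review_text_emb vector(1536)          -- embedding of the review text
);

-- vector indexes to support similarity search (DiskANN shown; HNSW would also work)
CREATE INDEX product_desc_emb_idx ON product USING diskann (description_emb vector_cosine_ops);
CREATE INDEX review_text_emb_idx  ON review  USING diskann (review_text_emb vector_cosine_ops);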
Semantic Similarity Search

With the tables, embeddings, and indexes at hand, let us try to list the Top 10 headphones with “good clear calling capability and good reviews about battery life.” Using semantic similarity search alone, you can list the Top 10 related products first, and then get the Top 10 semantically most relevant reviews for them:

WITH potential_headphones AS (
    SELECT id AS product_id, name, description
    FROM product
    WHERE category = 'headphones'
    ORDER BY description_emb <=> azure_openai.create_embeddings(
        'text-embedding-3-small', 'good clear calling'
    )::vector ASC
    LIMIT 10
)
SELECT p.product_id, r.id AS review_id, p.name AS product_name, p.description, r.review_text
FROM potential_headphones p
LEFT JOIN review r USING (product_id)
ORDER BY review_text_emb <=> azure_openai.create_embeddings(
    'text-embedding-3-small', 'good battery life'
)::vector ASC
LIMIT 10;

The query yields the following result set (kept to the Top 5 results and redacted for brevity):

product_id | review_id | product_name | description | review_text
11 | 247 | [redacted] | [redacted] | [redacted]
11 | 244 | [redacted] | [redacted] | [redacted]
15 | 339 | [redacted] | [redacted] | [redacted]
10 | 220 | [redacted] | [redacted] | [redacted]
5 | 114 | [redacted] | [redacted] | [redacted]

When we inspect the result set carefully, we notice, for instance, that even though the product description for product_id=15 contains product features around “superior noise cancellation” and “high-quality sound performance,” it does not specifically mention “good calling capability.” This is expected, given the nature of semantic similarity search and our choice of first retrieving the top products and then fetching the topmost relevant reviews.
Reranking the Results

To alleviate the previous issue and improve the semantic similarity search results, we can use azure_ai.rank. Let us improve the previous query by fetching the Top 50 relevant reviews first, and then reranking the results with a cross-encoder:

WITH potential_headphones AS (
    SELECT id AS product_id, name, description
    FROM product
    WHERE category = 'headphones'
    ORDER BY description_emb <=> azure_openai.create_embeddings(
        'text-embedding-3-small', 'good clear calling'
    )::vector ASC
    LIMIT 10
), potential_headphones_reviews AS (
    SELECT p.product_id, r.id AS review_id, p.name AS product_name, p.description, r.review_text,
           ROW_NUMBER() OVER () AS row_id
    FROM potential_headphones p
    LEFT JOIN review r USING (product_id)
    ORDER BY review_text_emb <=> azure_openai.create_embeddings(
        'text-embedding-3-small', 'good battery life'
    )::vector ASC
    LIMIT 50
), reranked_results AS (
    SELECT id AS row_id, rank
    FROM azure_ai.rank(
        'good clear calling capability and good battery life',
        ARRAY(
            SELECT 'Product Description: ' || description || '. Review: ' || review_text
            FROM potential_headphones_reviews
            ORDER BY row_id ASC
        ),
        ARRAY(
            SELECT row_id
            FROM potential_headphones_reviews
            ORDER BY row_id ASC
        )
    )
)
SELECT phr.product_id, phr.review_id, phr.product_name, phr.description, phr.review_text
FROM potential_headphones_reviews phr
LEFT JOIN reranked_results rr USING (row_id)
ORDER BY rr.rank ASC
LIMIT 10;

Now, this improved query yields the following result set:

product_id | review_id | product_name | description | review_text
14 | 316 | [redacted] | [redacted] | [redacted]
14 | 319 | [redacted] | [redacted] | [redacted]
8 | 184 | [redacted] | [redacted] | [redacted]
19 | 440 | [redacted] | [redacted] | [redacted]
8 | 176 | [redacted] | [redacted] | [redacted]

As you can see, we have received different, and semantically more relevant, results than before. Specifically, we no longer see product_id=15 in the Top 5 reranked results. This time, however, when we inspect the results carefully, we notice, for example, that the review text for review_id=319 is as follows:

"I enjoy these headphones, but they aren't without their flaws. The audio clarity is splendid, and I do find them quite comfortable, even after extended use. However, the battery life is not as long as advertised; I often find myself charging them more frequently than I would like. Still, they do provide solid sound for an overall great listening experience, especially for the price."

The issue here is that the result set now contains reviews with better (semantic) relevance scores against the provided natural language query, but the model cannot differentiate negative reviews from positive ones.

Filtering out the False Positives

To get rid of the false positives that appear in the reranked reviews, we apply azure_ai.is_true as a filter (WHERE condition) on the reranked results:

WITH potential_headphones AS (
    -- as before
), potential_headphones_reviews AS (
    -- as before
), reranked_results AS (
    -- as before
)
SELECT phr.product_id, phr.review_id, phr.product_name, phr.description, phr.review_text
FROM potential_headphones_reviews phr
LEFT JOIN reranked_results rr USING (row_id)
WHERE azure_ai.is_true(
    'The following product description and review satisfies the following customer ask: '
    || 'good clear calling capability and good battery life'
    || '. Product Description: ' || phr.description
    || '. Review: ' || phr.review_text
)
ORDER BY rr.rank ASC
LIMIT 10;

Finally, we get the following result set, which excludes the false positives from the reranked results of the previous query:

product_id | review_id | product_name | description | review_text
14 | 316 | [redacted] | [redacted] | [redacted]
8 | 184 | [redacted] | [redacted] | [redacted]
19 | 440 | [redacted] | [redacted] | [redacted]
19 | 435 | [redacted] | [redacted] | [redacted]
8 | 186 | [redacted] | [redacted] | [redacted]

Extracting High-Level Data to Build a Graph

Extracting high-level data from product descriptions and reviews provides structured insights into customer experiences. Using azure_ai.extract, you can identify the key aspects of the product being discussed and the overall sentiment of the review by configuring the description of the data you want to extract. After obtaining the required high-level data in a structured format, you can build even more complex relationships on your datasets by using a graph representation of the data. To this end, you can leverage the Apache AGE extension in Azure Database for PostgreSQL.
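As a minimal sketch of what that looks like in practice (assuming the age extension has been allow-listed on your flexible server, and using shop_graph as the graph name from the final query below):

-- enable and load Apache AGE, then create an empty graph
CREATE EXTENSION IF NOT EXISTS age;
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

SELECT create_graph('shop_graph');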
For our use case, let us assume that the following key aspects of the products are important to our customers: design (e.g., dimensions, shape, weight, material), sound quality (e.g., sound/music quality, bass, treble), and water resistance. We therefore need to extract these high-level data from the product descriptions and store them as features of the product in, say, JSONB format. Similarly, we can extract the same set of features, if mentioned at all, from the reviews, together with the sentiments inferred from the review text by the LLM. Using azure_ai.extract, we can achieve this in two steps… one for the product descriptions:

SELECT id,
       azure_ai.extract(
           description,
           ARRAY[
               'design: string - comma separated list of design features of the product (e.g., dimensions, shape, weight, material); empty if not mentioned',
               'soundQuality: string - comma separated list of sound quality features of the product (e.g., sound/music quality, bass, treble); empty if not mentioned',
               'waterResistance: string - comma separated list of water resistance features of the product; empty if not mentioned'
           ]
       ) AS features
FROM product;

and another one for the reviews:

SELECT id,
       azure_ai.extract(
           review_text,
           ARRAY[
               'design: string - comma separated list of design features of the product (e.g., dimensions, shape, weight, material); empty if not mentioned',
               'designSentiment: string - overall sentiment of the design features of the product, e.g., positive, negative, neutral, or n/a if not applicable',
               'soundQuality: string - comma separated list of sound quality features of the product (e.g., sound/music quality, bass, treble); empty if not mentioned',
               'soundQualitySentiment: string - overall sentiment of the sound quality features of the product, e.g., positive, negative, neutral, or n/a if not applicable',
               'waterResistance: string - comma separated list of water resistance features of the product; empty if not mentioned',
               'waterResistanceSentiment: string - overall sentiment of the water resistance features of the product, e.g., positive, negative, neutral, or n/a if not applicable',
               'overallSentiment: string - overall sentiment of the review, e.g., positive, negative, neutral, or n/a if not applicable'
           ]
       ) AS features
FROM review;

The queries above extract the high-level data we need to populate a knowledge graph with the following relationship model.

Figure 1: Relationship diagram among products, their features, and the reviews that mention these features. Products have a "has-a" relationship with both features and reviews, and reviews have a "mentions-a" relationship with the features.

Here, product nodes can contain some convenient data in addition to the product IDs (e.g., name, category), feature nodes contain the feature names (e.g., design), and review nodes contain the review IDs. Similarly, the directed edges from product nodes to feature nodes carry feature data (e.g., data extracted from the product description for the design feature), edges from product nodes to review nodes carry the overall sentiment of the review, and edges from review nodes to feature nodes carry both the feature data (e.g., data extracted from the review for the design feature) and the corresponding feature sentiment (e.g., the sentiment extracted from the review specifically for the design feature, if mentioned at all). The graph representation depicted in Figure 1 can be implemented easily with the help of the Apache AGE extension in Azure Database for PostgreSQL.
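For illustration, here is a hedged sketch of how one product, one feature, and one review from the extracted data might be loaded into that graph with AGE's cypher() function. The node properties and sentiment values are made-up examples, not rows from the actual dataset:

SELECT *
FROM cypher('shop_graph', $$
    CREATE (p:product {id: 14, name: 'Example Headphones', category: 'headphones'}),
           (f:feature {name: 'soundQuality'}),
           (r:review {id: 316}),
           (p)-[:HAS_FEATURE {data: 'balanced sound, deep bass'}]->(f),
           (p)-[:HAS_REVIEW {sentiment: 'positive'}]->(r),
           (r)-[:MENTIONS_FEATURE {data: 'crystal clear calls', sentiment: 'positive'}]->(f)
$$) AS (result agtype);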
Answering Complex Questions

Finally, after building the graph representation of the relationships among products, reviews, and the features of interest, we are ready to answer complex questions in SQL. Imagine we would like to answer the following challenging question: “What are the Top 3 lightweight and waterproof headphones with many good reviews about music quality?” To answer it, we can start with the following graph query:

SELECT *
FROM cypher('shop_graph', $$
    MATCH (p:product {category: 'headphones'})-[:HAS_FEATURE]->(:feature {name: 'design'})
    WITH p
    MATCH (p)-[:HAS_FEATURE]->(:feature {name: 'waterResistance'})
    WITH p
    MATCH (p)-[:HAS_REVIEW]->(r:review)-[:MENTIONS_FEATURE {sentiment: 'positive'}]->(:feature {name: 'soundQuality'})
    RETURN p.id, r.id
$$) AS result(product_id integer, review_id integer);

This query lists the ID pairs of all products in the headphones category that have design and water resistance features specified, together with the reviews that have positive sentiment about the specific product’s sound quality features. However, our featurization process in the previous section implies that the design feature of a product specification contains not only the weight of the product but also other design elements such as dimensions, shape, and material. We therefore need to join the graph query with a common table expression that filters out product descriptions that do not explicitly mention the weight of the product. Finally, let us also summarize the design and water resistance features of the product, as well as all the review elements concerning the product's sound quality. The final query looks like this:

WITH graph_query AS (
    SELECT *
    FROM cypher('shop_graph', $$
        MATCH (p:product {category: 'headphones'})-[:HAS_FEATURE]->(:feature {name: 'design'})
        WITH p
        MATCH (p)-[:HAS_FEATURE]->(:feature {name: 'waterResistance'})
        WITH p
        MATCH (p)-[:HAS_REVIEW]->(r:review)-[:MENTIONS_FEATURE {sentiment: 'positive'}]->(:feature {name: 'soundQuality'})
        RETURN p.id, r.id
    $$) AS result(product_id integer, review_id integer)
), detailed_data AS (
    SELECT g.product_id, g.review_id, p.name,
           p.features->>'design' AS design,
           p.features->>'waterResistance' AS water_resistance,
           r.features->>'soundQuality' AS sound_quality
    FROM graph_query g
    JOIN product p ON g.product_id = p.id
    JOIN review r ON g.review_id = r.id
), filtered_products AS (
    SELECT DISTINCT ON (product_id) product_id
    FROM detailed_data
    WHERE azure_ai.is_true('Product spec specifically mentions lightweight design: ' || design)
)
SELECT f.product_id AS id,
       d.name,
       azure_ai.generate(
           'Design features: ' || d.design || '. Water resistance features: ' || d.water_resistance,
           system_prompt => 'Summarize the features of the product concisely.'
       ) AS product_summary,
       azure_ai.generate(
           string_agg('Review: ' || d.sound_quality, ','),
           system_prompt => 'Summarize the reviews concisely.'
       ) AS review_summary,
       COUNT(d.review_id) AS review_count
FROM filtered_products f
JOIN detailed_data d ON f.product_id = d.product_id
GROUP BY f.product_id, d.name, d.design, d.water_resistance
ORDER BY review_count DESC
LIMIT 3;

We get the following result set:

id | name | product_summary | review_summary | review_count
6 | [redacted] | [redacted] | [redacted] | 16
8 | [redacted] | [redacted] | [redacted] | 13
7 | [redacted] | [redacted] | [redacted] | 12

There we have it – the Top 3 products with the desired features and the most positive reviews about sound quality! Let us examine the top match:
- ID: 6
- Name: Nexvo Wireless Earbuds – 48H Playback, LED Display, Earhooks, Mic
- Product summary: The product features a sport-focused design with three pairs of ear caps and flexible earhooks for a secure fit during outdoor activities. It includes a charging case with a dual LED display and offers water resistance suitable for use in rainy conditions.
- Review summary: The reviews consistently praise the product's sound quality, highlighting it as crystal clear, rich, full, and immersive with good bass response and outstanding clarity.
- Review count: 16

Final Thoughts

Semantic operators in Azure PostgreSQL are more than just wrappers around LLMs—they represent a paradigm shift in how we think about AI in data systems. By embedding generative AI directly into SQL, we are not only simplifying the integration process but also empowering developers to build smarter, faster, and more maintainable applications. This approach allows for seamless interaction with AI capabilities, enabling developers to leverage the full potential of LLMs to solve complex problems, enhance data analysis, enrich available data with high-level features, and create more intuitive user experiences. The flexibility and power of these operators open new possibilities for innovation, making it easier to develop cutting-edge applications that can adapt to evolving needs and challenges.

Innovating with PostgreSQL @Build
At this year's Microsoft Build, we're excited to share the latest updates and innovations in Microsoft Azure Database for PostgreSQL. Whether you're building AI-powered apps and agents or just looking to uplevel your PostgreSQL experience, we've got sessions packed with insights and tools tailored for developers and technical leaders. As a fully managed, AI-ready open-source relational database that offers 58% cost savings over an on-premises PostgreSQL database, Azure Database for PostgreSQL enhances your security, scalability, and management of enterprise workloads. Check out what’s happening with Postgres at Build — in Seattle and online:

🔍 Breakout Sessions

BRK211: Building Advanced Agentic Apps with PostgreSQL on Azure
What benefits do agentic architectures bring compared to traditional RAG patterns? Find out how we answer this question by exploring advanced agentic capabilities offered by popular GenAI frameworks (LangChain, LlamaIndex, Semantic Kernel) and how they transform RAG applications built on Azure Database for PostgreSQL. Learn how to further improve agentic apps by integrating advanced RAG techniques, making vector search faster with the DiskANN vector search algorithm, and more accurate with Semantic Ranking and GraphRAG.

BRK204: What’s New in Microsoft Databases: Empowering AI-Driven App Dev
Explore advanced applications powered by Microsoft databases on-premises, on Azure, and in Microsoft Fabric. Uncover innovative approaches to scalability and learn about intelligent data processing with AI-driven insights and agentic integrations. See new features with engaging demos across all databases, including Azure Database for PostgreSQL.

💻 Demo Session

DEM564: Boost Your Development Workflows with PostgreSQL
Discover how to transform your development workflow on PostgreSQL. Whether you're building AI-powered apps, managing complex datasets, or just looking to streamline your PostgreSQL experience, this demo will show you how to level up your productivity with PostgreSQL on Azure.

🧪 Hands-On Labs

LAB360: Build an Agentic App with PostgreSQL, GraphRAG, and Semantic Kernel
Sign up to get hands-on experience building an agent-driven, RAG-based application with Azure Database for PostgreSQL and VS Code. Explore coding and architectural concepts while using the DiskANN index for vector search and integrating Apache AGE for PostgreSQL to extend into a GraphRAG pattern leveraging the Semantic Kernel Agent Framework.

💬 Meet the Experts

Have questions? Looking to talk open source, AI agentic apps, or migration? Visit us in the Expert Meetup Zone to connect with the Postgres product teams, engineers, and architects.
🔎 How to find it: log into the Microsoft Build 2025 website or use the official event mobile app to view the venue map and session schedule.
📍 To find the Expert Meetup zone, check the official MS Build Event Guide for venue maps and other logistical information.

🐘 Get Started with Azure Database for PostgreSQL

Want to try it out firsthand?
🚀 Start building
📘 Explore the documentation

Let's connect, code, and grow together at Build 2025!

PostgreSQL - Flexible Compute - VNet DNS Update
Hello,

I have a customer planning to update the VNet DNS from custom to Default. The client is running a PostgreSQL flexible server with VNet integration. Does the DNS change impact the PostgreSQL flexible server compute? What is the best way to update the PostgreSQL flexible server compute without having to restart?

Thanks,
Terru

The History of Microsoft Azure
Learn about the history of Microsoft Azure, a leading giant in the cloud service industry that offers rich services for platform as a service (PaaS), software as a service (SaaS), and infrastructure as a service (IaaS). This will take us back to how this powerful and sophisticated service began, revealing the resilience and vision of Microsoft as a brand, its present stages, and how to partake of the cake Microsoft has provided for businesses and developers.

DP-900: Microsoft Azure Data Fundamentals Study Guide
Microsoft Azure provides an array of services that enable businesses and organizations to undergo digital transformation by making quick and informed decisions. The DP-900 Microsoft Azure Data Fundamentals exam evaluates learners' understanding of data concepts such as relational, non-relational, big data, and analytics. The exam requires learners to demonstrate their knowledge of core data concepts, relational and non-relational data, and Azure data services. Microsoft offers resources such as the Microsoft Learn self-paced curriculum, an instructor-led course, and related documentation to help students prepare for the exam. Understanding Azure data principles is vital for more advanced Azure certifications like Azure Database Administrator Associate and Azure Data Engineer Associate.