azure ai

10 Topics

Build Enterprise-Ready AI Agents with the New Azure Postgres LangChain + LangGraph Connector
AI agents are only as powerful as the data layer behind them. That’s why we’re excited to announce a native LangChain + LangGraph connector for Azure Database for PostgreSQL. With this release, Postgres becomes your single source of truth for AI agents, handling knowledge retrieval, chat history, and long-term memory all in one place.

This new connector is packed with everything you need to build secure, scalable, and enterprise-ready AI agents on Azure without the complexity. With Entra ID authentication, DiskANN acceleration, a vector store, and a dedicated agent store, you can go from prototype to production on Azure faster than ever.

You can get started with the LangChain + LangGraph connector today:

    pip install langchain-azure-postgresql

In this post, we’ll cover:

- How the Azure Postgres connector for LangGraph can serve as the single persistence + retrieval layer for an AI agent
- The new first-class connector for LangChain + LangGraph
- A practical example to help you get started

Azure PostgreSQL as the single persistence + retrieval layer for an AI agent

When building AI agents today, developers face a fragmented stack:

- Vector storage and search require a library, service, or separate database.
- Chat history and short-term memory need yet another data source.
- Long-term memory often means bolting on yet another system.

This sprawl leads to complex integrations, higher costs, and weaker security, making it hard to scale AI agents reliably.

The Solution

The new Azure Postgres connector for LangChain + LangGraph turns your Azure Postgres database into the single persistence + retrieval layer for AI agents. Instead of working on a fragmented stack, developers can now:

- Run embeddings + semantic search with built-in DiskANN acceleration in the same database that powers their application logic.
- Persist chat history and short-term memory, and keep agent conversations grounded via seamless context retrieval from data stored in Postgres.
- Capture, retrieve, and evolve knowledge over time with built-in long-term memory, without bolting on external systems.

All in one database: simplified, secure, and enterprise ready. Postgres becomes the persistence and retrieval data layer for your AI agent.

Built for Enterprise Readiness: LangChain + LangGraph Connector

This release unlocks several new capabilities that make it easy to build robust, production-ready agents:

- Auth with Entra ID: Enterprise-grade identity to securely connect LangChain + LangGraph workflows to Azure Database for PostgreSQL within a centrally managed, identity-based security perimeter.
- DiskANN & Extensions: First-class support for faster vector search using pgvector combined with DiskANN indexing, enabling support for high-dimensional vectors and cost-efficient search. Additionally, helper functions ensure your favorite extensions are installed.
- Native Vector Store: Store and query embeddings, enabling semantic search and Retrieval-Augmented Generation (RAG) scenarios.
- Dedicated Agent Store: Persist agent state, memory, and chat history with structured access patterns, perfect for multi-turn conversations and long-term context.

Together, these features give developers a turnkey persistence solution for building reliable AI agents without stitching together multiple storage systems.

Using LangGraph on Azure Database for PostgreSQL

Using LangGraph with Azure Database for PostgreSQL is easy.

- Enable the vector and pg_diskann extensions: Allowlist the vector and pg_diskann extensions within your server configuration.
- Install the LangChain + LangGraph connector:

    pip install langchain-azure-postgresql
    pip install -qU langchain-openai
    pip install -qU azure-identity

- Log in to Azure with your Entra ID: Run az login in the same terminal where you will run the LangGraph code.

    az login

To get started, set up a production-ready vector store for your agent in a few lines of code.

    import os

    from langchain_openai import AzureOpenAIEmbeddings
    # Connector import paths as we understand the package layout; check the connector docs.
    from langchain_azure_postgresql.common import AzurePGConnectionPool, ConnectionInfo
    from langchain_azure_postgresql.langchain import AzurePGVectorStore

    # 1. Auth: Securely connect to Azure Postgres
    connection_pool = AzurePGConnectionPool(azure_conn_info=ConnectionInfo(host=os.environ["PGHOST"]))

    # 2. Create embeddings
    embeddings = AzureOpenAIEmbeddings(model="text-embedding-3-small")

    # 3. Initialize a vector store in Postgres with DiskANN
    vector_store = AzurePGVectorStore(connection=connection_pool, embedding=embeddings)

Use LangGraph to build a sample agent. Here’s a practical example that combines vector search and a checkpointer inside Postgres:

    from langgraph.checkpoint.postgres import PostgresSaver
    from langgraph.prebuilt import create_react_agent

    # 4. Define the tool for data retrieval.
    def get_data_from_vector_store(query: str) -> str:
        """Get data from the vector store."""
        results = vector_store.similarity_search(query)
        # Return the matched text so the tool output matches the declared str type.
        return "\n\n".join(doc.page_content for doc in results)

    # 5. Define the agent and checkpointer.
    # `model` is any LangChain chat model, defined elsewhere.
    with connection_pool.getconn() as conn:
        agent = create_react_agent(
            model=model,
            tools=[get_data_from_vector_store],
            checkpointer=PostgresSaver(conn),
        )

        # 6. Run the agent and print results (while the connection is still open).
        config = {"configurable": {"thread_id": "1", "user_id": "1"}}
        response = agent.invoke(
            {"messages": [{"role": "user", "content": "What does my database say about cats? Make sure you address me with my name"}]},
            config,
        )

        for msg in response["messages"][-2:]:
            msg.pretty_print()

With just a few lines of code, you can:

- Use the vector store backed by Postgres
- Enable DiskANN for semantic search
- Use checkpointers for short-term conversation history

Learn More

This is just the beginning. With native LangChain + LangGraph support in Azure PostgreSQL, developers can now rely on a single, secure, high-performance data layer for building the next generation of AI agents.

👉 Ready to start? All the code is available in the Azure Postgres Agents Demo GitHub repository. See how easy it is to bring your AI agent to life on Azure.

👉 Check out the docs for more details on the LangChain + LangGraph connector.
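To see why the thread_id in the agent config matters, here is a toy, in-memory sketch of what a checkpointer does conceptually: it keys saved conversation state by thread, so a later call with the same thread_id picks up the earlier messages. The class and method names (InMemoryCheckpointStore, append, history) are hypothetical and are not part of LangGraph or the connector; PostgresSaver persists much richer state in Postgres tables.

```python
from collections import defaultdict


class InMemoryCheckpointStore:
    """Toy stand-in for a checkpointer: message history keyed by thread_id."""

    def __init__(self) -> None:
        self._threads: dict[str, list[dict[str, str]]] = defaultdict(list)

    def append(self, thread_id: str, role: str, content: str) -> None:
        # Save one message under the given conversation thread.
        self._threads[thread_id].append({"role": role, "content": content})

    def history(self, thread_id: str) -> list[dict[str, str]]:
        # Return a copy so callers cannot mutate the stored state.
        return list(self._threads[thread_id])


store = InMemoryCheckpointStore()
store.append("1", "user", "My name is Ada.")
store.append("1", "assistant", "Nice to meet you, Ada.")
store.append("2", "user", "What does my database say about cats?")

thread_1 = store.history("1")  # two messages: the agent can recall the name
thread_2 = store.history("2")  # a separate conversation with no shared context
```

Under this model, reusing thread_id "1" on the next invoke is what lets the agent address the user by name, while a new thread_id starts from an empty history.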
abeomor-msft
Oct 21, 2025 Place Microsoft Blog for PostgreSQL
March 2024 Recap: Azure Database for PostgreSQL Flexible Server
March 2024 Feature Recap: Azure PostgreSQL Flexible Server - New Features and Enhancements
varun-dhawan
Aug 28, 2025 Place Microsoft Blog for PostgreSQL
May 2024 Recap: Azure Database for PostgreSQL Flexible Server
By Varun Dhawan, Principal PM. May 2024 Feature Recap: Azure PostgreSQL Flexible Server - New Features and Enhancements
varun-dhawan
Aug 28, 2025 Place Microsoft Blog for PostgreSQL
UBS unlocks advanced AI techniques with PostgreSQL on Azure
This blog was authored by Jay Yang, Executive Director, and Orhun Oezbek, GenAI Architect, UBS RiskLab.

UBS Group AG is a multinational investment bank and world-leading asset manager that manages $5.7 trillion in assets across 15 different markets. We continue to evolve our tools to suit the needs of data scientists and to integrate the use of AI. Our UBS RiskLab data science platform helps over 1,200 UBS data scientists expedite development and deployment of their analytics and AI solutions, which support functions such as risk, compliance, and finance, as well as front-office divisions such as investment banking and wealth management.

RiskLab and UBS GOTO (Group Operations and Technology Office) have a long-term AI strategy to provide a scalable and easy-to-use AI platform. This strategy aims to remove friction and pain points for users, such as developers and data scientists, by introducing DevOps automation, centralized governance, and AI service simplification. These efforts have significantly democratized AI development for our business users.

This blog walks through how we created two RiskLab products using Azure services. We also explain how we’re using Azure Database for PostgreSQL to power advanced Retrieval-Augmented Generation (RAG) techniques (such as new vector search algorithms, parameter tuning, hybrid search, semantic ranking, and a GraphRAG approach) to further the work of our financial generative AI use cases.

The RiskLab AI Common Ecosystem (AICE) provides fully governed and simplified generative AI platform services, including:

- Governed production data access for AI development
- Managed large language model (LLM) endpoints access control
- Tenanted RAG environments
- Enhanced document insight AI processing
- Streamlined AI agent standardization, development, registration, and deployment solutions
- End-to-end machine learning (ML) model continuous integration, training, deployment, and monitoring processes

The AICE Vector Embedding Governance Application (VEGA) is a fully governed and multi-tenant vector store built on top of Azure Database for PostgreSQL that provides self-service vector store lifecycle management and advanced indexing and retrieval techniques for financial RAG use cases.

A focus on best practices like AIOps and MLOps

As generative AI gained traction in 2023, we noticed the need for a platform that simplified the process for our data scientists to build, test, and deploy generative AI applications. In this age of AI, the focus should be on data science best practices: GenAIOps and MLOps. Most of our data scientists aren’t fully trained on MLOps, GenAIOps, and setting up complex pipelines, so AICE was designed to provide automated, self-serve DevOps provisioning of the Azure resources they need, as well as simplified MLOps and AIOps pipeline libraries. This removes operational complexities from their workflows.

The second reason for AICE was to make sure our data scientists were working in fully governed environments that comply with data privacy regulations from the multiple countries in which UBS operates. To meet that need, AICE provides a set of generative AI libraries that fully manages data governance and reduces complexity.

Overall, AICE greatly simplifies the work for our data scientists. For instance, the platform provides managed Azure LLM endpoints, MLflow for generative AI evaluation, and AI agent deployment pipelines along with their corresponding Python libraries. Without going into the nitty gritty of setting up a new Azure subscription, managing MLflow instances, and navigating Azure Kubernetes Service (AKS) deployments, data scientists can just write three lines of code to obtain a fully governed and secure generative AI ecosystem to manage their entire application lifecycle.

And, because it is a governed, secure lab environment, they can also develop and prototype ML models and generative AI applications in the production tier. We found that providing production read-only datasets to build these models significantly expedites our AI development. In fact, the process for developing an ML model, building a pipeline for model training, and putting it into production has dropped from six months to just one month.

Azure Database for PostgreSQL and pgvector: The best of both worlds for relational and vector databases

Once AICE adoption ramped up, our next step was to develop a comprehensive, flexible vector store that would simplify vector store resource provisioning while supporting hundreds of RAG use cases and tenants across both lab and production environments. Essentially, we needed to create RAG as a Service (RaaS) so our data scientists could build custom AI solutions in a self-service manner.

When we started building VEGA and this vector store, we anticipated that effective RAG would require a diverse range of search capabilities covering not only vector searches but also more traditional document searches or even relational queries. Therefore, we needed a database that could pivot easily. We were looking for a really flexible relational database and decided on Azure Database for PostgreSQL.

For a while, Azure Database for PostgreSQL has been our go-to database at RiskLab for our structured data use cases because it’s like the Swiss Army Knife of databases. It’s very compact and flexible, and we have all the tools we need in a single package. Azure Database for PostgreSQL offers excellent relational queries and JSONB document search. By using it in conjunction with the pgvector extension for vector search, we created some very powerful hybrid search and hierarchical search RAG functionalities for our end users.

The relational nature of Azure Database for PostgreSQL also allowed us to build a highly regulated authorization and authentication mechanism that makes it easy and secure for data scientists to share their embeddings. This involved meeting very stringent access control policies so that users’ access to vector stores is on a need-to-know basis. Integrations with the Azure Graph API help us manage those identities and ensure that the environment is fully secure. Using VEGA, data scientists can just click a button to add a user or group and provide access to all their embeddings/documents. It’s very easy, but it’s also governed and highly regulated.

Speeding vector store initialization from days to seconds

With VEGA, the time it takes to provision a vector store has dropped from days to less than 30 seconds. Instead of waiting days on a request for new instances of Azure Database for PostgreSQL, pgvector, and Azure AI Search, data scientists can now simply write five lines of code to stand up virtual, fully governed, and secure collections. And the same is true for agentic deployment frameworks. This speed is critical for lab work that involves fast iterations and experiments. And because we built on Azure Database for PostgreSQL, a single instance of VEGA can support thousands of vector stores. It’s cost-effective and scales seamlessly.

Creating a hybrid search to analyze thousands of documents

Since launching VEGA, one of the top hybrid search use cases has been Augmented Indexing Search (AIR Search), allowing data scientists to comb through financial documents and pinpoint the correct sections and text. This search uses LLMs as agents that first filter based on metadata stored in JSONB columns in Azure Database for PostgreSQL, then apply vector similarity retrieval. Our thousands of well-structured financial documents are built with hierarchical headers that act as metadata, providing a filtering mechanism for agents and allowing them to retrieve sections in our documents to find precisely what they’re looking for.

Because these agents are autonomous, they can decide on the best tools to use for the situation: either metadata filtering or vector similarity search. As a hybrid search, this approach also minimizes AI hallucinations because it gives the agents more context to work with.

To enable this search, we used ChatGPT and Azure OpenAI. But because most of our financial documents are saved as PDFs, the challenge was retaining hierarchical information from headers that were lost when simply dumping in text from PDFs. We also had to determine how to make sure ChatGPT understood the meaning behind aspects like tables and figures. As a solution, we created PNG images of PDF pages and told ChatGPT to semantically chunk documents by titles and headers. And if it came across a table, we asked it to provide a YAML or JSON representation of it. We also asked ChatGPT to interpret figures to extract information, which is an important step because many of our documents contain financial graphs and charts. We’re now using Azure AI Document Intelligence for layout detection and section detection as the first step, which simplified our document ingestion pipelines significantly.
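The filter-then-rank pattern described for AIR Search can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the VEGA implementation: the chunk layout, the hybrid_search function, and the sample embeddings are all hypothetical, and in the real system the metadata filter would run against JSONB columns and the similarity ranking would be done by pgvector inside Postgres.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(chunks, query_embedding, metadata_filter, top_k=2):
    """Step 1: keep chunks whose metadata matches. Step 2: rank by similarity."""
    candidates = [
        c for c in chunks
        if all(c["metadata"].get(k) == v for k, v in metadata_filter.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda c: cosine_similarity(c["embedding"], query_embedding),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical document chunks; header hierarchy is stored as metadata.
chunks = [
    {"id": "a", "metadata": {"doc": "annual-report", "section": "Risk"}, "embedding": [0.9, 0.1, 0.0]},
    {"id": "b", "metadata": {"doc": "annual-report", "section": "Risk"}, "embedding": [0.2, 0.8, 0.1]},
    {"id": "c", "metadata": {"doc": "annual-report", "section": "Outlook"}, "embedding": [1.0, 0.0, 0.0]},
]

hits = hybrid_search(chunks, [1.0, 0.0, 0.0], {"section": "Risk"})
hit_ids = [h["id"] for h in hits]
```

Chunk "c" is the closest vector to the query but is excluded by the section filter, which is the point of filtering on document structure before ranking by similarity.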
Forecasting economic implications with PostgreSQL Graph Extension

Since creating AICE and VEGA using Azure services, we’ve significantly enhanced our data science workflows. We’ve made it faster and easier to develop generative AI applications thanks to the speed and flexibility of Azure Database for PostgreSQL. Making advanced AI features accessible to our data scientists has accelerated innovation in RiskLab and ultimately allowed UBS to deliver exceptional value to our customers.

Looking ahead, we plan to use the Apache AGE graph extension in Azure Database for PostgreSQL for macroeconomics knowledge retention capabilities. Specifically, we’re considering Azure tooling such as GraphRAG to equip UBS economists and portfolio managers with advanced RAG capabilities. This will allow them to retrieve more coherent RAG search results for use cases such as economics scenario generation and impact analysis, as well as investment forecasting and decision-making.

For instance, a UBS business user will be able to ask an AI agent: if a country’s interest rate increases by a certain percentage, what are the implications for my client’s investment portfolio? The agent can perform a graph search to obtain all other connected economic entity nodes that might be affected by the interest rate entity node in the graph. We anticipate that AI-assisted graph knowledge will gain significant traction in the financial industry.

Learn more

For a deeper dive on how we created AICE and VEGA, check out this on-demand session from Ignite. We talk through our use of Azure Database for PostgreSQL and pgvector, plus we show a demo of our GraphRAG capabilities.

About Azure Database for PostgreSQL

Azure Database for PostgreSQL is a fully managed, scalable, and secure relational database service that supports open-source PostgreSQL. It enables organizations to build and manage mission-critical applications with high availability, built-in security, and automated maintenance.
maxluk
Aug 21, 2025 Place Microsoft Blog for PostgreSQL
Fueling the Agentic Web Revolution with NLWeb and PostgreSQL
We’re excited to announce that NLWeb (Natural Language Web), Microsoft’s open project for natural language interfaces on websites, now supports PostgreSQL. With this enhancement, developers can use PostgreSQL and NLWeb to transform any website into an AI-powered application or Model Context Protocol (MCP) server. This integration allows organizations to use a familiar, robust database as the foundation for conversational AI experiences, streamlining deployment and maximizing data security and scalability.

Soon, autonomous agents, not just human users, will consume and interpret website content, transforming how information is accessed and used online. During Microsoft //Build 2025, Microsoft introduced the era of the open agentic web, a new paradigm in which autonomous agents seamlessly interact across individual, organizational, team, and end-to-end business contexts. To realize the future of an open agentic web, Microsoft announced the NLWeb project. NLWeb transforms any website into an AI-powered application with just a few lines of code, by connecting it to an AI model and a knowledge base.

In this post, we’ll cover:

- What NLWeb is and how it works with vector databases
- How pgvector enables vector similarity search in PostgreSQL for NLWeb
- How to get started using NLWeb with Postgres

Let’s dive in and see how Postgres + NLWeb can redefine conversational web interfaces while keeping your data in a familiar, powerful database.

What is NLWeb? A Quick Overview of Conversational Web Interfaces

NLWeb is an open project developed by Microsoft to simplify adding conversational AI interfaces to websites.

How NLWeb works under the hood:

- Processes existing website content that is already published in semi-structured formats such as Schema.org and RSS
- Embeds and indexes all the content in a vector store (e.g., PostgreSQL with pgvector)
- Routes user queries through several processes that handle natural language understanding, reranking, and retrieval
- Answers queries with an LLM

The result is a high-quality natural language interface on top of web data, giving developers the ability to let users “talk to” web data. By default, every NLWeb instance is also a Model Context Protocol (MCP) server, allowing websites to make their content discoverable and accessible to agents and other participants in the MCP ecosystem if they choose.

Importantly, NLWeb is platform-agnostic and supports many major operating systems, AI models, and vector stores. The NLWeb project is also modular by design, so developers can bring their own retrieval system and model APIs and define their own extensions.

NLWeb with PostgreSQL

PostgreSQL is now embedded into the NLWeb reference stack as a native retriever, creating a scalable and flexible path for deploying NLWeb instances using open-source infrastructure.

Retrieval Powered by pgvector

NLWeb uses pgvector, a PostgreSQL extension for efficient vector similarity search, to handle natural language retrieval at scale. By integrating pgvector into the NLWeb stack, teams can eliminate the need for external vector databases. Web data stored in PostgreSQL becomes immediately searchable and usable for NLWeb experiences, streamlining infrastructure and enhancing security.

PostgreSQL’s robust governance features and wide adoption align with NLWeb’s mission to enable conversational AI for any website or content platform. With pgvector retrieval built in, developers can confidently launch NLWeb instances on their own databases, with no additional infrastructure required.
Implementation example

We are going to use NLWeb and Postgres to create a conversational AI app and MCP server that will let us chat with content from the Talking Postgres with Claire Giordano podcast!

Prerequisites

- An active Azure account.
- Enable and configure the pgvector extension.
- Create an Azure AI Foundry project.
- Deploy models gpt-4.1, gpt-4.1-mini, and text-embedding-3-small.
- Install Visual Studio Code.
- Install the Python extension.
- Install Python 3.11.x.
- Install the Azure CLI (latest version).

Getting started

All the code and sample datasets are available in this GitHub repository.

Step 1: Set up the NLWeb Server

1. Clone or download the code from the repo.

    git clone https://github.com/microsoft/NLWeb
    cd NLWeb

2. Open a terminal to create a virtual Python environment and activate it.

    python -m venv myenv
    source myenv/bin/activate  # Or on Windows: myenv\Scripts\activate

3. Go to the 'code/python' folder in NLWeb to install the dependencies.

    cd code/python
    pip install -r requirements.txt

4. Go to the project root folder in NLWeb and copy the .env.template file to a new .env file.

    cd ../../
    cp .env.template .env

5. In the .env file, update the API key you will use for your LLM endpoint of choice and update the Postgres connection string. For example:

    AZURE_OPENAI_ENDPOINT="https://TODO.openai.azure.com/"
    AZURE_OPENAI_API_KEY="<TODO>"

    # If using Postgres connection string
    POSTGRES_CONNECTION_STRING="postgresql://<HOST>:<PORT>/<DATABASE>?user=<USERNAME>&sslmode=require"
    POSTGRES_PASSWORD="<PASSWORD>"

6. Update your config files (located in the config folder) to make sure your preferred providers match your .env file. There are three files that may need changes.

- config_llm.yaml: Update the first line to the LLM provider you set in the .env file. By default it is Azure OpenAI. You can also adjust the models you call here by updating the models noted. By default, we are assuming 4.1 and 4.1-mini.
- config_embedding.yaml: Update the first line to your preferred embedding provider. By default it is Azure OpenAI, using text-embedding-3-small.
- config_retrieval.yaml: Update the first line to postgres. Set write_endpoint to postgres, and set enabled to 'true' for the postgres retrieval endpoint in the list of possible endpoints.

Step 2: Initialize Postgres Server

Go to the 'code/python/misc' folder in NLWeb to run the Postgres initializer.

NOTE: If you are using Azure Postgres Flexible Server, make sure you have the `vector` extension allow-listed and make sure the database has the vector extension enabled.

    cd code/python/misc
    python postgres_load.py

Step 3: Ingest Data from the Talking Postgres Podcast

Now we will load some data into our vector database to test with, using the RSS feed listed below. Go to the 'code/python' folder in NLWeb and run the command. The format of the command is as follows (make sure you are still in the 'python' folder when you run this):

    python -m data_loading.db_load <RSS URL> <site-name>

Talking Postgres with Claire Giordano Podcast:

    python -m data_loading.db_load https://feeds.transistor.fm/talkingpostgres Talking-Postgres

(Optional) Check the documents table in your Postgres database to verify that all the data from the feed was uploaded.

Test NLWeb Server

Start your NLWeb server (again from the 'python' folder):

    python app-file.py

Go to http://localhost:8000/

Start asking questions about the Talking Postgres with Claire Giordano podcast. You can try different modes.

- List mode. Sample prompt: “I want to listen to something that talks about the advances in vector search such as DiskANN”
- Generate mode. Sample prompt: “What did Shireesh Thota say about the future of Postgres?”

Running NLWeb with MCP

1. If you do not already have it, install MCP in your venv:

    pip install mcp

2. Next, configure your Claude MCP server. If you don’t have the config file already, you can create the file at the following locations:

- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json

The default MCP JSON file needs to be modified as shown below.

macOS Example Configuration

    {
      "mcpServers": {
        "ask_nlw": {
          "command": "/Users/yourname/NLWeb/myenv/bin/python",
          "args": [
            "/Users/yourname/NLWeb/code/chatbot_interface.py",
            "--server",
            "http://localhost:8000",
            "--endpoint",
            "/mcp"
          ],
          "cwd": "/Users/yourname/NLWeb/code"
        }
      }
    }

Windows Example Configuration

    {
      "mcpServers": {
        "ask_nlw": {
          "command": "C:\\Users\\yourusername\\NLWeb\\myenv\\Scripts\\python",
          "args": [
            "C:\\Users\\yourusername\\NLWeb\\code\\chatbot_interface.py",
            "--server",
            "http://localhost:8000",
            "--endpoint",
            "/mcp"
          ],
          "cwd": "C:\\Users\\yourusername\\NLWeb\\code"
        }
      }
    }

Note: For Windows paths, you need to use double backslashes (\\) to escape the backslash character in JSON.

3. Go to the 'code/python' folder in NLWeb, enter your virtual environment, and start your NLWeb local server. Make sure it is configured to access the data you would like to ask about from Claude.

    # On macOS
    source ../myenv/bin/activate
    python app-file.py

    # On Windows
    ..\myenv\Scripts\activate
    python app-file.py

4. Open Claude Desktop. It should ask you to trust the 'ask_nlw' external connection if it is configured correctly. After you click yes and the welcome page appears, you should see 'ask_nlw' in the bottom right '+' options. Select it to start a query.

5. To query NLWeb, just type 'ask_nlw' in your prompt to Claude. You'll notice that you also get the full JSON script for your results. Remember, you must have your local NLWeb server started to use this option.

Learn More

- Vector Store in Azure Postgres Flexible Server
- Generative AI in Azure Postgres Flexible Server
- NLWeb GitHub repo, which includes a reference server for handling natural language queries and the pgvector integration
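Hand-editing the Claude Desktop JSON is error-prone: smart quotes, a long dash in place of `--`, or single backslashes in Windows paths will all break it. One way to sidestep that, sketched here as an assumption rather than part of NLWeb, is to build the structure in Python and let the json module handle quoting and escaping. The build_mcp_config helper is hypothetical, and the paths are the placeholder paths from the Windows example.

```python
import json


def build_mcp_config(python_path: str, script_path: str, cwd: str,
                     server_url: str = "http://localhost:8000") -> dict:
    """Build the 'ask_nlw' MCP server entry used in the examples above."""
    return {
        "mcpServers": {
            "ask_nlw": {
                "command": python_path,
                "args": [script_path, "--server", server_url, "--endpoint", "/mcp"],
                "cwd": cwd,
            }
        }
    }


# Raw strings keep the Windows paths readable; json.dumps doubles the backslashes.
config = build_mcp_config(
    python_path=r"C:\Users\yourusername\NLWeb\myenv\Scripts\python",
    script_path=r"C:\Users\yourusername\NLWeb\code\chatbot_interface.py",
    cwd=r"C:\Users\yourusername\NLWeb\code",
)

config_text = json.dumps(config, indent=2)
print(config_text)
```

The printed text can be saved as claude_desktop_config.json at the location listed for your operating system.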
abeomor-msft
Jul 31, 2025 Place Microsoft Blog for PostgreSQL
DiskANN on Azure Database for PostgreSQL – Now Generally Available
By Abe Omorogbe, Senior PM

We’re thrilled to announce the General Availability (GA) of DiskANN for Azure Database for PostgreSQL, unlocking fast, scalable, and cost-effective vector search for production workloads. Building on momentum from our private and public previews, this release brings major upgrades that directly reflect customer feedback for better performance, lower memory usage, and greater flexibility for advanced GenAI applications. Whether you're working with massive datasets or deploying on resource-constrained environments, DiskANN now offers an index that scales effortlessly. DiskANN delivers up to 10x faster speed, 4x lower costs, and up to 96x lower memory footprint compared to the industry standard pgvector HNSW.

In this post, we’ll highlight the following:

- Common pain points in large-scale vector search
- New features in the GA release
- A dive into product quantization (PQ), the main optimization that powers DiskANN’s performance
- Internal testing results that demonstrate how DiskANN stacks up against alternatives like HNSW

Read on to see why DiskANN is ready for your most demanding vector search workloads.

What is DiskANN?

Developed by Microsoft Research and battle-tested across global services like Bing and Microsoft 365, DiskANN is a high-performance approximate nearest neighbor (ANN) search algorithm built for scalable vector search. It delivers the high recall, high throughput, and low latency required by today’s most demanding agentic AI and retrieval-augmented generation (RAG) workloads.

DiskANN offers the following benefits:

- Low Latency: Its graph-based index structure minimizes SSD reads during search, enabling high throughput and consistently low query latency.
- Cost Efficiency: DiskANN’s design reduces memory usage by up to 96x compared with standard indexing methods, helping lower infrastructure costs.
- Scalability: Optimized for massive datasets, DiskANN is built to efficiently handle millions of vectors, making it ideal for production-scale applications.
- Accuracy: DiskANN delivers highly accurate results without sacrificing speed or precision.
- Integration: DiskANN works natively with Azure Database for PostgreSQL, leveraging the power and flexibility of PostgreSQL.

Breaking Through the Limits of Large-Scale Vector Search

Vector search has become essential for powering AI applications, from recommendation systems to agentic AI, but scaling it has been anything but easy. If you've worked with large vector datasets, you've likely run into the same roadblocks:

- Your data is too big to fit in memory, leading to slower searches.
- Building indexes takes forever and eats up your resources.
- You have no idea how long the indexing process will take or where it’s stuck.
- Your embedding model outputs high-dimensional vectors, but your database can’t handle them.
- Database bills spiral out of control due to the memory-intensive machines needed for efficient search on a large dataset.

Sound familiar? These are not edge cases; they’re the standard challenges faced by anyone trying to scale Postgres’s vector search capabilities into real-world production workloads.

With the General Availability (GA) release of DiskANN for Azure Database for PostgreSQL, we’re tackling these problems head-on, bringing production-ready scale, speed, and efficiency to vector search. Let’s break down how.

Product Quantization (PQ) for Lower Memory and Storage Costs (preview)

One of the biggest blockers in vector search is fitting your data into memory. When you use pgvector’s HNSW and your vector data doesn't fit in memory, the result can be compute-intensive I/O operations and degraded performance.

With the GA release, DiskANN introduces a preview version of Product Quantization (PQ), a powerful vector compression technique that makes it possible to store and search massive datasets with a dramatically smaller memory footprint.

With PQ enabled, you get:

- Reduced memory usage: enabling datasets that previously couldn’t fit in RAM.
- Lower memory costs: compressed vectors mean smaller indexes and cheaper monthly bills.
- Faster performance: less I/O pressure means lower latency and higher throughput.

Example results

In our internal testing, we used pg_diskann on Azure Postgres to build an index of 35 million 768D vectors and ran benchmarking queries on an 8-core 32GB machine. The results were a 32x lower memory footprint than pgvector’s HNSW and 4x lower cost, because significantly fewer resources are needed to run vector search queries effectively compared to HNSW. Also, compared to standard HNSW, pg_diskann delivers up to 10x lower latency @ 95% recall, especially in large-scale scenarios with millions of vectors.

When testing higher-quality embeddings such as OpenAI v3-large (3072 dimensions), we saw up to 96x lower memory footprint, due to extremely efficient compression. In this scenario PQ compresses each vector from 12KB (3072 D, 4 bytes/D) to just 128B per quantized vector.

Sign up for the preview today to get access.

Go Big: Supports vectors up to 16,000 dimensions

Another big blocker for customers developing advanced GenAI applications with pgvector is that HNSW only supports indexing vectors up to 2,000 dimensions, a limit that constrains the development of applications using high-dimensional embedding models that deliver high accuracy (e.g., text-embedding-large). With this release, DiskANN now supports vectors up to 16,000 dimensions when product quantization is enabled.
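The compression figure quoted for the 3072-dimension case can be checked with simple arithmetic. The sketch below assumes 4-byte (float32) components, as stated in the example results; the function names are ours for illustration and are not part of pg_diskann.

```python
BYTES_PER_FLOAT32 = 4  # each vector component is stored as a 4-byte float


def raw_vector_bytes(dimensions: int) -> int:
    """Size of one uncompressed vector in bytes."""
    return dimensions * BYTES_PER_FLOAT32


def compression_ratio(dimensions: int, quantized_bytes: int) -> float:
    """How many times smaller the quantized vector is than the raw one."""
    return raw_vector_bytes(dimensions) / quantized_bytes


raw_3072 = raw_vector_bytes(3072)           # 12288 bytes, i.e. 12 KB
ratio_3072 = compression_ratio(3072, 128)   # 96.0, matching the quoted 96x
raw_16000 = raw_vector_bytes(16000)         # 64000 bytes at the 16,000-dimension limit
```

The same arithmetic shows why high-dimensional support is tied to quantization: an uncompressed 16,000-dimension vector is roughly 64 KB, more than five times the 12 KB of a 3072-dimension vector.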
(Figure: Popular embedding models with over 2,000 dimensions: text-embedding-3-large, E5-mistral-7b-instruct and NV-embed-v2)

Faster Index Builds, Smarter Memory Usage

Index creation has historically been a pain point in previous versions of pg_diskann, especially for large datasets. In this GA release, we've significantly accelerated the build process through:

- Improved memory management that uses `maintenance_work_mem` more efficiently.
- Optimized algorithms that reduce disk I/O and CPU usage during indexing.

We've also published detailed documentation to guide you through best practices for faster index builds. The result? Index builds that are not only faster but also more predictable and resource-friendly. When indexing 1 million vectors, the DiskANN GA version is about 1.7x faster: 696.06 seconds versus 1,172.33 seconds in our DiskANN preview build.

Real-Time Index Progress Tracking

Previously, building large indexes with pg_diskann felt like working in the dark. Now, with improved progress reporting support, you can track exactly how far along your index build is, making it easier to monitor, plan, and troubleshoot during creation.

(Figure: Checking index build progress with PSQL in VSCode)

Use the following command in PSQL to check pg_diskann index build progress.

    SELECT phase,
           round(100.0 * blocks_done / nullif(blocks_total, 0), 1) AS "%"
    FROM pg_stat_progress_create_index;

Using DiskANN on Azure Database for PostgreSQL

Using DiskANN on Azure Database for PostgreSQL is easy.

Enable the pgvector & diskann Extensions: Allowlist the pgvector and diskann extensions within your server configuration.

(Figure: Activating DiskANN in Azure Database for PostgreSQL)

Create Extension in Postgres: Create the pg_diskann extension on your database along with any dependencies.

    CREATE EXTENSION IF NOT EXISTS pg_diskann CASCADE;

Create a Vector Column: Define a table to store your vector data, including a column of type vector for the vector embeddings.
    CREATE TABLE demo (
        id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        embedding public.vector(3)
    );

    INSERT INTO demo (embedding) VALUES
        ('[1.0, 2.0, 3.0]'),
        ('[4.0, 5.0, 6.0]'),
        ('[7.0, 8.0, 9.0]');

Index the Vector Column: Create an index on the vector column to optimize search performance. The pg_diskann PostgreSQL extension is compatible with pgvector: it uses the same types, distance functions and syntactic style. To use Product Quantization, sign up for the preview today!

    CREATE INDEX demo_embedding_diskann_idx ON demo USING diskann (embedding vector_cosine_ops);

Perform Vector Searches: Use SQL queries to search for similar vectors based on various distance metrics (cosine distance in the example below).

    SELECT id, embedding
    FROM demo
    ORDER BY embedding <=> '[2.0, 3.0, 4.0]'
    LIMIT 5;

Ready to Dive In?

DiskANN's GA release transforms PostgreSQL into a fully capable vector search platform for production AI workloads. It delivers:

- Support for millions of compressed vectors
- Compatibility with pgvector
- Reduced memory and storage costs
- Faster index creation
- Support for high-dimensional vectors
- Real-time indexing progress visibility

Whether you're building an enterprise-scale retrieval system or optimizing costs in a lean AI application, use DiskANN today and explore the future of AI-driven applications with the power of Azure Database for PostgreSQL!

Run our end-to-end sample RAG app with DiskANN

Learn More

DiskANN on Azure Database for PostgreSQL is ready for production workloads. With Product Quantization, support for high-dimensional vectors, faster index creation, and clearer operational visibility, you can now scale your vector search applications even further, all while keeping costs low. To learn more, check out our documentation and start building today!
abeomor-msft
May 19, 2025 Place Microsoft Blog for PostgreSQL
November 2023 Recap: Azure Database for PostgreSQL Flexible Server
November 2023 Azure Database for PostgreSQL Flexible Server Updates: New Features and Enhancements.
varun-dhawan
Apr 08, 2025 Place Microsoft Blog for PostgreSQL
Build AI Agents with Azure Database for PostgreSQL and Azure AI Agent Service
Introduction

AI agents are revolutionizing how applications interact with data by combining large language models (LLMs) with external tools and databases. This blog will show you how to combine Azure Database for PostgreSQL with Azure AI Agent Service to create intelligent AI agents that can search and analyze your data. We'll use a legal research assistant as our example and walk through setup, implementation, and testing. With just a few hours of work, you can build an AI solution that would have taken weeks of traditional development.

Why AI Agents Matter

AI agents can improve productivity by handling repetitive, time-consuming tasks. They can transform how businesses interact with their data by automating complex workflows, providing more accurate information retrieval, and enabling natural language interfaces to databases.

What are AI agents?

AI agents go beyond simple chatbots by combining large language models (LLMs) with external tools and databases. Unlike standalone LLMs or standard RAG systems, AI agents can:

- Plan: Break down complex tasks into smaller, sequential steps.
- Use Tools: Leverage APIs, code execution, and search systems to gather information or perform actions.
- Perceive: Understand and process inputs from various data sources.
- Remember: Store and recall previous interactions for better decision-making.

By connecting AI agents to databases like Azure Database for PostgreSQL, agents can deliver more accurate, context-aware responses based on your data. AI agents extend beyond basic human conversation to carry out tasks based on natural language. These tasks traditionally required coded logic; agents, however, can plan the steps needed to execute them based on user-provided context.

Agents can be implemented using various GenAI frameworks including LangChain, LangGraph, LlamaIndex and Semantic Kernel. All these frameworks support using Azure Database for PostgreSQL as a tool.
This tutorial uses Azure AI Agent Service for agent planning, tool usage, and perception, while using Azure Database for PostgreSQL as a tool for vector database and semantic search capabilities.

Real-World Use Case: Legal Research Assistant

In this tutorial, we'll build an AI agent that helps legal teams research relevant cases to support their clients in Washington state. Our agent will:

- Accept natural language queries about legal situations.
- Use vector search in Azure Database for PostgreSQL to find relevant case precedents.
- Analyze and summarize the findings in a format useful for legal professionals.

Prerequisites

Azure Resources

- An active Azure account.
- Azure Database for PostgreSQL Flexible Server instance running PG 14 or higher, with the pgvector and azure_ai extensions enabled.
- Azure AI Foundry project.
- Deployed Azure GPT-4o-mini endpoint.
- Deployed Azure text-embedding-3-small endpoint.

Local Setup

- Install Visual Studio Code.
- Install the Python extension.
- Install Python 3.11.x.
- Install Azure CLI (latest version).

Project Implementation

All the code and sample datasets are available in this GitHub repository.

Step 1: Set Up Vector Search in Azure Database for PostgreSQL

First, we'll prepare our database to store and search legal case data using vector embeddings.

Environment Setup:

macOS / bash:

    python -m venv .pg-azure-ai
    source .pg-azure-ai/bin/activate
    pip install -r requirements.txt

Windows / PowerShell:

    python -m venv .pg-azure-ai
    .pg-azure-ai\Scripts\Activate.ps1
    pip install -r requirements.txt

Windows / cmd.exe:

    python -m venv .pg-azure-ai
    .pg-azure-ai\Scripts\activate.bat
    pip install -r requirements.txt

Configure Environment Variables: Create a .env file with your credentials:

    AZURE_OPENAI_API_KEY=""
    AZURE_OPENAI_ENDPOINT=""
    EMBEDDING_MODEL_NAME=""
    AZURE_PG_CONNECTION=""

Load documents and vectors

The Python file load_data/main.py serves as the central entry point for loading data into Azure Database for PostgreSQL.
This code processes sample case data, including information about cases in Washington. High-level details of main.py:

- Database setup and Table Creation: Creates necessary extensions, sets up OpenAI API settings, and manages database tables by dropping existing ones and creating new ones for storing case data.
- Data Ingestion: Reads data from a CSV file and inserts it into a temporary table, then processes and transfers this data into the main cases table.
- Embedding Generation: Adds a new column for embeddings in the cases table and generates embeddings for case opinions using OpenAI's API, storing them in the new column. The embedding process takes about 3-5 minutes.

To start the data loading process, run the following command from the load_data directory:

    python main.py

Here's the output of main.py:

    Extensions created successfully
    OpenAI connection established successfully
    Cases table created successfully
    Temp cases table created successfully
    Data loaded into temp_cases_data table successfully
    Data loaded into cases table successfully
    Adding Embeddings, this will take a while around 3-5 mins...
    Embeddings added successfully
    All Data loaded successfully!

Step 2: Create Postgres tool for the Agent

In this step we will configure AI agent tools to retrieve data from Postgres, then use the Azure AI Agent Service SDK to connect the AI agent to the Postgres database.

Define a function for your agent to call

Start by defining a function for your agent to call, describing its structure and any required parameters in a docstring. Include all your function definitions in a single file, legal_agent_tools.py, which you can then import into your main script.

    from datetime import datetime

    import pandas as pd
    from sqlalchemy import create_engine

    # CONN_STR (the Postgres connection string) is defined elsewhere in the
    # sample and is not shown in this excerpt.

    def vector_search_cases(vector_search_query: str, start_date: str = "1911-01-01",
                            end_date: str = "2025-12-31", limit: int = 10) -> str:
        """
        Fetches the cases information in Washington State for the specified query.

        :param vector_search_query: The query to fetch cases for specifically in Washington.
        :type vector_search_query: str
        :param start_date: The start date for the search (YYYY-MM-DD), defaults to "1911-01-01"
        :type start_date: str, optional
        :param end_date: The end date for the search (YYYY-MM-DD), defaults to "2025-12-31"
        :type end_date: str, optional
        :param limit: The maximum number of cases to fetch, defaults to 10
        :type limit: int, optional
        :return: Cases information as a JSON string.
        :rtype: str
        """
        db = create_engine(CONN_STR)

        # <=> is pgvector's cosine distance operator, so smaller values are
        # closer matches and the ascending ORDER BY returns the best matches first.
        query = """
            SELECT id, name, opinion,
                   opinions_vector <=> azure_openai.create_embeddings(
                       'text-embedding-3-small', %s)::vector AS similarity
            FROM cases
            WHERE decision_date BETWEEN %s AND %s
            ORDER BY similarity
            LIMIT %s;
        """

        # Fetch cases information from the database
        df = pd.read_sql(query, db, params=(
            vector_search_query,
            datetime.strptime(start_date, "%Y-%m-%d"),
            datetime.strptime(end_date, "%Y-%m-%d"),
            limit,
        ))

        # to_json already returns a JSON string, so no second encoding is needed.
        return df.to_json(orient="records")

Step 3: Create and Configure the AI Agent with Postgres

Now we'll set up the AI agent and integrate it with our PostgreSQL tool. The Python file src/simple_postgres_and_ai_agent.py serves as the central entry point for creating and using your agent. High-level details of simple_postgres_and_ai_agent.py:

- Create an Agent: Initializes the agent in your Azure AI Project with a specific model.
- Add Postgres tool: During agent initialization, the Postgres tool that runs vector search on your Postgres database is added.
- Create a Thread: Sets up a communication thread. This will be used to send messages to the agent to process.
- Run the Agent and Call Postgres tool: Processes the user's query using the agent and tools. The agent can plan which tools to use to get the correct answer. In this use case the agent will call the Postgres tool, based on the function signature and docstring, to do vector search and retrieve the relevant data to answer the question.
- Display the Agent's Response: Outputs the agent's response to the user's query.
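To make "based on the function signature and docstring" more concrete, here is a minimal sketch of how a tool description could be derived from a plain Python function. It is an illustration only and is not how the Azure AI Agent Service SDK is implemented: describe_tool is a hypothetical helper, and the vector_search_cases stub stands in for the real tool so the example runs without a database.

```python
import inspect
import json
from typing import Any, Callable, Dict


def describe_tool(func: Callable[..., Any]) -> Dict[str, Any]:
    """Build a JSON-serializable description of a function (hypothetical helper)."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or ""
    parameters: Dict[str, Dict[str, Any]] = {}
    required = []
    for name, param in signature.parameters.items():
        annotation = param.annotation
        # Use the annotated type's name when one is given, else a generic label.
        type_name = annotation.__name__ if isinstance(annotation, type) else "any"
        parameters[name] = {"type": type_name}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
        else:
            parameters[name]["default"] = param.default
    return {
        "name": func.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": parameters,
        "required": required,
    }


def vector_search_cases(vector_search_query: str, start_date: str = "1911-01-01",
                        end_date: str = "2025-12-31", limit: int = 10) -> str:
    """Fetches the cases information in Washington State for the specified query."""
    return "[]"  # stand-in body; the real tool queries Postgres


tool_description = describe_tool(vector_search_cases)
print(json.dumps(tool_description, indent=2))
```

A description like this is the kind of information a model needs to decide when to call a tool and which arguments to pass, which is why a clear docstring and accurate type hints matter.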
Find the Project Connection String in Azure AI Foundry: In your Azure AI Foundry project, you will find your Project Connection String on the Overview page of the project. We will use this string to connect the project to the AI agent SDK, and we will add it to the .env file.

Connection Setup: Add these variables to your .env file in the root directory:

    PROJECT_CONNECTION_STRING=" "
    MODEL_DEPLOYMENT_NAME="gpt-4o-mini"
    AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED="true"

Create the Agent with Tool Access

We will create the agent in the AI Foundry project and add the Postgres tools needed to query the database. The code snippet below is an excerpt from the file simple_postgres_and_ai_agent.py.

    # Imports are not shown in the original excerpt; these assume the preview
    # azure-ai-projects SDK and may differ in your installed version.
    import os
    from azure.ai.projects import AIProjectClient
    from azure.ai.projects.models import FunctionTool, ToolSet
    from azure.identity import DefaultAzureCredential

    # Create an Azure AI Client
    project_client = AIProjectClient.from_connection_string(
        credential=DefaultAzureCredential(),
        conn_str=os.environ["PROJECT_CONNECTION_STRING"],
    )

    # Initialize agent toolset with user functions
    # (user_functions is defined elsewhere in the sample)
    functions = FunctionTool(user_functions)
    toolset = ToolSet()
    toolset.add(functions)

    agent = project_client.agents.create_agent(
        model=os.environ["MODEL_DEPLOYMENT_NAME"],
        name="legal-cases-agent",
        instructions="You are a helpful legal assistant that can retrieve information about legal cases.",
        toolset=toolset,
    )

Create Communication Thread: This code snippet shows how to create a thread and a message for the agent. The thread and message are what the agent processes in a run.

    # Create thread for communication
    thread = project_client.agents.create_thread()

    # Create message to thread
    message = project_client.agents.create_message(
        thread_id=thread.id,
        role="user",
        content="Water leaking into the apartment from the floor above, What are the prominent legal precedents in Washington on this problem in the last 10 years?",
    )

Process the Request: This code snippet creates a run for the agent to process the message and use the appropriate tools to provide the best result.
Using the tool, the agent can call your Postgres database and run a vector search on the query "Water leaking into the apartment from the floor above" to retrieve the data it needs to answer the question.

    from pprint import pprint

    # Create and process agent run in thread with tools
    run = project_client.agents.create_and_process_run(
        thread_id=thread.id,
        agent_id=agent.id,
    )

    # Fetch and log all messages
    messages = project_client.agents.list_messages(thread_id=thread.id)
    pprint(messages['data'][0]['content'][0]['text']['value'])

Run the Agent: To run the agent, run the following command from the src directory:

    python simple_postgres_and_ai_agent.py

The agent will produce a result similar to the one below, using the Azure Database for PostgreSQL tool to access case data saved in the Postgres database.

Snippet of output from agent:

    1. Pham v. Corbett
    Citation: Pham v. Corbett, No. 4237124
    Summary: This case involved tenants who counterclaimed against their landlord for relocation assistance and breach of the implied warranty of habitability due to severe maintenance issues, including water and sewage leaks. The trial court held that the landlord had breached the implied warranty and awarded damages to the tenants.

    2. Hoover v. Warner
    Citation: Hoover v. Warner, No. 6779281
    Summary: The Warners appealed a ruling finding them liable for negligence and nuisance after their road grading project caused water drainage issues affecting Hoover's property. The trial court found substantial evidence supporting the claim that the Warners' actions impeded the natural flow of water and damaged Hoover's property.

Step 4: Testing and Debugging with Azure AI Foundry Playground

After you run your agent with the Azure AI Agent SDK, the agent is stored in your project, and you can experiment with it in the Agent playground.
Using the Agent Playground:

- Navigate to the Agents section in Azure AI Foundry.
- Find your agent in the list and click to open it.
- Use the playground interface to test various legal queries.
- Test the query "Water leaking into the apartment from the floor above, What are the prominent legal precedents in Washington?". The agent will pick the right tool to use and ask for the expected output for that query. Use sample_vector_search_cases_output.json as the sample output.

Step 5: Debugging with Azure AI Foundry Tracing

When developing the agent with the Azure AI Foundry SDK, you can also debug the agent with Tracing. You will be able to debug the calls to tools like Postgres as well as see how the agent orchestrated each task.

Debugging with Tracing:

- Click Tracing in the Azure AI Foundry menu.
- Create a new Application Insights resource or connect an existing one.
- View detailed traces of your agent's operations.

Learn more about how to set up tracing with the AI agent and Postgres in the advanced_postgres_and_ai_agent_with_tracing.py file on GitHub.

Get Started Today

By combining Azure Database for PostgreSQL with Azure AI Agent Service, developers can create intelligent agents that automate data retrieval, improve decision-making, and unlock powerful insights. Whether you're working on legal research, customer support, or data analytics, this setup provides a scalable and efficient solution to enhance your AI applications.

Ready to build your own AI agent? Try building your own legal agent with Azure AI Agent Service and Postgres.

1. Set up Azure AI Foundry and Azure Database for PostgreSQL Flexible Server
   - Set up an AI Foundry project and deploy models
   - Deploy the "gpt-4o-mini" and "text-embedding-3-small" models
   - Set up Azure Database for PostgreSQL Flexible Server and the pgvector extension
   - Allow the azure_ai extension
2. Run our end-to-end sample AI Agent using Azure Database for PostgreSQL tool
3. Customize for your use case:
   - Replace legal data with your domain-specific information
   - Adjust agent instructions for your specific needs
   - Add additional tools as required

Learn More

Read more about Azure Database for PostgreSQL and the Azure AI Agent Service.

- Learn more about Vector Search in Azure Database for PostgreSQL
- Learn more about Azure AI Agent Service
abeomor-msft
Mar 13, 2025 Place Microsoft Blog for PostgreSQL
Scalable Vector Search with DiskANN - Available to all Azure Database for PostgreSQL
We're thrilled to announce that the public preview of DiskANN on Azure Database for PostgreSQL is now open! No sign-up needed: it's available to all Azure Database for PostgreSQL customers right now. Based on your valuable feedback from our initial release in October, we've supercharged DiskANN with parallel index build for improved performance, numerous bug fixes, and enhanced stability. DiskANN enables developers to perform highly accurate and efficient vector searches on large vector datasets, making it an ideal solution for scaling Generative AI applications. Try DiskANN today and elevate your AI projects to the next level!

What is DiskANN?

Developed by Microsoft Research and used extensively at Microsoft in global services such as Bing and Microsoft 365, DiskANN is an approximate nearest neighbor search algorithm designed for efficient vector search at scale. It provides the high recall, high throughput, and low query latency that are essential for modern AI and RAG applications.

Why use Azure Database for PostgreSQL with DiskANN Vector Index?

- Scalability: DiskANN is optimized for large datasets, making it ideal for handling millions of vectors.
- Accuracy: DiskANN uses iterative post filtering to enhance the accuracy of filtered vector search results without compromising on speed.
- Low Latency: The DiskANN graph index construction makes search very efficient, minimizing the number of SSD reads to achieve high throughput and low latency.
- Integration: Seamlessly integrates with Azure Database for PostgreSQL, leveraging the power and flexibility of PostgreSQL.

Learn more about DiskANN from Microsoft.

Benefits of using a vector index in your AI application

Using a vector index in PostgreSQL, such as pg_diskann, dramatically improves query performance and reduces latency for high-dimensional data applications like search engines, recommendation systems, and e-commerce websites.
Unlike brute-force search, vector indexes optimize similarity searches by organizing data for efficient nearest neighbor queries using metrics like cosine similarity, Euclidean distance, or inner product. They leverage approximate algorithms, such as DiskANN, to reduce the search space, enabling sub-linear query times even for datasets with millions of vectors. On average, a vector index can achieve sub-10-millisecond query times on a 1-million-row dataset, while brute-force search could take ~200 milliseconds or more, which makes a vector index ideal for real-time applications. For example, an Airbnb-style platform could use vector search to match a user's query with similar properties in the database. The index lets the system quickly surface the most relevant listings, turning what could be seconds-long processing into millisecond responses and ensuring a fast and personalized search experience.

Using DiskANN on Azure Database for PostgreSQL

Using DiskANN on Azure Database for PostgreSQL is easy.

Enable the pgvector & diskann Extensions: Allowlist the pgvector and diskann extensions within your server configuration.

Create Extension in Postgres: Create the pg_diskann extension on your database along with any dependencies.

    CREATE EXTENSION IF NOT EXISTS pg_diskann CASCADE;

Create a Vector Column: Define a table to store your vector data, including a column of type vector for the vector embeddings.

    CREATE TABLE demo (
        id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        embedding public.vector(3)
    );

    INSERT INTO demo (embedding) VALUES
        ('[1.0, 2.0, 3.0]'),
        ('[4.0, 5.0, 6.0]'),
        ('[7.0, 8.0, 9.0]');

Index the Vector Column: Create an index on the vector column to optimize search performance. The pg_diskann PostgreSQL extension is compatible with pgvector: it uses the same types, distance functions and syntactic style.
    CREATE INDEX demo_embedding_diskann_idx ON demo USING diskann (embedding vector_cosine_ops);

Perform Vector Searches: Use SQL queries to search for similar vectors based on various distance metrics (cosine distance in the example below).

    SELECT id, embedding
    FROM demo
    ORDER BY embedding <=> '[2.0, 3.0, 4.0]'
    LIMIT 5;

Ready to Dive In?

Use the DiskANN preview today and explore the future of AI-driven applications with the power of Azure Database for PostgreSQL!

Run our end-to-end sample RAG app with DiskANN

Learn More

Integrating DiskANN with Azure Database for PostgreSQL enables scalable, efficient AI applications. By leveraging advanced vector search capabilities, you can enhance the performance of your AI applications and deliver more accurate results faster than ever before.

- Learn more about DiskANN in Azure Database for PostgreSQL
- Azure Database for PostgreSQL in Semantic Kernel
- Azure Database for PostgreSQL | 🦜️🔗 LangChain
- DiskANN – Microsoft Research
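As a footnote to the walkthrough above, the sketch below shows what the brute-force alternative to an index looks like: every stored vector is scored against the query and the results are sorted. It is a plain-Python illustration over the three demo rows and does not use pg_diskann or pgvector. It assumes that `<=>` is cosine distance (1 minus cosine similarity), as in pgvector, and that the identity column assigned ids 1 to 3 in insertion order on a fresh table.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance: 1 - (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_search(rows: Sequence[Tuple[int, Sequence[float]]],
                       query: Sequence[float],
                       limit: int = 5) -> List[Tuple[int, float]]:
    """Score every row against the query (a full scan), closest first."""
    scored = [(row_id, cosine_distance(vector, query)) for row_id, vector in rows]
    scored.sort(key=lambda item: item[1])
    return scored[:limit]


# The three rows inserted into the demo table.
demo_rows = [
    (1, [1.0, 2.0, 3.0]),
    (2, [4.0, 5.0, 6.0]),
    (3, [7.0, 8.0, 9.0]),
]

results = brute_force_search(demo_rows, [2.0, 3.0, 4.0])
for row_id, distance in results:
    print(f"id={row_id} cosine_distance={distance:.5f}")
```

For this query the closest row is id 2, followed by id 1 and then id 3. The work in a full scan grows linearly with the number of rows, which is the cost an approximate index such as DiskANN is designed to avoid.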
abeomor-msft
Feb 08, 2025 Place Microsoft Blog for PostgreSQL
Introducing DiskANN Vector Index in Azure Database for PostgreSQL
We're thrilled to announce the preview of DiskANN, a leading vector indexing algorithm, on Azure Database for PostgreSQL - Flexible Server! Developed by Microsoft Research and used extensively at Microsoft in global services such as Bing and Microsoft 365, DiskANN enables developers to build highly accurate, performant and scalable Generative AI applications, surpassing pgvector's HNSW and IVFFlat in both latency and accuracy. DiskANN also overcomes a long-standing limitation of pgvector in filtered vector search, where pgvector occasionally returns incorrect results.
abeomor-msft
Oct 09, 2024 Place Microsoft Blog for PostgreSQL