vector index

2 Topics

September 2024 Recap: Azure Database for PostgreSQL Flexible Server
September 2024 Feature Recap: Azure PostgreSQL Flexible Server - New Features and Enhancements
varun-dhawan
Aug 27, 2025 Place Microsoft Blog for PostgreSQL
6.9KViews
4likes
0Comments
DiskANN on Azure Database for PostgreSQL – Now Generally Available
By Abe Omorogbe, Senior PM We’re thrilled to announce the General Availability (GA) of DiskANN for Azure Database for PostgreSQL unlocking fast, scalable, and cost-effective vector search for production workloads. Building on momentum from our private and public previews, this release brings major upgrades that directly reflect customer feedback for better performance, lower memory usage, and greater flexibility for advanced GenAI applications. Whether you're working with massive datasets or deploying on resource-constrained environments, DiskANN now offers an index that scales effortlessly. DiskANN delivers up to 10x faster speed, 4x lower costs and up to 96x lower memory footprint compared to the industry standard pgvector HNSW. In this post, we’ll highlight the following: Common pain points in large-scale vector search New features in the GA release Dive into product quantization (PQ) the main optimization that powers DiskANN’s performance Share internal testing results that demonstrate how DiskANN stacks up against alternatives like HNSW. Read on to see why DiskANN is ready for your most demanding vector search workloads. What is DiskANN? Developed by Microsoft Research and battle-tested across global services like Bing and Microsoft 365, DiskANN is a high-performance approximate nearest neighbor (ANN) search algorithm built for scalable vector search. It delivers the high recall, high throughput, and low latency required by today’s most demanding agentic AI and retrieval-augmented generation (RAG) workloads. DiskANN offers the following benefits: Low Latency: Its graph-based index structure minimizes SSD reads during search, enabling high throughput and consistently low query latency. Cost Efficiency: DiskANN’s design reduces memory usage up to 96x smaller than standard indexing methods helping lower infrastructure costs. Scalability: Optimized for massive datasets, DiskANN is built to efficiently handle millions of vectors, making it ideal for production-scale applications. Accuracy: DiskANN delivers highly accurate results without sacrificing speed or precision. Integration: DiskANN works natively with Azure Database for PostgreSQL, leveraging the power and flexibility of PostgreSQL. Breaking Through the Limits of Large-Scale Vector Search Vector search has become essential for powering AI applications from recommendation systems to agentic AI but scaling it has been anything but easy. If you've worked with large vector datasets, you've likely run into the same roadblocks: Your data is too big to fit in memory leading to slower searches. Building indexes takes forever and eats up your resources. You have no idea how long the indexing process will take or where it’s stuck. Your embedding model outputs high-dimensional vectors, but your database can’t handle them. Database bills spiral out of control due to memory intensive machines needed for efficient search on a large dataset. Sound familiar? These are not edge cases they’re the standard challenges faced by anyone trying to scale Postgres’s vector search capabilities into real-world production workloads. With the General Availability (GA) release of DiskANN for Azure Database for PostgreSQL, we’re tackling these problems head-on, bringing production-ready scale, speed, and efficiency to vector search. Let’s break down how. Product Quantization (PQ) for Lower Memory and Storage Costs (preview) One of the biggest blockers in vector search is fitting your data into memory. When using pgvector’s HNSW and your vector data doesn't fit in memory, this can lead to compute intensive I/O operations, causing degraded performance. With the GA release, DiskANN introduces a preview version of Product Quantization (PQ)—a powerful vector compression technique that makes it possible to store and search massive datasets with a dramatically smaller memory footprint. With PQ enabled, you get: Reduced memory usage — enabling datasets that previously couldn’t fit in RAM. Lower memory costs — compressed vectors mean smaller indexes and cheaper monthly bills. Faster performance — less I/O pressure means lower latency and higher throughput. Example results In our internal testing, we use pg_diskann on Azure Postgres to build an index of 35 million 768D vectors and ran benchmarking queries on an 8-core 32GB machine. The results were: 32x lower memory footprint than using pgvector’s HNSW and 4x lower cost due to significantly less resources needed to run vector search queries effectively compared to HNSW. Also, compared to standard HSNW, pg_diskann delivers up to 10x lower latency @ 95% recall especially in large scale scenarios with millions of vectors. When testing higher quality embedding such as OpenAI v3-large (3072 dimensions), we saw up to 96x lower memory footprint, due to extremely efficient compressing. In this scenario PQ compresses each vector from 12KB (3072 D, 4 bytes/D) to just 128B per quantized vector. Sign up for the preview today! To get access. Go Big: Supports vectors up to 16,000 dimensions Another big blocker for customers developing advanced GenAI applications with pgvector is that HNSW only supports indexing vectors up to 2,000 dimensions a limit that constrains the development of applications using high-dimensional embedding models which deliver high accuracy (i.e. text-embedding-large). With this release, DiskANN now supports vectors up to 16,000 dimensions. When you have product quantization enabled. Popular embedding models with over 2000 dimensions (text-embedding-large, E5-mistral-7b-instruct and NV-embed-v2) Faster Index Builds, Smarter Memory Usage Index creation has historically been a pain point, especially in previous versions of pg_diskann—especially for large datasets. In this GA release, we’ve significantly accelerated the build process through: Improved memory management using `maintenance_work_mem` more efficiently. Optimized algorithms that reduce disk I/O and CPU usage during indexing We’ve also published detailed documentation to guide you through best practices for faster index builds. The result? Index builds that are not only faster but more predictable and resource friendly. When indexing 1 millions vectors, the DiskANN GA version is ~2x faster. It took 696.0630 seconds vs 1172.3314 seconds in our DiskANN preview build. Real-Time Index Progress Tracking Previously, with pg_diskann building large indexes felt like working in the dark. Now, with the addition of improved progress reporting support, you can track exactly how far along your index build is—making it easier to monitor, plan, and troubleshoot during creation. Checking index build progress with PSQL in VSCode Use the following command in PSQL to check pg_diskann index build progress. SELECT phase, round(100.0 * blocks_done / nullif(blocks_total, 0), 1) AS "%" FROM pg_stat_progress_create_index; Using DiskANN on Azure Database for PostgreSQL Using DiskANN on Azure Database for PostgreSQL is easy. Enable the pgvector & diskann Extension: Allowlist the pgvector and diskann extension within your server configuration. Activating DiskANN in Azure Database for PostgreSQL Create Extension in Postgres: Create the pg_diskann extension on your database along with any dependencies. CREATE EXTENSION IF NOT EXISTS pg_diskann CASCADE; Create a Vector Column: Define a table to store your vector data, including a column of type vector for the vector embeddings. CREATE TABLE demo ( id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, embedding public.vector(3) ); INSERT INTO demo (embedding) VALUES ('[1.0, 2.0, 3.0]'), ('[4.0, 5.0, 6.0]'), ('[7.0, 8.0, 9.0]'); Index the Vector Column: Create an index on the vector column to optimize search performance. The pg_diskann PostgreSQL extension is compatible with pgvector, it uses the same types, distance functions and syntactic style. To use Product Quanatization sign up for the preview today! CREATE INDEX demo_embedding_diskann_idx ON demo USING diskann (embedding vector_cosine_ops) Perform Vector Searches: Use SQL queries to search for similar vectors based on various distance metrics (cosine similarity in the example below). SELECT id, embedding FROM demo ORDER BY embedding <=> '[2.0, 3.0, 4.0]' LIMIT 5; Ready to Dive In? DiskANN’s GA release transforms PostgreSQL into a fully capable vector search platform for production AI workloads. It delivers: Support for millions of compressed vectors Compatibility with pgvector Reduced memory and storage costs Faster index creation Support for high-dimensional vectors Real-time indexing progress visibility Whether you’re building an enterprise-scale retrieval system or optimizing costs in a lean AI application, Use the DiskANN today and explore the future of AI-driven applications with the power of Azure Database for PostgreSQL! Run our end-to-end sample RAG app with DiskANN Learn More DiskANN on Azure Database for PostgreSQL is ready for production workloads. With Product Quantization, support for high-dimensional vectors, faster index creation, and clearer operational visibility, you can now scale your vector search applications even further — all while keeping costs low. To learn more, check out our documentation and start building today!
abeomor-msft
May 19, 2025 Place Microsoft Blog for PostgreSQL
1.1KViews
2likes
0Comments