Azure Cognitive Search AMA: Vector search, Azure OpenAI Service, generative apps, plugins & more
yahnoosh (Microsoft) · Jul 25, 2023
Vector search performance is a very interesting topic. Approximate Nearest Neighbor (ANN) methods, being approximate, use different approaches to navigate the speed/recall tradeoff. In the limit, you can achieve perfect recall by comparing the query vector to every vector in the database, but that's obviously too expensive. To improve speed, ANN algorithms rely on compression techniques (such as reduced precision) or on data structures that partition the indexed data, both of which reduce the number of full vector comparisons that need to be made. All of this is to say that statements about performance should only be made at a specific recall target. Another dimension to this problem is price, since you can choose to spend more on hardware to improve search latency and throughput. In an ideal benchmark, you'd pick configurations comparable on price and recall before you look at search performance, since all three are related.
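To make the recall target concrete, here is a minimal sketch (assuming NumPy and Euclidean distance) of how recall@k is typically measured: exhaustive brute-force search provides the ground truth, and an ANN index is scored by the fraction of true neighbors it actually returned.

```python
import numpy as np

def brute_force_knn(xb: np.ndarray, xq: np.ndarray, k: int) -> np.ndarray:
    """Exact ground truth: compare every query vector to every database vector."""
    # Squared L2 distances, shape (num_queries, num_database_vectors).
    d = ((xq[:, None, :] - xb[None, :, :]) ** 2).sum(-1)
    return np.argsort(d, axis=1)[:, :k]

def recall_at_k(ann_ids: np.ndarray, true_ids: np.ndarray) -> float:
    """Fraction of the true k nearest neighbors that the ANN index returned."""
    hits = sum(len(set(a) & set(t)) for a, t in zip(ann_ids, true_ids))
    return hits / true_ids.size
```

Quoting an ANN system's latency only makes sense alongside the recall@k it achieves on the same workload.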
Both Cognitive Search and Redis use HNSW as the Approximate Nearest Neighbor algorithm, which is one of the leading methods for applications that optimize for low latency and high recall over data that is indexed incrementally and can change over time. The main differences in price/performance could stem from differences in implementation and from the other functionality each service includes in the price in addition to vector search, e.g., scaling, security, compliance, integration with other services, etc.
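As an illustration of the knobs an HNSW index exposes, here is a small sketch using the open-source hnswlib library. This is not the implementation inside Cognitive Search or Redis, just the same algorithm family with the typical parameters (M, ef_construction, ef) that govern the speed/recall tradeoff.

```python
import numpy as np
import hnswlib

dim, num_elements = 128, 10_000
data = np.random.random((num_elements, dim)).astype(np.float32)

# Build an HNSW index. M and ef_construction trade build time and memory
# for graph quality (and therefore achievable recall).
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=num_elements, ef_construction=200, M=16)
index.add_items(data, np.arange(num_elements))

# ef controls the query-time speed/recall tradeoff:
# higher ef = more candidates explored = better recall, slower queries.
index.set_ef(50)
labels, distances = index.knn_query(data[:5], k=10)
```

Sweeping ef (and re-measuring recall against brute-force ground truth as above) is the standard way to chart an index's speed/recall curve.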
Hope this helps you reason about vector search performance in Azure Cognitive Search and how it compares. We're planning to publish results of our benchmarks comparing vector search performance across different Cognitive Search SKUs and service topologies, i.e., how adding replicas and partitions changes the vector search performance profile. The basic intuition is that adding replicas improves throughput and adding partitions reduces latency.
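If you want to measure latency and throughput on your own service topology in the meantime, issuing timed vector queries with the azure-search-documents Python SDK is a starting point. The vector query shape has changed across SDK versions, so treat the exact class and names below (VectorizedQuery, the contentVector field, the index schema) as assumptions to adapt to your SDK version and index.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

# Hypothetical endpoint, index, and field names for illustration only.
client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="my-index",
    credential=AzureKeyCredential("<api-key>"),
)

query_embedding = [0.0] * 1536  # replace with a real embedding of the user query

vector_query = VectorizedQuery(
    vector=query_embedding,
    k_nearest_neighbors=10,
    fields="contentVector",  # hypothetical vector field on the index
)
results = client.search(search_text=None, vector_queries=[vector_query])
for doc in results:
    print(doc["@search.score"])
```

Wrapping a loop of such queries with timing, at increasing concurrency, lets you observe the replica/throughput and partition/latency effects described above on your own data.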