When building RAG solutions, it's important to ground your LLM with the highest-quality retrieval results to get the best response quality. As part of our 2024-05-01-Preview API version, we're launching several new updates to the Azure AI Search relevance stack that give you more control over your retriever.
Support for Binary Vector Types
We've added support for binary vectors (bit vectors), enabling Azure AI Search to store and process embeddings from models that produce binary outputs, such as Cohere Embed V3. This new feature allows for larger vector datasets at a lower cost while keeping search fast. According to Cohere, binary embeddings can retain up to 95% of search quality, and the net space used for vector data is reduced by 32x.
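As a rough illustration, a binary vector field in the preview index schema might look like the sketch below. The field name, dimensionality, and vector search profile name are placeholders, and the `Collection(Edm.Byte)` type with `packedBit` encoding reflects our reading of the preview API, not a definitive schema.

```python
# Hedged sketch of a binary (bit) vector field definition for the
# 2024-05-01-Preview indexes API. Names ("contentVector",
# "binary-hnsw-profile") are illustrative, not required values.
binary_vector_field = {
    "name": "contentVector",
    "type": "Collection(Edm.Byte)",   # bytes holding packed bits
    "vectorEncoding": "packedBit",    # interpret each byte as 8 bit-dimensions
    "dimensions": 1024,               # dimensionality of the original bit vector
    "searchable": True,
    "vectorSearchProfile": "binary-hnsw-profile",
}
# This field object would be added to the "fields" array of an index
# created or updated with ?api-version=2024-05-01-Preview.
```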
Score Threshold Filtering
The new "threshold" property in Azure AI Search queries allows customers to improve search result relevance for vector and hybrid search queries. This feature filters out documents with low similarity scores before combining results from different recall sets. Whether you prefer filtering by 'searchScore' or 'vectorSimilarity,' you have the flexibility to prioritize the most relevant documents.
Vector Weighting
The new vector weighting feature in Azure AI Search specifies the relative weight (importance) of vector queries versus the term query in hybrid search scenarios. For example, you can favor vector similarity over keyword similarity. You can also assign relative weights to the different vector queries in a multi-vector search request to favor similarity on one vector field over another. The specified weights are applied when each document's score is calculated, giving you more control over the final result set.
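A multi-vector request with per-query weights might look roughly like the following sketch; the field names and weight values are illustrative, and the per-query `weight` property reflects our understanding of the preview API.

```python
# Hedged sketch: weighting two vector queries differently in a multi-vector
# hybrid request so body similarity counts more than title similarity.
search_request = {
    "search": "quarterly revenue guidance",
    "vectorQueries": [
        {
            "kind": "vector",
            "vector": [0.11, -0.02, 0.07],   # truncated title embedding
            "fields": "titleVector",
            "k": 50,
            "weight": 0.5,                   # de-emphasize title similarity
        },
        {
            "kind": "vector",
            "vector": [0.03, 0.19, -0.41],   # truncated body embedding
            "fields": "contentVector",
            "k": 50,
            "weight": 2.0,                   # favor body similarity in the fused score
        },
    ],
}
```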
MaxTextRecallSize for Hybrid Search
Tailor your hybrid search experience by specifying the maximum number of documents recalled from the keyword search recall set in hybrid search queries. You can also adjust the 'count' property to include either all matching documents or only those retrieved within the defined window. This enhancement improves relevance and gives you control over the number of documents retrieved; in addition, reducing the number of text documents recalled can significantly improve performance.
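A hybrid query that caps the keyword recall set might look like the sketch below. The `hybridSearch.maxTextRecallSize` property name reflects our reading of the preview API, and the values shown are illustrative.

```python
# Hedged sketch: limiting the keyword (BM25) recall set in a hybrid query.
search_request = {
    "search": "data residency requirements",
    "vectorQueries": [
        {
            "kind": "vector",
            "vector": [0.21, -0.08, 0.33],   # truncated embedding
            "fields": "contentVector",
            "k": 50,
        }
    ],
    "hybridSearch": {
        # Recall at most 100 documents from the keyword side before fusion;
        # lowering this can noticeably reduce latency on large indexes.
        "maxTextRecallSize": 100
    },
}
```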
Document Boosting support for Vector/Hybrid Search
Boost your search results with scoring profiles tailored to vector and hybrid search queries. Whether you prioritize freshness, geolocation, or specific keywords, our new feature allows for targeted document boosting, ensuring more relevant results for your needs.
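As an example, a freshness-based scoring profile defined on the index and referenced at query time might look like this sketch; the profile name, field name, boost value, and boosting duration are placeholders.

```python
# Hedged sketch: a scoring profile that boosts recently updated documents,
# now applied to vector and hybrid queries as well as keyword queries.
freshness_profile = {
    "name": "boost-recent",
    "functions": [
        {
            "type": "freshness",
            "fieldName": "lastUpdated",                  # DateTimeOffset field in the index
            "boost": 2.0,                                # up to 2x boost for the freshest docs
            "interpolation": "linear",
            "freshness": {"boostingDuration": "P30D"},   # boost tapers off over 30 days
        }
    ],
}

# Reference the profile in the query body at search time:
search_request = {
    "search": "release notes",
    "scoringProfile": "boost-recent",
}
```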
Experience up to a 50% Decrease in Query Latency for Hybrid Search
We've also been working hard on performance. With the latest set of improvements, customers are seeing up to 50% lower latency on hybrid search queries with no changes to their applications.
More news from Azure AI Search:
- Announcing cost-effective RAG at scale with Azure AI Search
- Streamlined multimodal search with AI Studio embedding models and Azure AI Search
- Automatically index Microsoft Fabric OneLake files in Azure AI Search, now in preview
Getting started with Azure AI Search
- Learn more about Azure AI Search
- Start creating a search service in the Azure Portal, or with the Azure CLI, the Management REST API, an ARM template, or a Bicep file.
- Review Azure AI Search pull (indexer-based) and push data ingestion approaches.
- Learn about Retrieval Augmented Generation in Azure AI Search
- Learn more about integrated vectorization and why chunking and vectors are important in your RAG solutions.
- Read the blog: Outperforming vector search with hybrid retrieval and ranking capabilities
- Watch a video on Microsoft Mechanics: How vector search and semantic ranking improve your AI prompts