Event banner
Azure Cognitive Search AMA: Vector search, Azure OpenAI Service, generative apps, plugins & more
Event details
In this session we’ll answer questions about the emerging Retrieval-Augmented Generation pattern and how you can use Azure OpenAI service and Azure Cognitive Search to implement it today in your applications to power ChatGPT-like experiences, generative scenarios, and more. Bring your questions about vector search in Azure Cognitive Search, which is coming to public preview soon, as well as about implementation details, data preparation, integration with large language models, and anything else related to Azure Cognitive Search.
An AMA is a live text-based online event similar to an “Ask Me Anything” on Reddit. This AMA gives you the opportunity to connect with Microsoft product experts who will be on hand to answer your questions and listen to feedback.
Feel free to post your questions in the comments below ahead of time if that better fits your schedule or time zone, though questions will not be answered until the live hour.
116 Comments
- Faddykenny_09 (Copper Contributor): How does the Azure OpenAI interface integrate with Microsoft Azure's existing services and platforms to enhance AI capabilities?
- DerekLegenzoff (Iron Contributor)
One good example of how Azure OpenAI integrates with other products is vector search. You can use Azure OpenAI to create embeddings from your content and load them into Azure Cognitive Search. We also created Azure OpenAI on your data to help you chat over data that lives in your search service: Introducing Azure OpenAI Service On Your Data in Public Preview - Microsoft Community Hub
Azure OpenAI Service also supports function calling, which gives you another way to integrate Azure OpenAI with other Azure services.
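The embed-and-load flow mentioned above can be sketched in a few lines. This is a minimal illustration, not the official integration: the field names (`id`, `content`, `contentVector`) and the deployment name are assumptions, and the Azure OpenAI call is shown only in a comment so the shaping logic stays self-contained.

```python
# Sketch: turn text chunks into documents for a Cognitive Search index that
# has a vector field. Field names here are illustrative, not prescriptive.

def build_search_documents(chunks, embed_fn):
    """Shape text chunks into index documents.
    `embed_fn` maps a string to a list[float] embedding."""
    return [
        {"id": str(i), "content": text, "contentVector": embed_fn(text)}
        for i, text in enumerate(chunks)
    ]

# Against the real services, embed_fn could wrap the Azure OpenAI embeddings
# endpoint (assuming the `openai` package and a deployed embedding model):
#
#   from openai import AzureOpenAI
#   client = AzureOpenAI(azure_endpoint="https://<resource>.openai.azure.com",
#                        api_key="<key>", api_version="2023-05-15")
#   embed_fn = lambda text: client.embeddings.create(
#       model="<embedding-deployment>", input=text).data[0].embedding

docs = build_search_documents(["first chunk", "second chunk"],
                              embed_fn=lambda t: [0.0, 1.0])  # stub embedding
```

The resulting `docs` list is in the shape you would upload to a search index that defines a matching vector field.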
- Faddykenny_09 (Copper Contributor): Great to know. The function calling is really helpful; it exceeds my expectations for field activities and interaction.
- Carew (Copper Contributor): I'm applying for a job where the request is for Tech to do more to help the business: lean into the business area, extract needs, and deliver on business needs. I'm looking into Power Platform, Power BI, and Office 365 for an end-to-end AI tech-to-business experience, to enable business users to do more with AI support and collaborate together for successful outcomes. Are there other offerings that we could or should be focused on for AI advantage?
- DerekLegenzoff (Iron Contributor)
Hey Carew - this AMA is targeted for questions around Azure Cognitive Search so I don't think we'll be able to answer this question for you. You could try posting your question in the Power Platform Community.
- Barry Briggs (Copper Contributor): When using Azure OpenAI on your data, do the search terms retrieved from Azure Cognitive Search count toward the number of tokens and thus to the charge for Azure OpenAI? If so, is there any way for customers to predict or estimate the costs of using Azure OpenAI on your data?
- Barry Briggs (Copper Contributor): Super helpful, thanks!
- robertlee-msft
Microsoft
Adding to Liam's answer: if you are only using vector search to retrieve documents, and not passing those documents to an Azure OpenAI prompt, then the use of vector search by itself will not result in any calls to Azure OpenAI. In other words, once you've generated the vector embeddings for your content (which will consume tokens in the embedding model/endpoint), there are no further costs beyond the search service itself. Additional costs only come into play when you pass the retrieved documents to the model.
- liamca-msft
Microsoft
Hi Barry, yes, whatever is retrieved from Azure Cognitive Search and injected into the Azure OpenAI prompt will count towards the token count (and thus the cost to execute that Azure OpenAI request). Pablo created a really good blog post that goes into more detail on this. This is ultimately why the relevance of Azure Cognitive Search is so important, and why we spend a lot of our time making sure the best results appear in the top 3-5, to help minimize the number of tokens required. As for the cost, you will typically be "chunking" your data, which should give you a good idea of the cost of any Azure OpenAI request: look at the typical number of tokens per chunk x the number of results you add to the prompt.
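The "tokens per chunk x number of results" rule of thumb is easy to turn into a back-of-the-envelope estimate. All the numbers below (chunk size, overheads, price per 1K tokens) are made-up placeholders; substitute your own model's pricing.

```python
# Rough per-request prompt cost for a RAG call, following the rule of thumb
# above. Every number here is an illustrative placeholder.

def estimate_prompt_tokens(tokens_per_chunk, results_in_prompt,
                           question_tokens, system_tokens=0):
    """Approximate prompt token count for one RAG request."""
    return tokens_per_chunk * results_in_prompt + question_tokens + system_tokens

tokens = estimate_prompt_tokens(tokens_per_chunk=500, results_in_prompt=4,
                                question_tokens=60, system_tokens=200)
cost = tokens / 1000 * 0.003  # placeholder price per 1K prompt tokens
print(tokens)  # 2260
```

Completion tokens are billed on top of this, so treat the result as a lower bound on the per-request cost.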
- Barry Briggs (Copper Contributor): The link to Pablo's post seems not to be right?
- sshah145
Microsoft
What is the road map for vector search, and what improvements and features are planned for the future? It is in public preview right now; when is the plan to make it production-ready? Will you release RAG as a service, where we can have configurations like the embedding model endpoint and input store connection?
- liamca-msft
Microsoft
Hi Sanket,
For RAG as a service, I would highly recommend you take a look at Azure OpenAI on your data as well as the related blog post, which I think is very close to what you're looking for: a really easy way to get up and running with RAG and enterprise ChatGPT. As for the future of vector search, our main goals are to simplify a lot of the complexities that come with vectorization of content, including simplifying ingestion and queries when vectorization is required, as well as topics such as effective chunking of data.
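The RAG pattern this thread keeps coming back to is, at its core, a short pipeline: retrieve top chunks, stuff them into a prompt, call the model. A minimal sketch, with `search_fn` and `llm_fn` standing in for the Azure Cognitive Search and Azure OpenAI calls (both are placeholders, not real client APIs):

```python
# Minimal RAG flow: retrieve, assemble a grounded prompt, generate.
# search_fn and llm_fn are injectable stand-ins for the real service calls.

def rag_answer(question, search_fn, llm_fn, top_k=3):
    chunks = search_fn(question)[:top_k]
    prompt = ("Answer using only these sources:\n"
              + "\n".join(f"- {c}" for c in chunks)
              + f"\n\nQuestion: {question}")
    return llm_fn(prompt)

# Exercise the flow with stubs in place of the two services.
answer = rag_answer(
    "What is vector search?",
    search_fn=lambda q: ["Vector search finds similar embeddings.",
                         "HNSW is an ANN algorithm."],
    llm_fn=lambda p: f"(model answer based on {p.count(chr(10) + '- ')} sources)",
)
```

Azure OpenAI on your data packages this loop as a managed feature, which is why it's the recommended starting point above.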
- Adam Koch (Copper Contributor): Adam from Ohio and Todd from Texas: [Added] To use Snowflake as a data source, are you aware of an officially-supported module, or do you have a recommended shim approach to use one of the published sources? (https://learn.microsoft.com/EN-US/AZURE/search/search-data-sources-gallery)
- liamca-msft
Microsoft
Hi Adam,
One good option is to use Azure Data Factory, which has a Snowflake connector, to ingest data. From there, you can send the copied data on to Azure Cognitive Search. It is, however, important to note that you can also do this programmatically if you prefer, using our push API, which allows you to format your data as JSON and send it directly to Azure Cognitive Search.
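The push model boils down to POSTing a JSON batch to the index's docs endpoint. A stdlib-only sketch, assuming an existing index whose fields match the documents; the endpoint, index name, API version, and field names are all illustrative:

```python
# Sketch of the Cognitive Search push model: wrap documents in the batch
# shape the REST API expects and POST them to the index.
import json
import urllib.request

def build_index_batch(documents, action="upload"):
    """Batch shape: {"value": [{"@search.action": ..., ...fields}, ...]}."""
    return {"value": [{"@search.action": action, **doc} for doc in documents]}

def push_documents(endpoint, index_name, api_key, documents):
    """POST a document batch to the index (network call; not run here)."""
    batch = build_index_batch(documents)
    url = f"{endpoint}/indexes/{index_name}/docs/index?api-version=2023-11-01"
    req = urllib.request.Request(
        url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
    return urllib.request.urlopen(req)

batch = build_index_batch([{"id": "1", "content": "row copied from Snowflake"}])
```

The `azure-search-documents` SDK wraps the same operation if you prefer a client library over raw REST.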
- Murthy582 (Copper Contributor): Is vector search going to have its own ranking, like semantic ranking?
- robertlee-msft
Microsoft
Could you explain what you mean by vector search ranking? Semantic ranking is a re-ranking model so it’s slightly different.
- DerekLegenzoff (Iron Contributor)
There is new ranking for vector search: vector search queries can be ranked using cosine similarity, Euclidean distance, or dot product. Azure Cognitive Search also supports hybrid search, which combines keyword-based search and vector search results using reciprocal rank fusion.
This document has a great overview of how ranking works with vector search: Query vector data in a search index - Azure Cognitive Search | Microsoft Learn
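Reciprocal rank fusion itself is a small algorithm: each ranked list contributes `1 / (k + rank)` per document, and documents are re-sorted by the summed score. A minimal sketch (the constant `k = 60` is a common default, not a documented Azure value):

```python
# Minimal reciprocal rank fusion (RRF): merge a keyword ranking and a
# vector ranking into one fused list, as hybrid search does conceptually.

def rrf(rankings, k=60):
    """rankings: list of ranked doc-id lists (best first).
    Each list contributes 1 / (k + rank) per document; return ids by score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf([["a", "b", "c"],   # e.g. keyword (BM25) ranking
             ["b", "c", "d"]])  # e.g. vector similarity ranking
print(fused[0])  # "b" wins: it places highly in both lists
```

Note how "b" outranks "a" even though "a" tops one list: appearing in both rankings beats a single first place.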
- sshah145
Microsoft
How does ACS compare to Redis for vector search in terms of cost and performance? We have experienced better performance with Redis. Any plans to improve performance, and can you share your views? How does search performance change as the DB size increases?
- yahnoosh
Microsoft
Vector search performance is a very interesting topic. Approximate Nearest Neighbors methods, being approximate, use different approaches to manage the speed/recall tradeoff. In the limit, you can achieve perfect recall by comparing the query vector to all vectors in the database, but that’s obviously too expensive. To improve speed, ANN algorithms leverage compression techniques, lower precision, or data structures that partition the indexed data to reduce the number of vector comparisons that need to be made. All of this is to say that statements about performance should only be made at a specific recall target. Another dimension to this problem is price, since you can choose to spend more on hardware to improve search latency and throughput. In an ideal benchmark, you’d pick configurations comparable on price and recall before you look at search performance, since they are all related.
Both Cognitive Search and Redis use HNSW as the Approximate Nearest Neighbors algorithm, which is one of the leading methods for applications optimizing for low latency and high recall on data that’s indexed incrementally and can change over time. The main differences in price/performance could stem from differences in implementation, and from the other functionality each service offers in addition to vector search that is included in the price, i.e., scaling, security, compliance, integration with other services, etc.
Hope this helps you reason about vector search performance in Azure Cognitive Search and how it compares. We’re planning to publish results of our benchmarks comparing vector search performance between different Cognitive Search SKUs and service topologies: how adding replicas and partitions changes the vector search performance profile. The basic intuition is that adding replicas improves throughput and adding partitions reduces latency.
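The "specific recall target" the answer insists on is simple to compute once you have an exhaustive (brute-force) baseline: recall@k is the fraction of the true top-k neighbors the ANN search actually returned. A minimal sketch:

```python
# Recall@k: the measure the answer above says performance claims should be
# pinned to. Compare ANN results against exhaustive exact search.

def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbors present in the ANN top-k."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

r = recall_at_k(approx_ids=["d1", "d3", "d4", "d7"],  # what the ANN returned
                exact_ids=["d1", "d2", "d3", "d4"],   # brute-force truth
                k=4)
print(r)  # 0.75: the ANN missed one of the four true neighbors
```

Two systems should only be compared on latency and cost once they are tuned to the same recall@k on the same dataset.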
- Pat Beahan (Brass Contributor): Is there a way with Cognitive Search to pick up the Purview data classification tags when documents are indexed, so that when a prompt reply contains info from a document tagged restricted/confidential, the LLM response can embed "data may be Restricted Confidential" with 100% reliability?
- fsunavala-msft
Microsoft
Hi Pat, you can leverage the Cognitive Search indexer feature to index documents from different data sources: https://learn.microsoft.com/azure/search/search-indexer-overview You can use the filter predicate pattern for security trimming in Cognitive Search, ensuring that the results retrieved for your prompt are only ones the user has access to. See https://learn.microsoft.com/azure/search/search-security-trimming-for-azure-search I encourage you to test and evaluate these solutions thoroughly before launching to production if you have company confidential or restricted data.
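The security trimming pattern linked above comes down to a `$filter` expression built from the caller's group memberships. A minimal sketch, assuming a hypothetical `group_ids` collection field on each document that holds the classification or ACL tags:

```python
# Sketch of security filter trimming: build the OData filter that restricts
# results to documents whose group_ids field (a hypothetical
# Collection(Edm.String) field) overlaps the caller's groups.

def security_filter(user_groups, field="group_ids"):
    """Build a search.in filter over a collection field."""
    groups = ",".join(user_groups)
    return f"{field}/any(g: search.in(g, '{groups}'))"

flt = security_filter(["sales", "emea"])
print(flt)  # group_ids/any(g: search.in(g, 'sales,emea'))

# The string is passed as the filter parameter of a search request, so
# documents the user cannot see are trimmed before they ever reach the LLM.
```

Trimming at retrieval time is what gives you the reliability you're after: a document the user can't access never enters the prompt in the first place.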
- Jan_Meijering (Copper Contributor)
We have implemented Azure Cognitive Search for finding products based on their attributes, and this works very well. Now a customer has the requirement to be able to search for products based on their ETIM classification attributes. This classification has a predefined set of over 14,000 unique attributes across all product categories. As each unique attribute will consume at least one simple field, we run into the limit of 1,000 fields per index. Maybe we have chosen the wrong architecture, but what is the right one to use when we have such a requirement?
- mike_carter_msft
Microsoft
This appears to be a question about the 1000 field limit in Azure Cognitive Search. There isn't an easy workaround to the field limit, but you do have some options to get the experience you are looking for.
These are 3 options I would suggest considering:
- Field reuse within an index. If each product catalog entry uses fewer than 1,000 fields, you could reuse the same fields for multiple product types, with your application managing the mapping from the meaning of each attribute to the underlying field in the index. This option does have drawbacks, since each field would have the same settings across multiple product types.
- Extend the catalog to multiple indexes. You could effectively store your catalog in multiple indexes, with the full set of fields spread across them. This again works best if an individual catalog entry doesn't have more than 1,000 fields, and you would run into some cross-index query issues, but you should be able to get faceting to work with this option.
- Attribute metadata fields. Define a few general fields in your index to store attribute metadata. These fields could include "attribute_name," "attribute_value," and "product_id" or any other relevant identifiers. For each product, store its ETIM classification attributes as separate records in the attribute metadata fields. This means you'll have multiple attribute records for each product, one for each ETIM attribute associated with that product. To implement this solution, you'll need to update your data indexing and search query pipelines to work with the attribute metadata fields and apply filters accordingly. It may require some adjustments to your existing code, but it should provide a way to handle the complex attribute requirements.
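The third option above can be sketched as a small flattening step. The field names (`product_id`, `attribute_name`, `attribute_value`) and the ETIM codes below are illustrative placeholders:

```python
# Sketch of the attribute-metadata option: flatten each product's ETIM
# attributes into generic records instead of one index field per attribute.

def to_attribute_records(product_id, attributes):
    """attributes: dict mapping ETIM attribute name -> value.
    Returns one record per attribute, all sharing the same product_id."""
    return [
        {"product_id": product_id,
         "attribute_name": name,
         "attribute_value": str(value)}
        for name, value in attributes.items()
    ]

# Hypothetical ETIM attribute codes and values for one product.
records = to_attribute_records("P-100", {"EF000008": "IP54", "EF000040": 230})
```

Each record becomes its own index document, so the schema stays at three fields no matter how many of the 14,000+ ETIM attributes a product uses; queries then filter on `attribute_name`/`attribute_value` and join on `product_id` in the application.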
- gpaglia (Copper Contributor): Not related to RAG, but are there any plans to add function calling to Azure OpenAI?