Azure Cognitive Search AMA: Vector search, Azure OpenAI Service, generative apps, plugins & more
Event Ended
Tuesday, Jul 25, 2023, 09:00 AM PDT
In this session we’ll answer questions about the emerging Retrieval-Augmented Generation pattern and how you can use Azure OpenAI service and Azure Cognitive Search to implement it today in your appl...
EricStarker
Updated Jul 25, 2023
Barry Briggs
Jul 25, 2023, Copper Contributor
When using Azure OpenAI on your data, do the search terms retrieved from Azure Cognitive Search count toward the number of tokens and thus to the charge for Azure OpenAI? If so, is there any way for customers to predict or estimate the costs of using Azure OpenAI on your data?
- Barry Briggs, Jul 25, 2023, Copper Contributor: Super helpful, thanks!
- robertlee-msft, Jul 25, 2023
Microsoft
Adding to Liam's answer: if you are only using vector search to retrieve documents and not passing those documents to an Azure OpenAI prompt, then vector search by itself will not result in any calls to Azure OpenAI. In other words, once you've generated the vector embeddings for your content (which consumes tokens on the embedding model/endpoint), there are no further costs beyond the search service itself. Additional costs only come into play when you pass the retrieved documents into a prompt.
- liamca-msft, Jul 25, 2023
Microsoft
Hi Barry, yes, whatever is retrieved from Azure Cognitive Search and injected into the Azure OpenAI prompt will count towards the token count (and thus the cost to execute that Azure OpenAI request). Pablo created a really good blog post that goes into more detail on this. This is ultimately why the relevance of Azure Cognitive Search is so important, and why we spend a lot of our time making sure we can get the best results into the top 3-5 to help minimize the number of tokens required. As for the cost, you will typically be "chunking" your data, which should give you a good idea of the cost of any Azure OpenAI request: take the typical number of tokens per chunk and multiply it by the number of results you add to the prompt.
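As a rough illustration of the "tokens per chunk x number of results" estimate above, here is a minimal Python sketch. The function name, the `overhead_tokens` parameter (system message plus user question), and the price are all hypothetical placeholders, not real Azure OpenAI figures; substitute the current rate for your model and your own measured chunk sizes. It only covers prompt tokens, so completion tokens would be billed on top of this.

```python
from dataclasses import dataclass


@dataclass
class PromptEstimate:
    retrieved_tokens: int  # tokens contributed by the search results
    prompt_tokens: int     # retrieved tokens plus fixed prompt overhead
    cost: float            # estimated prompt cost, in the same currency as the price


def estimate_prompt(tokens_per_chunk: int,
                    top_k: int,
                    overhead_tokens: int,
                    price_per_1k_tokens: float) -> PromptEstimate:
    """Estimate prompt size and cost for one request.

    tokens_per_chunk:    typical token count of one indexed chunk
    top_k:               number of search results injected into the prompt
    overhead_tokens:     assumed tokens for system message and user question
    price_per_1k_tokens: placeholder price; look up the real rate for your model
    """
    retrieved = tokens_per_chunk * top_k
    total = retrieved + overhead_tokens
    return PromptEstimate(
        retrieved_tokens=retrieved,
        prompt_tokens=total,
        cost=total / 1000 * price_per_1k_tokens,
    )


if __name__ == "__main__":
    # Illustrative numbers only: 500-token chunks, top 5 results.
    estimate = estimate_prompt(tokens_per_chunk=500, top_k=5,
                               overhead_tokens=200, price_per_1k_tokens=0.002)
    print(estimate.retrieved_tokens, estimate.prompt_tokens)  # 2500 2700
    print(f"{estimate.cost:.4f}")
```

Dropping `top_k` from 5 to 3 in this sketch cuts the retrieved tokens from 2500 to 1500, which is the reason result relevance matters so much for cost.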
- Barry Briggs, Jul 25, 2023, Copper Contributor: The link to Pablo's post seems to not be right?
- EricStarker, Jul 25, 2023, Gold Contributor: Hello, it looks like there was a misprint in the URL, which I've since fixed. Please try it again and let me know if it works for you.