Event banner
Azure Cognitive Search AMA: Vector search, Azure OpenAI Service, generative apps, plugins & more
Event details
In this session we’ll answer questions about the emerging Retrieval-Augmented Generation (RAG) pattern and how you can use Azure OpenAI Service and Azure Cognitive Search to implement it in your applications today to power ChatGPT-like experiences, generative scenarios, and more. Bring your questions about vector search in Azure Cognitive Search, which is coming to public preview soon, as well as questions about implementation details, data preparation, integration with large language models, and anything else related to Azure Cognitive Search.
An AMA is a live text-based online event similar to an “Ask Me Anything” on Reddit. This AMA gives you the opportunity to connect with Microsoft product experts who will be on hand to answer your questions and listen to feedback.
Feel free to post your questions in the comments below ahead of time if that fits your schedule or time zone better, though questions will not be answered until the live hour.
116 Comments
- jankruse (Occasional Reader): In your blog (https://techcommunity.microsoft.com/t5/azure-ai-services-blog/announcing-vector-search-in-azure-cognitive-search-public/ba-p/3872868) you refer to vector search (using embeddings) for audio. Which services/APIs are provided for that?
- liamca-msft (Microsoft)
That somewhat depends on what the use case is. For example, a very common one is to be able to convert audio to text (e.g. transcription) which is then searchable. For this there is Azure Speech Services (https://learn.microsoft.com/en-us/azure/ai-services/speech-service/overview) as well as the introduction of OpenAI Whisper (https://techcommunity.microsoft.com/t5/azure-ai-services-blog/openai-whisper-is-coming-soon-to-azure-openai-service-and-azure/ba-p/3876671). Once it is in text format you can then use typical text models such as Azure OpenAI Ada 002. There are also numerous models that can be used to create embeddings based on other audio types.
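To make the transcribe-then-embed flow concrete, here is a minimal sketch. `transcribe` and `embed` are hypothetical placeholders for an Azure Speech Services transcription call and an embedding model such as Ada 002 (they are not real API signatures); the chunking helper is real code:

```python
# Sketch of an audio-to-searchable-text pipeline:
# audio -> transcript -> word-bounded chunks -> one embedding per chunk.

def chunk_text(text: str, max_words: int = 100) -> list[str]:
    """Split a transcript into word-bounded chunks suitable for embedding."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def transcribe(audio_path: str) -> str:
    # Placeholder: call a speech-to-text service (e.g. Azure Speech Services).
    raise NotImplementedError

def embed(chunk: str) -> list[float]:
    # Placeholder: call a text-embedding model (e.g. Ada 002).
    raise NotImplementedError

def index_audio(audio_path: str) -> list[dict]:
    """Produce index-ready documents: text chunk plus its vector."""
    transcript = transcribe(audio_path)
    return [{"content": c, "vector": embed(c)} for c in chunk_text(transcript)]
```

Once the transcript chunks are embedded, the resulting documents can be indexed and searched like any other text-plus-vector content.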
- Adam Koch (Copper Contributor)
Introduction: This is Adam Koch and Todd Meinershagen from Paycor. We are in technical roles working to help deliver some prototype AI features into our application suite.
Some topics on our team's mind in preparation for the live session:
- Multi-tenancy 1: Are there any formal recommendations on having multi-tenant Cognitive Search-LLM via Azure AI Studio (beyond having a full instance per tenant)?
- Multi-tenancy 2: We are proofing the idea of having multiple indexes in a single cognitive search resource - one for each of our customers. We would then have a single LLM that would process the prompt along with the results of the particular index search based on the customer. Are there any limits to the number of indexes within one Search resource? Are there any risks or challenges we should be aware of in using this approach?
- In all of the samples, the pattern leverages a Blob Container with documents that are indexed, with the index set up automatically by the OpenAI Studio. We are wondering how we would do that from a straight code/automation perspective. What commands/SDKs do we use to create a new index for a Blob Container that pulls out the correct 5 pieces of metadata?
- Since Azure Cognitive Search can handle databases and json data - Does Search + Azure OpenAI also support pure data from Sql Server or json documents? Or are documents (Word, PDF, etc.) the only items supported in that scenario?
- What is the difference between the regular search and the higher priced semantic search with regards to the RAG pattern?
- robertlee-msft (Microsoft)
I can help with #5.
What is the difference between the regular search and the higher priced semantic search with regards to the RAG pattern?
RAG pattern consists broadly of two steps:
[The summary of the RAG pattern below (numbered points 1 and 2) is sourced from https://vitalflux.com/retrieval-augmented-generation-rag-llm-examples/]
1. Retrieval Phase: Given an input query (like a question), the RAG system first retrieves relevant documents or passages from a large corpus using a retriever. This is often done using efficient dense vector space methods, like Dense Passage Retrieval (DPR), which embeds both the query and documents into a continuous vector space and retrieves documents based on distance metrics.
2. Generation Phase: Once the top-k relevant documents or passages are retrieved, they are fed into a sequence-to-sequence generator along with the original query. The generator is then responsible for producing the desired output (like an answer to the question) using both the query and the retrieved passages as context.
During the retrieval phase, the candidate documents that are returned will directly affect the generation phase, as the quality of that phase will only be as good as the input documents and the completion model.
As a result, it is beneficial to improve the quality of the retrieval phase, and this is where vector search and/or semantic search can improve on the RAG pattern. Both features (used together or individually) can return more semantically relevant information than traditional keyword search, because they search based on the meaning of the query and the candidate documents rather than on keyword matches and signals such as term frequency, document length, and term saturation that TF-IDF and BM25 keyword-search techniques rely on.
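The retrieval phase described above can be sketched in a few lines: embed the query and documents, rank by cosine similarity, and keep the top-k as context for the generation phase. The 2-d vectors here are hand-made stand-ins for real embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve_top_k(query_vec: list[float], doc_vecs: dict, k: int = 2) -> list[str]:
    """Rank document IDs by similarity to the query and keep the top k."""
    ranked = sorted(doc_vecs,
                    key=lambda d: cosine_similarity(query_vec, doc_vecs[d]),
                    reverse=True)
    return ranked[:k]

# Toy "embeddings": doc_a and doc_c point roughly the same way as the
# query, doc_b is orthogonal to it (semantically unrelated).
docs = {"doc_a": [1.0, 0.1], "doc_b": [0.0, 1.0], "doc_c": [0.9, 0.2]}
top = retrieve_top_k([1.0, 0.0], docs, k=2)  # -> ["doc_a", "doc_c"]
```

The top-k passages would then be concatenated into the prompt for the generation phase, which is why retrieval quality directly bounds answer quality.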
- EricStarker (Former Employee): As a heads-up, we will have an official Microsoft response to this question during the event!
- Thomas Brown (Copper Contributor): The multi-tenancy scenario is one my group is building for as well. Perhaps a future blog or Channel 9 or ...
- bfglawrence (Copper Contributor)
For multi-tenancy, Azure Cognitive Search has a few common patterns for modeling a multitenant scenario:
- One index per tenant: each tenant has its own index within a search service that is shared with other tenants.
- One service per tenant: each tenant has its own dedicated Azure Cognitive Search service, offering the highest level of data and workload separation.
Regarding multiple indexes in one resource, Azure Cognitive Search can import, analyze, and index data from multiple data sources into a single consolidated search index. You can use multiple indexers in Azure Cognitive Search to create a single search index from files in Blob storage, with additional file metadata in Table storage. You can also configure an indexer that imports content from Azure Blob Storage and makes it searchable in Azure Cognitive Search.
To create an index for a Blob Container that pulls out the correct 5 pieces of metadata, you can use the deploy-index.json file from the samples, which defines the structure of the search index. It includes the typical information from Blob Storage (the content as well as file name, full path, file size, etc.).
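As a rough, hypothetical sketch of what such an index definition looks like (the index name and exact field list are illustrative; the `metadata_storage_*` names are the standard properties the blob indexer surfaces), created with a PUT to the service's `indexes` REST endpoint:

```json
{
  "name": "docs-index",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true, "filterable": true },
    { "name": "content", "type": "Edm.String", "searchable": true },
    { "name": "metadata_storage_name", "type": "Edm.String", "filterable": true },
    { "name": "metadata_storage_path", "type": "Edm.String", "filterable": true },
    { "name": "metadata_storage_size", "type": "Edm.Int64", "filterable": true },
    { "name": "metadata_storage_last_modified", "type": "Edm.DateTimeOffset", "filterable": true, "sortable": true }
  ]
}
```

A data source pointing at the blob container and an indexer tying the two together are created the same way, via the corresponding `datasources` and `indexers` endpoints or the equivalent SDK clients.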
Azure Cognitive Search supports pure data from SQL Server or JSON documents. It also supports documents (Word, PDF, etc.).
The difference between regular search and semantic search with regards to the RAG pattern is that semantic search uses natural language processing (NLP) to understand the meaning behind words and phrases. It can identify synonyms and related concepts to expand queries and improve relevance. Regular search uses keyword matching to find relevant documents.
- AdrianHills (Copper Contributor)
Looking forward to this session. I have a few questions that it would be great to have covered:
- I'm using the RAG pattern currently with ACS semantic search to find relevant content to include in chat prompts. What is the difference that vector search will bring and when would you choose one over the other?
- With vector search, will vectorisation of content need to be done externally before passing in to ACS?
- Are there/will there be updated samples to show the power of vector search in ACS?
- CosmosDB for MongoDB Core offers vector search - any guidance on when you might choose ACS vs CosmosDB for that capability?
Thanks in advance!
- bfglawrence (Copper Contributor)
Vector search is a new feature in Azure Cognitive Search that is currently in public preview. It is designed to provide more advanced search capabilities by using vectorization techniques to represent documents and queries as vectors in a high-dimensional space. This allows for more efficient and accurate matching of similar documents and queries based on their semantic meaning and context. Vector search can be used in conjunction with the RAG pattern to provide more advanced question-answering capabilities and improve the accuracy of search results.
With vector search, you would need to perform vectorization of content externally before passing it into Azure Cognitive Search. This can be done using various techniques, such as word embeddings or deep learning models, depending on the specific requirements of your scenario.
There are updated samples available that demonstrate the power of vector search in Azure Cognitive Search. You can refer to the Azure Cognitive Search documentation and the SDK documentation for code samples and guidance on vector search implementation and management.
Regarding CosmosDB for MongoDB Core offering vector search, it's important to evaluate the specific requirements and constraints of your scenario to determine which solution is best suited for your needs. Azure Cognitive Search provides a fully managed, scalable, and flexible search service that can handle various types of data and workloads, and it integrates seamlessly with other Azure services, such as Azure OpenAI Service, to provide more advanced natural language processing capabilities. CosmosDB for MongoDB Core provides a fully managed NoSQL database service with built-in support for MongoDB APIs and features, such as sharding and replication.
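To illustrate "vectorize externally, then upload": a minimal sketch assuming a vector-enabled index with fields `id`, `content`, and `contentVector` (the field names are illustrative, not a fixed schema). `embed` is a placeholder for a real embedding model, e.g. Azure OpenAI text-embedding-ada-002, which returns 1536-dimensional vectors; a toy 3-dimensional vector keeps the sketch self-contained:

```python
# Build index-ready documents whose vector field was computed externally,
# since the service does not (in this preview) embed content for you.

def embed(text: str) -> list[float]:
    # Placeholder embedding: NOT semantically meaningful, just shape-correct.
    return [len(text) / 100.0, text.count(" ") / 10.0, 1.0]

def to_search_document(doc_id: str, content: str) -> dict:
    """One document in the shape the index's upload API would accept."""
    return {"id": doc_id, "content": content, "contentVector": embed(content)}

docs = [
    to_search_document("1", "Quarterly revenue report"),
    to_search_document("2", "Employee onboarding checklist"),
]
# `docs` is what you would pass to the index's document-upload API.
```

At query time the same embedding model must be used on the query text, so that query and document vectors live in the same space.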
- liamca-msft (Microsoft)
Thanks Blair for the great response! I would like to add that Azure Cognitive Search not only offers vector search but also hybrid search, which leverages scores from traditional text search as well as vectors, and which we (and much of the industry research) have found to offer more effective relevance than vector search alone. In addition, when you then add our semantic search (which is a reranking layer), we find this typically offers the most effective relevance, which is incredibly important, especially when building enterprise ChatGPT apps. We are working on a blog post on the effectiveness of this, so please keep an eye out here over the next few weeks.
- Faddykenny_09 (Copper Contributor): Looking forward to the event. It will be a great opening for the community.
- ivanatilca (Brass Contributor): Hello team, is this the event we have scheduled for today? Because I still see the PGI in my calendar. Thanks!
- EricStarker (Former Employee): Hello! I'm afraid I'm not aware of what event you have scheduled for today, but this is an event happening on July 25 at 9AM PT, so I imagine not.
- ivanatilca (Brass Contributor): So sorry, we had a similar session in the Microsoft MVP team. I got confused. Thanks!
- EricStarker (Former Employee): Please note the date change to July 25! Thank you.
- ampm99 (Copper Contributor): Same time, 9AM Pacific, on July 25?
- Laziz_Turakulov (Microsoft)
Yes, please see the calendar invite attachment above.
- UUP2020 (Copper Contributor): Will there be an event video posted for those who prefer watching it after?
- EricStarker (Former Employee): Hello! Since there is no video content here - it's a purely text-based event - that wouldn't be possible. But you'll be able to see the questions and answers - which will be in text form - at any time. We'll also post a summary of the questions and answers after the event.
- HaraldG (Copper Contributor): What information is stored within the Azure OpenAI Service? If a user asks a question or provides information that includes PII data or special-category data that falls under GDPR regulations, how is that data handled? Is the data used to train the system? Is there any way this type of data could be redacted when stored?
- DerekLegenzoff (Former Employee)
Hi Harald - please take a look at this link for more details on data, privacy, and security with the Azure OpenAI Service.
- Rick_Kotlarz (Former Employee)
Unlike the ChatGPT website, data that is sent to the Azure OpenAI API endpoints is not (by default) used for Reinforcement Learning from Human Feedback (RLHF) and as such does not get added back to the GPT foundational models.
To use Azure OpenAI, you first need to create that resource within the Cognitive Services blade along with selecting the region. Data processed in that region meets the same regulatory requirements as other services within that region. Beyond this scaffolding, it's up to the resource owners to build services that meet applicable regulatory requirements. Azure also provides the same type of logs for Azure OpenAI as it does other services which leverage Azure Monitor.
Reference URLs:
How your data is used to improve model performance | OpenAI Help Center
Azure Cognitive Services security - Azure Cognitive Services | Microsoft Learn
Monitoring Azure OpenAI Service - Azure Cognitive Services | Microsoft Learn
- Pat Beahan (Brass Contributor): And note the new OpenAI Landing Zone reference architecture was just published - https://techcommunity.microsoft.com/t5/azure-architecture-blog/azure-openai-landing-zone-reference-architecture/ba-p/3882102
- HaraldG (Copper Contributor): Currently the Azure OpenAI Service is not available in a UK region. When will it become available in a UK region?
- Laziz_Turakulov (Microsoft)
It's available now. You can deploy gpt-35-turbo in "UK South" Azure region as of yesterday: https://learn.microsoft.com/en-us/azure/cognitive-services/openai/concepts/models#gpt-3-models-1.