RAG with Azure AI: why your retrieval strategy matters AMA

63 Comments

gyangupta
Copper Contributor
Feb 14, 2024
What are the different Vector Databases available in the Azure platform?
- TyBecker
  Microsoft
  Feb 14, 2024
  (in no particular order) 1. Azure Postgres Flexible Server (it has pgvector and azure_openai extensions to support) 2. Azure Cosmos DB Postgres and Mongo vCore 3. Azure AI Search (has robust semantic search that can be used in conjunction with vector search, unless your data already exists in one of the above databases this is the most common use pattern)
- gyangupta
  Copper Contributor
  Feb 14, 2024
  Along with LLM model size, what is the limitation for Vector DB size to get optimal performance?
  - gia_mondragon
    Microsoft
    Feb 14, 2024
    The vector size limits in AI Search can be found here: https://learn.microsoft.com/en-us/azure/search/vector-search-index-size. Limits are set based not only on technical limitations (depending on the limit) but also based on performance testing. However, to obtain optimal retrieval performance from your RAG app, you can take a look at best practices listed here: https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/azure-ai-search-outperforming-vector-search-with-hybrid/ba-p/3929167
JustinR296
Occasional Reader
Feb 14, 2024
Hello This is Justin joining from Texas! In the Azure Open AI studio, are there plans to add functionality to use more than one data source type. For example, users can use URL and Blob storage?
Senthil Gopal
Copper Contributor
Feb 14, 2024
How to join the event?
- Kumar Chinnakali
  Copper Contributor
  Feb 14, 2024
  It's chat live event, https://techcommunity.microsoft.com/t5/artificial-intelligence-machine/rag-with-azure-ai-why-your-retrieval-strategy-matters-ama/ec-p/4043897#M250
- EricStarker
  Former Employee
  Feb 14, 2024
  Hello! You've already in the event. The event happens in the comments here, all text-based. Sorry if that wasn't clear! Feel free to post any questions as comments to this event.
  - Senthil Gopal
    Copper Contributor
    Feb 14, 2024
    Thanks Eric. I was posted my question.
EricStarker
Former Employee
Feb 14, 2024
Welcome to the RAG with Azure AI: why your retrieval strategy matters Ask Microsoft Anything!

This live hour gives you the opportunity to ask questions and provide feedback directly to the Azure AI Search team.

Please post any questions in a separate, new comment thread.

To start, introduce yourself below and tell us where you're logging in from!
- Cacrowley
  Occasional Reader
  Feb 14, 2024
  Amanda Crowley South Carolina
- Abhilash_G_R
  Occasional Reader
  Feb 14, 2024
  Hello Eric! I am Abhilash GR! Greetings of the day. I am from Bengaluru, India.
Steve Jones
Copper Contributor
Feb 14, 2024
How would we get started with a public set of data, say on a public website, as opposed to building something that might be private/semi private with data for authorized users? (internal or customers)
- fsunavala-msft
  Microsoft
  Feb 14, 2024
  Here's a high-level flow you can follow to get started:
  
  Identify the public data source: Identify the public website or dataset that you intend to use. Ensure the data is publicly accessible and adheres to legal and ethical guidelines regarding its use.
  
  Data Acquisition: If the data is on a website, you might need to use web scraping techniques to extract the data. Alternatively, if the website offers an API, you can use it to fetch the data more efficiently in a structured format. Additionally, if you are using a dataset from a catalog like Hugging Face, they have clear instructions for downloading the data.
  
  Prepare the data: It’s a good practice to clean and preprocess the data to ensure it is in a suitable format for indexing. This usually includes removing any unnecessary info, converting data formats, whitespace cleanup, etc. Then, you’d want you to define a schema for you Azure AI Search index that matches the structure of your data. This includes specifying fields and their data types, as well as configuring any searchable, filterable sortable, or facetable attributes.
  
  Indexing the Data: You can use the Azure Portal, Azure CLI, or Client SDKs, to create a new search index based on the schema you defined. You can ingest data into Azure AI search in two ways: Push API or Pull via Indexer. Data import and data ingestion - Azure AI Search | Microsoft Learn
  
  Query Your Index: You can now begin to create search experiences by searching your AI Search index. You can do simple keyword full-text search queries, vector search queries, or hybrid search queries. Search Documents - Azure AI Search | Microsoft Learn
  
  Access Control and Privacy for Public Data: Since the data is public, you might not need stringent access controls but if you need them, you can implement RBAC and leverage security filter trimming. Security overview - Azure AI Search | Microsoft Learn and Security filters to trim results using MIcrosoft Entra ID - Azure AI Search | Microsoft Learn
  
  Hope this helps!
Cacrowley
Occasional Reader
Feb 14, 2024
Don't have a PC anymore don't think I could be much assistance to this event thanks Amanda Crowley
CPS
Occasional Reader
Feb 14, 2024
Context:
The intention is to leverage the Azure OpenAI Chat
With the following Properties for the deployment:
Model name: gpt-4-32k
Model version: 0613
Version update policy: Once a new default version is available.
Deployment type: Standard
Content Filter: Default
Tokens per Minute Rate Limit (thousands): 30
Rate limit (Tokens per minute): 30000
Rate limit (Requests per minute): 180
We configured a data source that is based on structured data (Azure Search Service with an Index that has Semantic Search configured). In our case is a list and the corresponding details for Micro Credentials offered by Higher Education Institutions. The dataset we tested is not large, about 2000 records and about 3 MB of data in total.

Questions:

Q1: We need to have one source with structured data and one that is a BLOB Storage with PDF files. The PDF files are meant to offer guidelines to the Azure OpenAI Chat. How can we add more than one data source?

Q2: How to get around the way some of the responses are formulated, often the response starts with “Based on the retrieved documents, the institutions that ….” Ideally will be to say “Based on my knowledge base, the institutions that…”

Q3: We run into functionality issues for basic questions (see screenshot) where Azure OpenAI is not able to retrieve a complete list even though is not an extensive one even though the data source was set to not have data content limits. NOTE: in the OpenAI custom ChatGPT the results returned are correct.

Q4: All the responses to questions that require some analytics (nothing complicated just Counts) are returning incorrect results. NOTE: in the OpenAI custom ChatGPT the results returned are correct.

Q5: One of our requirements is to allow a user to upload a file as part of their request (in our case the user will upload a brief resume file and the Azure OpenAI Chat is expected to quickly analyze it and return a relevant list of Micro Credentials). NOTE: this functionality is available in the OpenAI custom ChatGPT.

Q6: How can we get around quota limitations in Azure OpenAI Service?

Q7: Are there any limitations on Azure Search Service side?

Q8: We were not able to create an Index for an Azure Search Service that relies on JSON files. It gets stuck on the last step when the indexer is created, just displays “Validating” and never gets out from that state.
- gia_mondragon
  Microsoft
  Feb 14, 2024
  Q1: The Azure OpenAI “on your data” feature from the Azure OpenAI PlayGround (https://learn.microsoft.com/en-us/azure/ai-services/openai/use-your-data-quickstart?tabs=command-line%2Cpython&pivots=programming-language-studio) you’re only able for now to add a single data source at a time. However, there are other options to get the data into the AI Search index so you can use the index directly in that feature. From the AI Search end, you could use Integrated vectorization - Azure AI Search | Microsoft Learn to chunk and vectorize files from different Blob containers and use a single index as a target, then you can use the Azure OpenAI on your data feature and use that index accordingly. The number of indexers you can have in a single instance is limited by the SKU you use: https://learn.microsoft.com/en-us/azure/search/search-limits-quotas-capacity Q3: If you’re using Azure OpenAI “on your data” functionality, you can control the number of documents retrieved in the advanced options: Q4: We would need to understand the scenario better, where are you asking the questions (in which console/system)? What kind of questions are you asking? What is in your documents to help the LLM answer the question? This would help us with the next steps to answer this properly. Thanks. Q8: The first run of an indexer may take even multiple hours while running, depending on the size of the documents and the number of the documents in the blob container. If the creation state is what taking long, this may be expected based on that. However, you should be able to start searching your index with the documents already indexed.
  - CPS
    Occasional Reader
    Feb 14, 2024
    Re. Q4, we were asking the questions from the basic "Contoso" chat application generated and deployed by the Studio. Example of question: "How many micro-credentials are available from University of Toronto? The chatbot responds with 5, and we know that there are 210 in the dataset that we indexed. (If we ask the same question in our Custom GPT with the same dataset it responds correctly.) Note that we are using a structured dataset (CSV), not a bunch of loose documents. However, since your examples and documentation are mostly around indexing documents, we even created separate files (one per CSV row) and included a document with statistics about the dataset to try to help it along, but it didn't help.
- danquirk
  Former Employee
  Feb 14, 2024
  Q2: Prompt engineering is the component of Retrieval Augmented Generation with Azure AI Search that gives you the ability to influence the formulation of output responses. The Azure OpenAI Service has content on prompt engineering (ranging from introductory to advanced) to help you with this topic: Azure OpenAI Service - Azure OpenAI | Microsoft Learn
  - allisonsparrow
    Microsoft
    Feb 14, 2024
    + 1 to Dan's comment - Fine-tuning is also a method that's effective in changing the LLMs tone/manner of speaking: "Good use cases for fine-tuning include steering the model to output content in a specific and customized style, tone, or format, or scenarios where the information needed to steer the model is too long or complex to fit into the prompt window." https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/fine-tuning-considerations
- fsunavala-msft
  Microsoft
  Feb 14, 2024
  Q6: Quota limits exist for capacity reasons and to maintain the health of your service. For further information on quota limitations, please visit the Azure OpenAI Service documentation: Azure OpenAI Service quotas and limits - Azure AI services | Microsoft Learn. Additionally, you can find how to manage your quota here: Manage Azure OpenAI Service quota - Azure AI services | Microsoft
  - CPS
    Occasional Reader
    Feb 14, 2024
    Re. Q6, we are hitting the limit with just two human users doing some basic and simple testing in the "Contoso" Chatbot created and deployed by the Studio. The index was created from a 2000 record CSV, i.e. not a big dataset. This would make it very unusable for a production environment accessible to the public, even if it has only a few visitors.
mmcenroemicrosoftcom
Microsoft
Feb 08, 2024
How will we connect to this event, will the link be on this page ?
- EricStarker
  Former Employee
  Feb 08, 2024
  You're already at the right page. There's no additional link to go elsewhere. You'll post your questions here in these comments at any time and the team will answer live during that hour, all text-based. Additionally though, since you are a Microsoft employee, if you'd like me to connect you to their team to ask questions internally, let us know, as this is primarily meant for external Tech Community users to ask questions.
  - KEVINDIBB
    Copper Contributor
    Feb 14, 2024
    I think what he's asking is will there be a link to a Teams meeting or some other link on this page to join the call?
jaymcc510
Iron Contributor
Feb 02, 2024
awesome
AubinBakana
Copper Contributor
Feb 02, 2024
Thank you for the invite.

Event details