Build specialized AI assistants with RAG

Former Employee

Jun 26, 2024

The term retrieval augmented generation, or RAG for short, has emerged as a hot topic in AI discussions. Let's dive into what RAG is and learn how you can leverage it with just a few lines of code or clicks to effortlessly create powerful AI apps.

RAG uses information from additional sources to enhance the output of a large language model (LLM). If you’ve used LLMs like ChatGPT or Microsoft Copilot, you’ve probably gotten your fair share of hallucinations: responses that are wrong, often to the point of silliness. In fact, Microsoft Copilot comes with a warning: “Copilot uses AI. Check for mistakes.” RAG grounds the responses in authoritative external knowledge, making them more accurate and useful.

Understanding RAG

At its core, RAG is a straightforward concept. The technique involves retrieving relevant information from additional data sources and incorporating it into the prompt alongside your query. This grounds the model in your data, resulting in more useful responses. A business might use its product catalog, FAQ, or shipping policy as additional information for a customer service assistant. Here’s an example from the fictional company Fuzzy Friends of Endor. A customer bought one too many Ewogs, but what’s the return policy? Augmented with the company’s FAQs, Fuzzy Friends’ customer support assistant gives the correct answer.

You can upload a document along with your prompt in Microsoft Copilot to get customized responses.
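To make the retrieve-then-augment idea concrete, here is a minimal sketch in plain Python. It is not how Copilot or Azure OpenAI implement retrieval: the keyword-overlap scoring stands in for a real search index, and the FAQ entries, `retrieve`, and `build_prompt` are hypothetical names made up for illustration. Only the return policy wording comes from the example in this post.

```python
import re

# Hypothetical FAQ snippets. The return policy mirrors the sample reply in this post;
# the other two entries are invented filler.
FAQ = [
    "Returns: Fuzzy Friends of Endor offers a full refund or exchange "
    "if you are not satisfied within the first 30 days of adoption.",
    "Shipping: Orders ship within two business days.",
    "Care: Ewogs need daily brushing.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents sharing the most words with the query.

    Real systems typically use a search index or embeddings instead.
    """
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, documents: list[str]) -> str:
    """Augment the user's question with the retrieved passages."""
    context = "\n".join(f"- {passage}" for passage in retrieve(query, documents))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


prompt = build_prompt("Can I return my Ewog for a refund?", FAQ)
print(prompt)
```

The resulting `prompt` string is what you would send to the model in place of the bare question, so the model answers from the retrieved policy text rather than from guesswork.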

Expanding the Horizons of RAG

RAG works with small data sources like the above FAQ, but it can handle large documents just as well. It can work with PDFs, text documents, images, databases, and even web APIs. Fuzzy Friends could ground their assistant with real-time tracking information by giving it access to the USPS tracking API. You can even use an image as a data source for RAG.
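As a rough sketch of mixing a static document with live data, the snippet below treats every source as a function that returns text at request time. The `lookup_tracking` function is a hypothetical stub that returns canned text so the example runs offline; it does not call the USPS API, and the tracking number is made up.

```python
from typing import Callable


def lookup_tracking(tracking_number: str) -> str:
    """Hypothetical stand-in for a carrier API call; returns canned text."""
    return f"Package {tracking_number}: status unknown (stubbed response)"


def build_grounded_prompt(question: str, sources: list[Callable[[], str]]) -> str:
    """Call each source when the prompt is built, then prepend the results as context."""
    context = "\n".join(f"- {source()}" for source in sources)
    return f"Context:\n{context}\n\nQuestion: {question}"


sources = [
    # Static knowledge, such as a line from the FAQ.
    lambda: "Return policy: full refund or exchange within the first 30 days of adoption.",
    # Live knowledge, fetched fresh for every request.
    lambda: lookup_tracking("TRACK-0001"),
]

prompt = build_grounded_prompt("Where is my Ewog?", sources)
print(prompt)
```

Because the sources are called when the prompt is built, the model sees whatever the API returns at that moment rather than a stale copy.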

RAG in Practice

In Microsoft Copilot, you can upload documents alongside your prompt, which in effect builds an instant RAG setup. That is what I used for the examples above. If you’re building AI apps with Azure OpenAI, you can set up RAG in Azure OpenAI Studio: click “Bring your own data” and select your data source in the drop-down menu.

To use this data source for RAG in your app, you need to add a few lines of code. You can find the full Python code for Semantic Kernel, an AI Orchestration SDK, on my GitHub.

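# Import paths are assumed to match the Semantic Kernel Python release this post was written against; they may differ in other versions.
from semantic_kernel.connectors.ai.open_ai import AzureAISearchDataSource, ExtraBody
from semantic_kernel.connectors.memory.azure_cognitive_search.azure_ai_search_settings import AzureAISearchSettings

# Connection details for the Azure AI Search index that holds your data
azure_ai_search_settings = AzureAISearchSettings(
    endpoint="...",
    index_name="...",
    api_key="...",
)

# Wrap the search index as a data source and attach it to the request body
az_source = AzureAISearchDataSource.from_azure_ai_search_settings(azure_ai_search_settings=azure_ai_search_settings)
extra = ExtraBody(data_sources=[az_source])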

Simply adding an AzureAISearchDataSource to the code gives my app almost magical powers. It now knows information specific to its job, like the company's FAQs, which turns it into a helpful customer support assistant.

main $ python fuzzy_friends.py
Welcome to Fuzzy Friends customer support!        
  How may I help you?
User:> Can I return my Ewog?
Fuzzy Friends customer service:> Yes, you can return your Ewog. Fuzzy Friends of Endor offers a 
full refund or exchange if you are not satisfied within the first 30 days of adoption [doc1].
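The reply above ends with [doc1], which appears to be a citation marker pointing at the retrieved document the answer came from. If you want to show sources separately from the answer text, a small helper can pull the markers out. This sketch assumes markers always follow the `[docN]` pattern seen above; `extract_citations` and `strip_citations` are hypothetical helper names, not part of Semantic Kernel.

```python
import re

# The assistant reply from the session above, joined onto one line.
reply = (
    "Yes, you can return your Ewog. Fuzzy Friends of Endor offers a full refund "
    "or exchange if you are not satisfied within the first 30 days of adoption [doc1]."
)


def extract_citations(text: str) -> list[int]:
    """Return the sorted, de-duplicated document numbers cited as [docN]."""
    return sorted({int(number) for number in re.findall(r"\[doc(\d+)\]", text)})


def strip_citations(text: str) -> str:
    """Remove [docN] markers (and the whitespace before them) for display."""
    return re.sub(r"\s*\[doc\d+\]", "", text)


print(extract_citations(reply))
print(strip_citations(reply))
```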

Gone are the days when creating a specialized AI assistant required the expertise of highly trained professionals. Today, with the advent of modern tools, anyone can build one with just a few clicks or lines of code. This democratization of technology is set to expand the horizons of AI's capabilities and revolutionize our relationship with technology. I'm thrilled about the potential changes this shift will bring to our everyday lives. Please share how you use RAG!

Updated Jun 26, 2024


fhinkel
