We’ve seen huge interest from organizations that want to use Azure OpenAI Service to access Large Language Models (LLMs) in combination with their own data. Giving these applications access to your organization’s knowledge base lets them include data relevant to the conversation, creating a richer and more useful experience. However, this introduces new problems if the Generative AI application isn't aware of any access control requirements. We’ve recently updated the Cognitive Search OpenAI Demo to support user login and access control, which enables the Generative AI application to tailor responses on a per-user basis.
Normally, you can just ask ChatGPT general questions and get good responses. However, this approach breaks down when you ask ChatGPT specific questions about your organization. When I ask, “What is included in my Northwind Health Plus plan that is not in standard”, I get a much less useful “I’m sorry, but I don’t have access to your health plans” response. If only there were some way to combine data from your organization with ChatGPT! Fortunately, this is possible using a Retrieval Augmented Generation (RAG) approach.
Cognitive Search is an AI-powered information retrieval platform that allows you to combine LLMs with your organization’s data using the RAG approach. Documents from your organization’s knowledge base are chunked and embedded using an Azure OpenAI embedding model. The embeddings are then indexed in Cognitive Search alongside the document text.
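To make the chunk-and-embed step more concrete, here is a minimal Python sketch. It is not the demo's actual ingestion code: the fixed-size character windows, the overlap, the record field names (id, content, embedding), and the embed callable are all assumptions made for illustration. In a real deployment, embed would call an Azure OpenAI embedding model.

```python
from typing import Callable, Dict, List


def chunk_text(text: str, max_chars: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    step = max_chars - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def build_index_records(
    doc_id: str,
    text: str,
    embed: Callable[[str], List[float]],  # hypothetical embedding function
    max_chars: int = 1000,
    overlap: int = 100,
) -> List[Dict[str, object]]:
    """Pair each chunk with its embedding so both can be indexed together."""
    return [
        {"id": f"{doc_id}-{i}", "content": chunk, "embedding": embed(chunk)}
        for i, chunk in enumerate(chunk_text(text, max_chars, overlap))
    ]
```

Keeping the chunk text next to its embedding in each record is what lets the same index serve both keyword and vector queries.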
Cognitive Search combines multiple search methods to improve your results. Keyword search over the document text allows matching specific terms in your documents. Vector search goes a step further and finds the sections of documents that are semantically similar to your search query. The search results from both steps are combined using a hybrid approach called Reciprocal Rank Fusion (RRF). Finally, semantic ranking leverages the power of machine learning models from Microsoft Bing to further improve these hybrid search results.
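The idea behind Reciprocal Rank Fusion can be sketched in a few lines of Python. This is an illustration of the general technique, not Cognitive Search's internal implementation: each document receives 1 / (k + rank) from every ranked list it appears in, and the summed scores determine the fused order. The constant k = 60 is a commonly cited default and is an assumption here, as are the function and variable names.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document ids into one ranked list."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_results = ["a", "b", "c"]  # hypothetical keyword search ranking
vector_results = ["c", "a", "d"]   # hypothetical vector search ranking
fused = reciprocal_rank_fusion([keyword_results, vector_results])
print(fused)  # ['a', 'c', 'b', 'd']
```

Documents that rank well in both lists ("a" and "c") rise above documents that appear in only one, which is the behavior a hybrid search relies on.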
Not all documents in your knowledge base are meant for public consumption outside of your organization. Sales reports, competitive research, or other sensitive documentation might not even be visible to all members of your organization. If you create a simple application that allows you to chat with all the data in your knowledge base, you might inadvertently be exposing your documents to the wrong audience.
In general, any access control solution requires two components:
Before we dive into the specifics of our solution, it’s important to have a general understanding of how identity and access management works in Azure. One of the main concerns when deploying a cloud-based application is ensuring your users can securely access it. Microsoft Entra ID (formerly Azure Active Directory) is Azure’s identity and access management solution, facilitating secure access to external and internal resources for your organization. Here’s a brief overview of Microsoft Entra ID terminology:
Authentication and authorization in Microsoft Entra ID are implemented using the OAuth 2.0 and OpenID Connect protocols.
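Tokens issued through these protocols are typically JSON Web Tokens (JWTs) whose payload carries claims about the signed-in user. The Python sketch below only decodes the payload so the claims can be inspected. It deliberately performs no signature, issuer, audience, or expiry validation, so it must not be used to make authorization decisions. The function name is hypothetical, and whether a groups claim is present depends on how the app registration is configured.

```python
import base64
import json
from typing import Any, Dict


def decode_jwt_claims(token: str) -> Dict[str, Any]:
    """Return the unverified claims from a JWT's payload segment.

    Illustration only: a real API server must validate the token's
    signature, issuer, audience, and expiry before trusting any claim.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with header.payload.signature")
    payload = parts[1]
    # JWT segments are base64url-encoded with the '=' padding stripped.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

In Microsoft Entra ID tokens, the oid claim identifies the user object, and a groups claim, when configured, lists the ids of the groups the user belongs to. These are the values an application can compare against the access control fields stored with each document.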
Now that we understand the basics of identity and access management, let’s see how we can enhance the security of our Generative AI application.
To explain how the integration works, we'll be using this demo repository, which anyone can deploy as long as they have the following prerequisites:
Without any identity or access management, the demo uses the following architecture:
When we add identity and access management the demo architecture changes:
Let’s walk through exactly how the demo integrates with Microsoft Entra ID.
The steps to set up the demo are documented in the repository. Here’s how the setup steps are used to integrate with Microsoft Entra ID at a high level:
The following diagram illustrates how the single-page application interacts with the API server and integrates with Microsoft Entra ID:
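A filter such as the following matches documents whose groups field contains the group id 'x':
groups/any(g: search.in(g, 'x'))
Filters can be combined using the “or” operator. For example, to match documents where either the user id or the group id is present in the access control fields, the filter would be:
groups/any(g: search.in(g, 'x')) or users/any(g: search.in(g, 'y'))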
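A filter of this shape can be assembled from the signed-in user's identity. The Python sketch below is one possible way to do it, not necessarily how the demo's API server does it. The helper names are hypothetical, and the field names groups and users are taken from the filter expressions in this post. It relies on the default behavior of search.in, which treats commas and spaces as delimiters, so it assumes ids such as GUIDs that contain neither.

```python
from typing import Iterable


def _escape(value: str) -> str:
    """Escape single quotes for an OData string literal by doubling them."""
    return value.replace("'", "''")


def build_security_filter(user_id: str, group_ids: Iterable[str]) -> str:
    """Build an OData filter matching documents the user may see."""
    clauses = []
    groups = [g for g in group_ids if g]
    if groups:
        # search.in accepts a delimited list, so one clause covers all groups.
        joined = ", ".join(_escape(g) for g in groups)
        clauses.append(f"groups/any(g: search.in(g, '{joined}'))")
    if user_id:
        clauses.append(f"users/any(g: search.in(g, '{_escape(user_id)}'))")
    if not clauses:
        # Refuse to fall back to an unfiltered search for an unknown identity.
        raise ValueError("no user or group ids available to build a filter")
    return " or ".join(clauses)


print(build_security_filter("y", ["x"]))
```

Raising an error when no ids are available is a deliberate choice in this sketch: silently issuing an unfiltered query would expose every document in the index.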
The following diagram demonstrates how the API server uses filters to retrieve documents from Cognitive Search that match the permissions of the logged-in user:
Combining Generative AI and access control can unlock a myriad of new use cases that enhance security, compliance, and productivity. We invite you to explore this cutting-edge technology by deploying our sample application.