Restricting Azure Cognitive Search & Azure OpenAI Output with Azure Entra Security Groups
Published Mar 15 2024 04:39 PM
Microsoft


 

Azure Cognitive Search results, and the Azure OpenAI output that is grounded on them, can be restricted with Azure Entra security groups. Organizations can limit what a user gets back from a search index, and therefore what an Azure OpenAI model can answer from, based on the user's group membership. This ensures that users only have access to the data within the scope of their job responsibilities. Because group membership is managed centrally in Azure Entra, the same groups can also be used to authorize access to other Azure services.

 

The Azure OpenAI service is increasingly used to build more interactive and intelligent chatbots. A key use case is having the service respond to user requests using your own data.

 

Why filter search results from Azure Cognitive Search

 

Cognitive Search is a search engine that catalogues all the documents, databases, etc. you provide it. However, there may be situations where you want a single index over large amounts of data, but you don't want every user in a healthcare organization to have access to everything, for example:

 

  • Protected Health Information (PHI) data
  • HR data
  • Classified data

For these situations, you need to trim the search results based on the user's identity. For example, medical professionals such as doctors, nurses, and other health care workers should have access to PHI data, while people who are not involved or not authorized should not see it.

 

Azure Cognitive Search supports this use case with security filters. A security filter is an extra filter expression that you pass with each query so that the results are restricted to only the data the user can access.

There are three steps required to implement security filtering:

 

  • Create an index that includes a field for security filtering (such as Azure Entra security group IDs)
  • Include the Azure Entra security group IDs that are allowed to see the data when each document is first indexed
  • Include the list of Azure Entra security group IDs that the user is a part of so the security filtering can be applied on each query
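
The effect of the three steps above can be pictured with a small in-memory sketch. This is not a call to Azure Cognitive Search; it only mimics the intended outcome, where a document is returned when at least one of its group IDs matches a group the user belongs to. The IsVisibleTo function and the sample data are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for indexed documents: each one carries the
// group IDs that are allowed to see it (step 2).
var docs = new List<(string FileName, string[] GroupIds)>
{
    ("secured_file_a", new[] { "entra_security_group_id1" }),
    ("secured_file_b", new[] { "entra_security_group_id1", "entra_security_group_id2" }),
    ("secured_file_c", new[] { "entra_security_group_id3", "entra_security_group_id5" }),
};

// Mimics the intent of a security filter: a document is visible when
// at least one of its group IDs is among the user's group IDs.
static bool IsVisibleTo(string[] docGroupIds, ISet<string> userGroupIds) =>
    docGroupIds.Any(g => userGroupIds.Contains(g));

// Step 3: the caller's group memberships are supplied with each query.
var userGroups = new HashSet<string> { "entra_security_group_id2" };

var visible = docs
    .Where(d => IsVisibleTo(d.GroupIds, userGroups))
    .Select(d => d.FileName)
    .ToList();

Console.WriteLine(string.Join(", ", visible)); // prints: secured_file_b
```

A user who is only in entra_security_group_id2 sees secured_file_b and nothing else. In the real service this check runs inside the search engine, driven by the filter expression sent with the query.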

Create an index that includes a field for security filtering

 

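To use security filtering, the Cognitive Search index must include a field that holds the group IDs allowed to see each document. This field should be filterable and not retrievable, so it can be used in filter expressions but is never returned in search results.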

 

 Example REST API call

 

POST https://[search service].search.windows.net/indexes?api-version=2023-10-01-preview

{
     "name": "securedfiles",
     "fields": [
         {"name": "file_id", "type": "Edm.String", "key": true, "searchable": false },
         {"name": "file_name", "type": "Edm.String", "searchable": true },
         ...
         {"name": "group_ids", "type": "Collection(Edm.String)", "filterable": true, "retrievable": false }
     ]
}

 

Example C#

 

var index = new SearchIndex(options.SearchIndexName)
{
    Fields =
    {
        new SimpleField("file_id", SearchFieldDataType.String) { IsKey = true, ... },
        new SimpleField("file_name", SearchFieldDataType.String) { ... },
        ...
        // Security field: usable in filters, never returned in results.
        new SimpleField("group_ids", SearchFieldDataType.Collection(SearchFieldDataType.String))
            { IsFilterable = true, IsHidden = true },
    },
    ...
};
await indexClient.CreateIndexAsync(index);

 

Include which Azure Entra security group IDs are allowed to see the data on initial index of each document

 

Each time a new document is uploaded and indexed, you need to include the list of Azure Entra security group IDs that are allowed to have this document in their search results. These Azure Entra security group IDs are GUIDs.

 

Example REST API call

 

POST https://[search service].search.windows.net/indexes/securedfiles/docs/index?api-version=2023-10-01-preview

{
    "value": [
        {
            "@search.action": "upload",
            "file_id": "1",
            "file_name": "secured_file_a",
            "file_description": "File access is restricted to medical professionals, such as doctors and nurses",
            "group_ids": ["entra_security_group_id1"]
        },
        {
            "@search.action": "upload",
            "file_id": "2",
            "file_name": "secured_file_b",
            "file_description": "File access is restricted to medical professionals, such as doctors, nurses, and other health care workers.",
            "group_ids": ["entra_security_group_id1", "entra_security_group_id2"]
        },
        {
            "@search.action": "upload",
            "file_id": "3",
            "file_name": "secured_file_c",
            "file_description": "File access is restricted to third parties and law enforcement",
            "group_ids": ["entra_security_group_id3", "entra_security_group_id5"]
        }
    ]
}

 

Example C#

 

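var searchClient = await GetSearchClientAsync(options);
var batch = new IndexDocumentsBatch<SearchDocument>();

foreach (var section in sections)
{
    // Each document carries the group IDs that are allowed to see it.
    batch.Actions.Add(new IndexDocumentsAction<SearchDocument>(
        IndexActionType.MergeOrUpload,
        new SearchDocument
        {
            ["file_id"] = section.Id,
            ["file_name"] = section.SourceFile,
            ["group_ids"] = section.GroupIds
        }
    ));
}

// Send the batch once all actions have been added. A single request accepts
// at most 1,000 actions, so larger document sets need to be split into chunks.
IndexDocumentsResult result = await searchClient.IndexDocumentsAsync(batch);
...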
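
Filter comparisons on string fields are exact and case-sensitive, so a stray space or inconsistent GUID casing in group_ids can silently hide a document from users who should see it. The sketch below shows one possible way to clean the IDs before they are indexed. NormalizeGroupIds is a hypothetical helper, not part of the Azure SDK, and dropping invalid values (rather than failing the upload) is an assumption you may want to change.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: trims each ID, keeps only values that parse as GUIDs,
// formats them as lowercase hyphenated strings, and removes duplicates.
static string[] NormalizeGroupIds(IEnumerable<string> rawIds)
{
    var result = new List<string>();
    foreach (var raw in rawIds)
    {
        if (Guid.TryParse(raw?.Trim(), out var guid))
        {
            var id = guid.ToString("D"); // e.g. 11111111-2222-3333-4444-555555555555
            if (!result.Contains(id))
            {
                result.Add(id);
            }
        }
    }
    return result.ToArray();
}

var normalized = NormalizeGroupIds(new[]
{
    " 11111111-2222-3333-4444-555555555555", // leading space
    "11111111-2222-3333-4444-555555555555",  // duplicate of the first
    "AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE",  // uppercase
    "not-a-guid",                            // dropped
});

Console.WriteLine(string.Join(", ", normalized));
// prints: 11111111-2222-3333-4444-555555555555, aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
```

The returned array could then be assigned to the group_ids field of each document. Whatever format you choose at index time has to be used again when building the query filter.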

 

Include the Azure Entra security group IDs that the user belongs to on each query

 

For every query, pass the Azure Entra security group IDs that the user belongs to (those that are relevant to this application) as an OData filter expression. The filter keeps only documents whose group_ids field contains at least one of the user's group IDs.

 

Example REST API call

 

POST https://[service name].search.windows.net/indexes/securedfiles/docs/search?api-version=2023-10-01-preview
Content-Type: application/json
api-key: [admin or query key]

{
   "filter": "group_ids/any(g:search.in(g, 'entra_security_group_id1, entra_security_group_id2'))"
}

 

Example C#

 

...
// Collect the user's group IDs from the "groups" claims and join them
// into the value list expected by search.in.
var groupIds = user.Claims.Where(x => x.Type == "groups").Select(x => x.Value);
var filter = $"group_ids/any(g:search.in(g, '{string.Join(", ", groupIds)}'))";

SearchOptions searchOption = new SearchOptions
{
    Filter = filter,
    QueryType = SearchQueryType.Semantic,
    QueryLanguage = "en-us",
    QuerySpeller = "lexicon",
    SemanticConfigurationName = "default",
    Size = top,
    QueryCaption = useSemanticCaptions ? QueryCaptionType.Extractive : QueryCaptionType.None,
};

var searchResultResponse = await searchClient.SearchAsync<SearchDocument>(query, searchOption, cancellationToken);
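
The C# example above interpolates claim values straight into the filter string. As a more defensive variant, the sketch below only accepts values that parse as GUIDs, so a value containing a quote cannot alter the filter, and it reports when no usable group ID is left so the caller can skip the query instead of running it unfiltered. TryBuildGroupFilter is a hypothetical helper, and it assumes group_ids were indexed as lowercase hyphenated GUIDs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds the security filter from group IDs.
// Returns false when no value is a valid GUID, so the caller can return
// no results rather than send a query without a security filter.
static bool TryBuildGroupFilter(IEnumerable<string> groupIds, out string filter)
{
    var valid = new List<string>();
    foreach (var id in groupIds)
    {
        // Assumption: indexed IDs use the lowercase hyphenated "D" format.
        if (Guid.TryParse(id, out var guid))
        {
            valid.Add(guid.ToString("D"));
        }
    }

    if (valid.Count == 0)
    {
        filter = string.Empty;
        return false;
    }

    filter = $"group_ids/any(g:search.in(g, '{string.Join(", ", valid)}'))";
    return true;
}

var claimValues = new[]
{
    "11111111-2222-3333-4444-555555555555",
    "x') or search.in(g, 'y", // malformed value, rejected
};

if (TryBuildGroupFilter(claimValues, out var safeFilter))
{
    Console.WriteLine(safeFilter);
    // prints: group_ids/any(g:search.in(g, '11111111-2222-3333-4444-555555555555'))
}
```

The resulting string can be assigned to SearchOptions.Filter in place of the interpolated one. When the helper returns false, the safest choice is usually to return an empty result set to the user.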

 

My GitHub repository contains an example implementation (with security filtering using Azure Entra security groups).
