This is a step-by-step walkthrough of setting up the open-source Microsoft Purview Data Loss Prevention (DLP) incident management solution for Microsoft Sentinel. Three years ago, we presented the initial version of the connector.
My colleague @Alex_Anders has added slipstreamed deployment and updated the store code to use Data Collection Rules. Another great addition is a Log Analytics function that helps with normalization of the queries. We have improved the code to make use of new and enhanced Sentinel features.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
The Data Collection Rule (DCR) integration makes use of the following components.
New Built-In Features
- Fully packaged into a single ARM/Bicep deployment for easy installation and setup.
- Leverages new Data Collection Rule (DCR) custom tables to ingest and store DLP data. This provides finer-grained security and unlocks new capabilities such as data transformations.
- Provides option to hash, remove, or retain the detected sensitive information values.
- Includes "PurviewDLP" Azure Monitor Function to normalize the DLP event data across all of the different workload types (Endpoint, Teams/Exchange, and SharePoint/OneDrive).
- Separates DLP data into the three tables listed below so that all sensitive information data can be ingested (some events would exceed the maximum field size if everything were stored in a single row). This also allows for more flexible queries and for restricting access to the sensitive information data if desired.
- PurviewDLP: Core DLP event information, including violating user, impacted files/messages, etc.
- PurviewDLPSIT: Contains the sensitive information types that were detected.
- PurviewDLPDetections: Contains the sensitive information type detected values (evidence).
- For Endpoint DLP events, the severity of the alert/event is not currently included in the API, so by default the severity is derived from the DLP policy rule name. The rule name must have a "Low", "Medium", or "High" suffix value with a space as the delimiter. For example, "DLP rule name Medium" or "DLP rule name High".
- Includes three built-in Sentinel workbooks to provide advanced incident management and reporting:
- Microsoft DLP Incident Management
- Microsoft DLP Activity
- Microsoft DLP Organizational Context
- Includes two options for automatically deploying the built-in Sentinel analytics rules.
- A single rule to create alerts and incidents across all DLP workload types. This will work for most environments where the 150 events per 5 min. limit is not being exceeded.
- A rule for each Purview DLP policy and workload (DLP Policy Sync). This is to be used in scenarios where the 150 events per 5 min. limit is being exceeded or where more customization is desired based on workload.
- The syncing of the sensitivity label information and analytics rules now uses modern authentication mechanisms.
- Better error handling has been introduced to the code along with a more hardened configuration for the Azure components. For example, secrets are now stored in a Key Vault with restricted access from the Function App.
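The Endpoint severity convention described in the list above (a "Low", "Medium", or "High" suffix separated from the rule name by a space) can be illustrated with a minimal sketch. The function name `severity_from_rule_name` is hypothetical and is not taken from the solution's code. Two points are assumptions: the match is treated as case-sensitive, and a rule name without a valid suffix falls back to "Medium".

```python
VALID_SEVERITIES = ("Low", "Medium", "High")


def severity_from_rule_name(rule_name: str, default: str = "Medium") -> str:
    """Return the severity encoded as the last space-delimited word of a rule name.

    Assumption: rule names without a valid suffix fall back to `default`.
    """
    # Split once from the right so only the final word is treated as the suffix.
    parts = rule_name.strip().rsplit(" ", 1)
    if len(parts) == 2 and parts[1] in VALID_SEVERITIES:
        return parts[1]
    return default


if __name__ == "__main__":
    for name in ("DLP rule name Medium", "DLP rule name High", "DLP rule name"):
        print(name, "->", severity_from_rule_name(name))
```

Splitting from the right keeps rule names that contain several spaces intact, because only the final word is inspected.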
Components Included in Solution/Deployment
- Function App with all of the dependencies (e.g., Storage Account, Key Vault, Application Insights) and code necessary to ingest the DLP events, sensitivity label information, and advanced Sentinel analytics rules (if desired).
- Azure Monitor Custom Tables to house the core DLP events along with the sensitive information data.
- Azure Monitor Function to parse and normalize the DLP event data across all of the different workload types (Endpoint, Teams/Exchange, and SharePoint/OneDrive)
- Azure Monitor Data Collection Rule and Data Collection Endpoint required to ingest the DLP events via the new Azure Monitor Logs Ingestion API.
- Sentinel Analytics Rule(s) to automatically start turning the raw DLP events into actionable alerts and incidents within Sentinel. The appropriate entity mapping is also pre-configured.
- Sentinel Workbooks to help with advanced DLP incident management and reporting.
- Sentinel Watchlists to house sensitivity label information and to help with the analytics rule "DLP Policy Sync" feature if enabled.
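To make the three-table layout more concrete, the sketch below splits one DLP event into a core row, one row per sensitive information type, and one row per detected value. Only the table names PurviewDLP, PurviewDLPSIT, and PurviewDLPDetections come from the solution. The event shape, the field names (`Id`, `SensitiveInfo`, `DetectedValues`), and the function `split_event` are hypothetical and do not reflect the real schema.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def split_event(event: Row) -> Tuple[Row, List[Row], List[Row]]:
    """Split one event into rows for the three tables.

    Returns (core_row, sit_rows, detection_rows), intended for PurviewDLP,
    PurviewDLPSIT, and PurviewDLPDetections respectively.
    All field names here are hypothetical placeholders.
    """
    # The core row carries everything except the nested sensitive data.
    core = {k: v for k, v in event.items() if k != "SensitiveInfo"}
    sit_rows: List[Row] = []
    detection_rows: List[Row] = []
    for sit in event.get("SensitiveInfo", []):
        sit_rows.append(
            {"EventId": event["Id"], "SensitiveType": sit["Name"], "Count": sit.get("Count", 0)}
        )
        # One row per detected value keeps each row small.
        for value in sit.get("DetectedValues", []):
            detection_rows.append(
                {"EventId": event["Id"], "SensitiveType": sit["Name"], "Value": value}
            )
    return core, sit_rows, detection_rows
```

Because every row carries the event identifier, the tables can be joined again at query time, while access to the detections table can be restricted separately.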
Prerequisites
- License requirements for Microsoft Purview Information Protection depend on the scenarios and features you use. To understand your licensing requirements and options for Microsoft Purview Information Protection, see the Information Protection sections from Microsoft 365 guidance for security & compliance and the related PDF download for feature-level licensing requirements.
- The Azure Resource ID (not the Workspace ID) of the Sentinel workspace that the solution will ingest data into and where it will provision the associated Sentinel artifacts (e.g., analytics rules, workbooks, and the function).
- Owner permissions on the above Sentinel workspace.
- Global Admin permissions on the Purview DLP Entra ID tenant to create the App Registration and grant Admin Consent.
- Owner permissions on an Azure Resource Group or Subscription to deploy the solution to. If Owner permissions are not granted on the subscription, the Microsoft.ContainerInstance resource provider must be registered on the subscription before deployment in order for the code to be automatically deployed to the Function App.
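Because the Resource ID and the Workspace ID are easy to confuse, a small check before deployment can help. The sketch below tells the two apart using the standard Azure resource ID layout for a Log Analytics workspace; the function name `classify_workspace_identifier` and the sample values are hypothetical.

```python
import re
import uuid

# Standard layout of a Log Analytics workspace resource ID.
_WORKSPACE_RESOURCE_ID = re.compile(
    r"^/subscriptions/(?P<subscription>[0-9a-f-]{36})"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.OperationalInsights"
    r"/workspaces/(?P<workspace>[^/]+)$",
    re.IGNORECASE,
)


def classify_workspace_identifier(value: str) -> str:
    """Return 'resource_id', 'workspace_id', or 'unknown' for the given string."""
    value = value.strip()
    if _WORKSPACE_RESOURCE_ID.match(value):
        return "resource_id"
    try:
        # A bare GUID is what the portal shows as the Workspace ID.
        uuid.UUID(value)
        return "workspace_id"
    except ValueError:
        return "unknown"


if __name__ == "__main__":
    sample = (
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/rg-sentinel/providers/Microsoft.OperationalInsights"
        "/workspaces/law-sentinel"
    )
    print(classify_workspace_identifier(sample))
```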
Prepare for Deploying the Solution
- Register a new App Registration in Entra ID, assign the following API permissions, and grant admin consent:
- Microsoft Graph (Application permissions)
- Group.Read.All
- User.Read.All
- InformationProtectionPolicy.Read.All (If you want to sync Sensitivity label details)
- Office 365 Management APIs (Application permissions)
- ActivityFeed.ReadDlp (Needed for detailed DLP events)
- Generate a new secret for the App Registration and save it along with the Client ID and Tenant ID, which will be used in step 6 of the deployment.
- Create a Resource Group (if one does not already exist) where you have the Owner role for the deployment of the Function App and its dependencies.
- Create the Microsoft Sentinel workspace (if not already created) where you want to ingest the data.
- Under Workspace Settings, select the JSON View and copy the Resource ID to be used in step 9 of the deployment below.
- Enable the Microsoft 365 (Formerly Office 365) Sentinel Connector and ensure the OfficeActivity table is provisioned if you want further alert enrichment for SharePoint events.
You now have all the information needed to start the deployment.
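For context on how the Tenant ID, Client ID, and Client Secret are typically used, the sketch below builds (but does not send) a client-credentials token request for the Office 365 Management APIs against the Microsoft identity platform v2.0 endpoint. This illustrates the standard flow and is not the solution's own authentication code. The function name `build_token_request` and the placeholder values are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build a client-credentials token request without sending it."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            # Requests the application permissions granted on the Office 365 Management APIs.
            "scope": "https://manage.office.com/.default",
        }
    ).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values only; never hard-code a real secret.
    req = build_token_request("contoso.onmicrosoft.com", "my-client-id", "my-secret")
    print(req.get_method(), req.full_url)
```

In the deployed solution the secret is read from the Key Vault rather than passed around in plain text, so a sketch like this is only useful for verifying the App Registration by hand.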
Deployment
Access the code from this location: O365-ActivityFeed-AzureFunction/Sentinel_Deployment at master · OfficeDev/O365-ActivityFeed-AzureFun....
Click
- Update Subscription, Resource Group, Region
- Provide the Function App name in all lowercase characters. Use a name that is globally unique.
- Deploy Application Insights so that the solution can be monitored correctly.
- Provide a globally unique name for the Key Vault that will store the secrets and other sensitive variables.
- Provide a Storage Account name that is globally unique and follows the storage account naming rules.
- Provide the Tenant ID, Client ID, and Client Secret collected in the prerequisites.
- Provide the Exchange domain names that are internal to your tenant. This is used to determine the internal source of email messages.
- Optionally, update the DCE/DCR names.
- Paste the Sentinel/Log Analytics Workspace Azure Resource ID collected in the prerequisites.
- DLP Policy Sync is a feature that creates additional analytics rules to cater for environments with a large volume of DLP events. It also allows you to easily disable individual analytics rules that generate too much noise. False will work for most environments.
- Deploy the workbooks; they will help you get insights from the DLP events.
- Deploy the function code to get a slipstreamed deployment. If the user running the deployment does not have Owner permissions on the target subscription, the Microsoft.ContainerInstance resource provider must be registered on the subscription before deployment in order for the code to be automatically deployed to the Function App.
- Depending on your data handling requirements, you can keep only a hash of the matched sensitive data, remove it entirely, or retain it so that it can be displayed in Sentinel.
- Choose an option for Endpoint Severity in Rule Name. For Endpoint DLP events, the severity of the alert/event is not currently included in the API, so by default the alert/event severity is derived from the DLP policy rule name. The rule name must have a "Low", "Medium", or "High" suffix value with a space as the delimiter. For example, "DLP rule name Medium" or "DLP rule name High". If "False" is selected, all alerts/events will have a Medium severity.
- IMPORTANT: The following artifacts get deployed to the Sentinel/Log Analytics workspace. If artifacts of the same type and name already exist, they will be overwritten:
- Log Analytics function named "PurviewDLP"
- Watchlists named "Policy" and "SensitivityLabels"
- Custom Tables named "PurviewDLP", "PurviewDLPSIT", and "PurviewDLPDetections".
- Workbooks named "Microsoft DLP Incident Management", "Microsoft DLP Activity", and "Microsoft DLP Organizational Context".
- Click Review + Create.
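The three data handling options for detected sensitive values (hash, remove, or retain) can be pictured with the minimal sketch below. The mode names, the function `handle_detected_value`, and the choice of SHA-256 are assumptions made for illustration; the source does not state which hash algorithm the solution uses.

```python
import hashlib
from typing import Optional


def handle_detected_value(value: str, mode: str) -> Optional[str]:
    """Apply a data handling mode to one detected sensitive value.

    mode is one of 'retain', 'remove', or 'hash' (hypothetical names).
    """
    if mode == "retain":
        # Keep the value so it can be displayed to analysts.
        return value
    if mode == "remove":
        # Drop the value entirely; nothing sensitive is stored.
        return None
    if mode == "hash":
        # Assumption: SHA-256, hex-encoded. The real algorithm may differ.
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("retain", "remove", "hash"):
        print(mode, "->", handle_detected_value("example-value", mode))
```

A hash still lets analysts see that the same value recurs across events without exposing the value itself, which is the main reason to prefer it over removal.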
When the deployment is complete, you should see what is shown in the screenshot of the Resource Group below.
Advanced Considerations
If you want to customize the grouping of incidents or make other changes, such as amending the queries to include data from other Watchlists, modify the templates seen below.
To force a resync of all existing rules when you are using Policy Sync, go to Advanced Tools.
Select the workload you want to resync: Exchange = EXOTRuleprocess.log, SharePoint = SPODRuleprocess.log, Endpoint = EndpointRuleprocess.log.
To test, remove just one line to resync a single rule, or remove all lines to resync all analytics rules.