Auto label based on content matching by Information protection scanner


I have an on-premises repository that is terabytes in size. I have already configured the information protection scanner and added the repository where the files are stored, and the scanner is scanning the files. I want to auto-label them based on content matching.

For example:

Auto-label files as "Confidential" when there is a match for the word "budget"

Auto-label files as "Internal use only" when there is a match for the phrase "leave request form"

 

I know auto-labeling is available for M365 services such as Exchange, OneDrive, and SharePoint, but how can I achieve the above using the information protection scanner?

Please help. Thanks

 

 

12 Replies

Hi, @securityxpert1122 

 

Thank you for posting your question here. I understand you're looking to apply labels automatically in your on-premises repositories through the MPIP scanner.

 

To do this, you need to configure the auto-labeling option within the sensitivity label itself (the label you want applied), and then make sure that label is published to your scanner service account through a label policy.

 

Then, when you configure the content scan jobs in the Purview admin portal, I recommend leaving the label settings as the policy default.

 

You can read more on this here:

 

https://learn.microsoft.com/en-us/purview/deploy-scanner-configure-install?tabs=azure-portal-only#co...
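As a rough illustration of the content-matching idea behind a label's auto-labeling condition, the two rules from the question can be thought of as a phrase-to-label lookup. This is only a conceptual sketch in Python, not how the scanner is implemented or configured; the rule table and the `suggest_label` function are hypothetical names made up for the example.

```python
# Hypothetical rule table: (phrase to match, label to suggest).
# The phrases and label names come from the question above.
RULES = [
    ("budget", "Confidential"),
    ("leave request form", "Internal use only"),
]


def suggest_label(text, rules=RULES):
    """Return the label of the first rule whose phrase occurs in text, else None."""
    lowered = text.lower()  # case-insensitive matching (an assumption)
    for phrase, label in rules:
        if phrase in lowered:
            return label
    return None
```

In the real product the equivalent of `RULES` lives in the label's auto-labeling settings, and the scanner evaluates it for the labels published to its service account.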

 

 

@securityxpert1122 

 

Just following up to see if my initial reply helped with your question?

I followed exactly what you said, but the labels are not being applied.
Should I also create an auto-labeling policy?

Hi, @securityxpert1122 ,

 

What is the below setting currently set to on your content scan job?

 

miller34mike_0-1692355369042.png

 

Thank you @miller34mike
It works now.
I followed exactly what you said and then granted write and modify permissions to the scanner service account.

@miller34mike 

Hi Mike!

This is interesting. So I have the AIP Scanner installed and it is not labeling. I understand from following this thread that we need to add the Service Account to the auto label policies (with the SITs defined). My question though is where do we add the Service Account on this "choose locations" page...

 

for example, the UNC path I am trying to point to is I:\Security\AIP Scanner Test Data

 

Thanks for any guidance!

 

Luke_Michael_Fisher_0-1699473258368.png

 

Best regards,

Luke Fisher

@Luke_Michael_Fisher 

 

Hey Luke!

 

So, yes, you need to add the service account to the auto-labeling scope, but not that auto-labeling scope. You need to have one of your labels configured for auto-labeling and then have that label deployed to your service account through the label policy.

 

I'd recommend checking out this article for getting everything set up.

 

On-premises DLP with Microsoft Purview (cloudy-sec.com)

@miller34mike Thanks for the great write up in that link Mike! I am left wondering though if we can't use the auto labeling policies we set up already (screenshot below). They were created through the "Auto-Labeling" section of the Information Protection blade.

 

They have been running in pilot for a few months, and we'd like to avoid having to scrap them and go back to configuring auto-labeling through the labels themselves if possible. They are all designed to pick up SITs and auto-label as Confidential. I'd like to point them to the on-premises file shares as mentioned above. Is this a possibility, or are we back to the drawing board, so to speak?

 

Luke_Michael_Fisher_1-1699887891388.png

 

Luke_Michael_Fisher_0-1699887789664.png

 

I want to run the scanner in discovery mode only.
I have custom content types configured in the label's auto-labeling settings. I want to scan an on-prem file share, but I don't want to label the files for now. I just want a scan result showing which content types match the files stored in the on-prem repository.
I have a scanner account that has read-only rights on the on-prem folders, and the labels are published to that account. How can I generate results without labeling the files? Also, are those results available in Activity explorer?

Hi, @securityxpert1122 

 

I encourage you to read through this blog post I've linked below. It will run you through the full configuration of the scanner, including how to leverage it in discovery mode.

 

On-premises DLP with Microsoft Purview (cloudy-sec.com)
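To make the "report only, don't label" idea concrete, here is a small conceptual sketch in Python. It is not the scanner and does not use any Purview API; the `discover` function and the `PHRASES` list are hypothetical, and it only handles files readable as plain text. It walks a folder with read-only access and returns which phrases matched in which files, without changing anything.

```python
import os

# Hypothetical phrases to look for; stand-ins for the configured content types.
PHRASES = ["budget", "leave request form"]


def discover(root, phrases=PHRASES):
    """Walk root and return {path: [matched phrases]} without modifying any file."""
    report = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Open read-only; undecodable bytes are ignored.
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    text = fh.read().lower()
            except OSError:
                continue  # skip files that cannot be read
            hits = [p for p in phrases if p in text]
            if hits:
                report[path] = hits
    return report
```

Calling `discover(r"I:\Security\AIP Scanner Test Data")` would return a dictionary of matching files, which is roughly the kind of result a discovery-only scan is meant to give you.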

Hey, @Luke_Michael_Fisher 

 

The auto-labeling policies you're referencing there cannot be pointed at your on-premises files. Those files must be auto-labeled through the label itself. The auto-labeling policies you're referencing are for cloud files and Exchange Online only.