Forum Discussion
Best practice for enabling policies in Production
Hi,
We have around 30 keywords that were used in auto-labelling policies to run simulations on MS Teams, Exchange, SharePoint, and OneDrive. There is an enormous amount of data that is visible and tagged against those keywords in Content explorer.
What is the best practice for reviewing all this content to avoid false positives? We can't go through each and every document or email to see how the keywords are applied. Neither we nor the client can review thousands of files to identify false positives and fine-tune the policies, so I was wondering how others are overcoming this issue.
Thanks in advance.
Fahad
Hi, FahadAhmed,
Ah yes. False positives. Always fun.
First, I recommend making sure your sensitive information types (SITs) are as specific as they can be. Use supporting evidence, character proximity, and additional checks wherever you can. Set the confidence level as high as you can. It's not a simple fix, but the more specific your custom SITs are, the surer the system will be that it is a match and not something else.
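To illustrate why supporting evidence and character proximity cut down false positives, here is a minimal Python sketch of the idea. It is not how Purview evaluates SITs internally; the function name `match_confidence`, the `window` parameter, and the "high"/"low" labels are all assumptions made up for this example.

```python
import re

def match_confidence(text, primary_pattern, supporting_keywords, window=300):
    """Illustrative only: rate each primary match by nearby supporting evidence.

    A match is rated "high" when at least one supporting keyword appears
    within `window` characters before or after it, otherwise "low".
    All names and labels here are hypothetical, not a Purview API.
    """
    results = []
    lowered = text.lower()
    for m in re.finditer(primary_pattern, text, flags=re.IGNORECASE):
        # Take the text on either side of the match, excluding the match itself.
        start = max(0, m.start() - window)
        end = min(len(text), m.end() + window)
        context = lowered[start:m.start()] + " " + lowered[m.end():end]
        has_support = any(kw.lower() in context for kw in supporting_keywords)
        results.append((m.group(0), "high" if has_support else "low"))
    return results

# Hypothetical sample: the keyword appears once near supporting terms, once alone.
sample = ("Project Falcon budget is confidential. "
          + "x" * 500
          + " I saw a falcon at the zoo.")
hits = match_confidence(sample, r"\bfalcon\b", ["confidential", "budget"], window=50)
```

In this sample the first "Falcon" sits next to "budget" and "confidential" and is rated high, while the second "falcon" has no supporting term within 50 characters and is rated low. A bare keyword with no supporting evidence would treat both as equal matches, which is where much of the review noise comes from.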
As for maintaining and responding to false positives, well, it can get tiresome. However, the more time you spend marking items as false positives, the better the system will be as time goes on. You can get a somewhat easier view by going to the custom SIT you created and selecting "Matched items", which essentially opens Content explorer for that SIT only, allowing you to work through the list. There will still be a lot of potential matched items, but the clean-up will pay off.
You also want to make sure you pilot your policies as much as possible. Don't enable them for the entire organization all at once. Spread the enrollment across different groups, and make sure each department is represented in the early waves so you can establish "policy champions" for each department who can help you work with the other users on their team.
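One way to plan such a rollout is to deal users into waves department by department, so that every department has someone in the first wave. The sketch below is only an illustration under that assumption; `plan_waves` and the sample names are hypothetical, and in practice the waves would map to whatever groups you scope the policy to.

```python
from collections import defaultdict

def plan_waves(users, wave_count):
    """Illustrative only: spread each department's users across rollout waves.

    `users` is a list of (user, department) pairs. Users are dealt
    round-robin within each department, so every department with at
    least one user is represented in the first wave.
    """
    by_dept = defaultdict(list)
    for user, dept in users:
        by_dept[dept].append(user)

    waves = [[] for _ in range(wave_count)]
    for dept in sorted(by_dept):
        for i, user in enumerate(by_dept[dept]):
            waves[i % wave_count].append(user)
    return waves

# Hypothetical sample data.
staff = [("ana", "HR"), ("bo", "HR"), ("cy", "Finance"),
         ("di", "Finance"), ("ed", "Legal")]
waves = plan_waves(staff, 2)
```

Here the first wave contains one person each from Finance, HR, and Legal, who can then act as the "policy champions" for later waves.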
- miller34mike (Microsoft)
- FahadAhmed (Brass Contributor)
Thanks, Mike, for the detailed response. That's what I had in mind as well; it's great to have it validated.