The Azure Information Protection (AIP) Scanner is a program designed to detect, classify, and optionally protect documents stored on file shares and on-premises SharePoint servers. The overview below is from the official documentation at https://docs.microsoft.com/en-us/information-protection/deploy-use/deploy-aip-scanner. This blog post is meant to assist customers and partners with deployment of the AIP Scanner. If there is ever a conflict between this blog and the official documentation, the official documentation is authoritative.
The AIP Scanner runs as a service on Windows Server and lets you discover, classify, and protect files on the following data stores:
*SharePoint 2010 support is only available with a valid extended support contract for that product.
The scanner can inspect any files that Windows can index, by using iFilters that are installed on the computer. Then, to determine if the files need labeling, the scanner uses the Office 365 built-in data loss prevention (DLP) sensitivity information types and pattern detection, or Office 365 regex patterns. Because the scanner uses the Azure Information Protection client, it can classify and protect the same file types.
With an AIP Premium P1 license, you can run the scanner in discovery mode and use the information obtained in the reports to make more informed decisions based on your exposure to risk.
NOTE: The scanner does not discover and label in real time. It systematically crawls through files on data stores that you specify, and you can configure this cycle to run once, or repeatedly.
This blog post was written based on the public preview version of the AIP Scanner (18.104.22.168). Every effort will be made to update it when things change, but if you run into difficulty running any of the commands on a newer version, please use the official documentation to identify any changes.
***NOTE: This post shows only the features possible with an Azure Information Protection P1 license and does not cover the additional classification and protection features of the AIP scanner. If you have AIP Premium P2, please review the blog here for full details***
To install the AIP Scanner in a production environment, the following items are needed:
NOTE: We have scripted the scanner installation process and it is now available at https://techcommunity.microsoft.com/t5/Azure-Information-Protection/Azure-Information-Protection-Sca.... Although these steps are still valid, the scripted method is far less prone to mistakes and much faster for deployment.
A basic installation of the AIP Scanner service is simple and straightforward.
After installing the AIP Scanner binaries, you must authenticate as the AIP Scanner service account to obtain a token for use in automated discovery, classification, and protection.
Repositories can be on-premises SharePoint 2010, 2013, or 2016 document libraries or lists and any accessible CIFS based share.
NOTE: To perform discovery, the scanner service pulls the documents to the scanner server, so locating the scanner server on the same LAN as your repositories is recommended. You can deploy as many servers as you like in your domain, so placing one at each major site is probably a good idea.
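As a rough sketch of what registering repositories can look like, the snippet below uses the Add-AIPScannerRepository and Get-AIPScannerRepository cmdlets from the AIP client PowerShell module. The share and SharePoint paths are placeholders invented for illustration, and cmdlet behavior may differ between scanner versions, so confirm against the official documentation before running it.

```powershell
# Hypothetical repository paths; replace with your own shares and document libraries.
$repositories = @(
    '\\fileserver01\Finance',                        # CIFS file share (placeholder)
    'http://sharepoint01/sites/hr/Shared Documents'  # on-premises SharePoint library (placeholder)
)

foreach ($path in $repositories) {
    # Register the path with the scanner so it is included in the next scan cycle.
    Add-AIPScannerRepository -Path $path
}

# List the configured repositories to confirm the additions.
Get-AIPScannerRepository
```

Run this in a PowerShell session on the scanner server. The scanner service account also needs access to each repository.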
One of the most useful features of the AIP Scanner is the discovery of sensitive data across all of your configured repositories. You can enable this by running Set-AIPScannerConfiguration with the -DiscoverInformationTypes parameter. When this parameter is set to All, the scanner discovers files that contain any of the Office 365 DLP sensitive information types, so configuring conditions in labels is not required.
NOTE: Normally, custom data types based on string and regex values are also available, but these require AIP Premium P2 licensing.
The PowerShell command below will allow you to scan your repositories against all information types.
Set-AIPScannerConfiguration -DiscoverInformationTypes All
To start the discovery, use the PowerShell command below
After running the scan, you can review the logs by opening the Event Viewer and clicking on
You can also view the detailed logs at C:\users\<Scanner Service Account Profile>\appdata\local\Microsoft\MSIP\Scanner\Reports. There you will find the summary .txt and detailed .csv files.
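If you prefer to check the reports folder from PowerShell, a minimal sketch might look like the following. The account name svc-aipscanner is made up for illustration; substitute the profile folder of your own scanner service account.

```powershell
# Assumption: 'svc-aipscanner' is a hypothetical scanner service account name.
$scannerProfile = 'C:\Users\svc-aipscanner'
$reportDir = Join-Path $scannerProfile 'AppData\Local\Microsoft\MSIP\Scanner\Reports'

# Show the newest report files first (summary .txt and detailed .csv).
Get-ChildItem -Path $reportDir -File |
    Sort-Object LastWriteTime -Descending |
    Select-Object Name, Length, LastWriteTime
```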
Below is a screenshot showing the DetailedReport.csv file after a full discovery scan.
As you can see, it shows the file name and all of the sensitive information types that were identified in each file. This data can be reviewed manually, or more realistically, ingested into a SIEM for analysis and reporting.
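For a quick look before the data reaches a SIEM, the detailed report can be summarized with built-in PowerShell cmdlets. The sketch below is only illustrative: the report path and the column name 'Information Type Name' are assumptions, so inspect the header row of your own DetailedReport.csv and adjust them to match.

```powershell
# Assumptions: the path and column name below are placeholders, not confirmed
# values; check the header of your own report and adjust $infoTypeColumn.
$reportPath     = 'C:\Temp\DetailedReport.csv'
$infoTypeColumn = 'Information Type Name'

$rows = Import-Csv -Path $reportPath

# Keep only rows where the information type column is not empty.
$hits = $rows | Where-Object { $_.$infoTypeColumn }

# Count rows per distinct value of the information type column, most frequent first.
$hits |
    Group-Object -Property $infoTypeColumn |
    Sort-Object Count -Descending |
    Select-Object Count, Name
```

If the report lists several information types in a single cell, each distinct combination is counted as its own group, so split the cell on its separator first if you need per-type totals.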
Although automated protection via the AIP scanner is more convenient, there are still options available with AIP P1 for classifying and/or protecting the sensitive data. These options are
You should now have a fully functional AIP Scanner instance. You can repeat this process on multiple servers as necessary and use the same Set-AIPAuthentication command for each of them. This is a simple setup for a basic AIP Scanner server that can be used to discover a large amount of sensitive data easily. I highly recommend reading the official documentation on deploying the scanner, because it covers performance tips and some less common caveats that I have left out.
The Information Protection Customer Experience Team