Introduction to Service-side Auto-labeling: Benefits and Purpose

Microsoft

Jan 20, 2022

The intent of this blog series is to help customers refresh their understanding of service-side auto-labeling by walking through hero scenarios. Together with the Playbook we recently released, this series will guide you through setting up auto-labeling policies, testing them in simulation mode, and then enforcing them. We will add new content to this blog every few weeks to cover a range of topics related to auto-labeling.

How can auto-labeling help your organization?

Microsoft Information Protection (MIP) provides a unified set of capabilities to know your data, protect your data, and protect against data loss across Microsoft 365 apps and services. Foundational to MIP are its classification capabilities, which range from out-of-the-box sensitive information types to machine learning trainable classifiers and automatically find and classify sensitive content at scale. MIP’s auto-labeling capability helps customers quickly classify more of their ever-increasing data and protect sensitive content.

At their most basic level, sensitivity labels are tags that are customizable, persistent, accessible to applications, and visible to users. Once applied to documents and email, labels become the basis for enforcing data protection policies throughout the tenant’s digital estate. When a label is applied to a file or email, it persists as document metadata. When a label is applied to a SharePoint site or OneDrive for Business, it persists as container metadata.

With auto-labeling policies, administrators can automatically apply sensitivity labels to email messages, OneDrive files, and SharePoint files that contain sensitive information. This labeling is applied by services rather than applications, so you don’t need to worry about what type of client the user is using. The label is automatically applied to content that matches the policy’s rules and related conditions. Auto-labeling also labels emails sent to users to whom the policy applies.

When to use service-side auto-labeling?

There are two methods for automatically applying a sensitivity label to content in Microsoft 365: client-side labeling and service-side labeling. For the purposes of this blog, we’re focusing on service-side auto-labeling.

Service-side auto-labeling is sometimes referred to as auto-labeling for data at rest and data in transit. Unlike client-side auto-labeling, service-side auto-labeling does not depend on the client to analyze the document content while it is being created. Instead, service-side auto-labeling reviews content that is stored (at-rest) in SharePoint or OneDrive document libraries, or that is "in-flight" or being sent within Exchange. Because this labeling is applied by services rather than by applications, you don't need to worry about what apps users have and what version. As a result, this capability is immediately available throughout your organization and is suitable for labeling at scale. Auto-labeling policies don't support recommended labeling because the user doesn't interact with the labeling process. Instead, the administrator runs the policies in simulation mode to help ensure the correct labeling of content before applying the label.
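To make the simulation idea concrete, here is a minimal Python sketch. It does not reflect how Microsoft 365 implements auto-labeling. The `Document` type, the `run_policy` function, the "Confidential" label name, and the credit card regex are all hypothetical, chosen only to show the difference between a simulation pass (report what would be labeled) and an enforcement pass (apply the label).

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Double every second digit, counting from the right.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


@dataclass
class Document:
    """Hypothetical stand-in for a stored file and its label metadata."""
    name: str
    text: str
    label: Optional[str] = None


def contains_card_number(text: str) -> bool:
    """Toy rule: the text holds at least one Luhn-valid card-like number."""
    return any(luhn_valid(m.group()) for m in CARD_PATTERN.finditer(text))


def run_policy(docs: List[Document], label: str, simulate: bool = True) -> List[str]:
    """Return the names of unlabeled documents that match the rule.

    In simulation mode nothing is changed; otherwise the label is applied.
    """
    matched = []
    for doc in docs:
        if doc.label is None and contains_card_number(doc.text):
            matched.append(doc.name)
            if not simulate:
                doc.label = label
    return matched
```

Calling `run_policy(docs, "Confidential")` first lets an administrator review the matched file names. Calling it again with `simulate=False` applies the label to the same set.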

This ability to apply sensitivity labels to content automatically is important because:

You don't need to train your users when to use each of your classifications.
You don't need to rely on users to classify all content correctly.
Users no longer need to know about your policies—they can instead focus on their work.

What's new in auto-labeling?

Introducing our new admin feedback feature in auto-labeling

Our admin feedback feature, made generally available in January 2022, gives admins an inside view of the labeling progress of their auto-labeling policies.

After your auto-labeling policy is turned on, you can view the labeling progress for files in your chosen SharePoint and OneDrive locations. Emails are not included in the labeling progress because they are automatically labeled as they are sent.

The labeling progress includes the files to be labeled by the policy, the files labeled in the last 7 days, and the total files labeled. Because at most 25,000 files can be labeled per day, this information gives you visibility into the current labeling progress for your policy and how many files are still to be labeled.
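As a rough planning aid, the daily maximum can be turned into a best-case estimate of how long a backlog will take to clear. The Python sketch below is only arithmetic on the 25,000 files per day figure stated above. The function name is hypothetical, and the result is a lower bound that assumes the full daily maximum is reached every day.

```python
import math

DAILY_LABEL_LIMIT = 25_000  # maximum files labeled per day, as stated in this post


def estimated_days_remaining(files_to_be_labeled: int,
                             daily_limit: int = DAILY_LABEL_LIMIT) -> int:
    """Best-case number of days needed to label the remaining files."""
    if files_to_be_labeled < 0:
        raise ValueError("files_to_be_labeled must not be negative")
    # Round up: a partial day of labeling still takes a calendar day.
    return math.ceil(files_to_be_labeled / daily_limit)
```

For example, a policy showing 60,000 files to be labeled needs at least 3 days, because two full days cover only 50,000 files.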

When you first turn on your policy, you will initially see a value of 0 for files to be labeled until the latest data is retrieved. This progress information updates every 48 hours, so you can expect to see the most current data every other day. We are working on reducing this SLA. When you select an auto-labeling policy, you can see more details about the policy in a flyout pane, which includes the labeling progress by the top 10 sites. The information on this flyout pane might be more current than the aggregated policy information displayed on the Auto-labeling main page.

You can also see the results of your auto-labeling policy by using content explorer when you have the appropriate permissions:

Content Explorer List Viewer role group lets you see a file's label but not the file's contents.
Content Explorer Content Viewer role group, and Information Protection and Information Protection Investigators role groups (currently in preview) let you see the file's contents.

Tip: You can also use content explorer to identify locations that have documents with sensitive information but are unlabeled. Using this information, consider adding these locations to your auto-labeling policy, and include the identified sensitive information types as rules.

How to get started with Auto-labeling?

Easy Trials

A question we often hear from customers is where to start. To answer it, we've designed an easy setup process that helps you get started with our E5 capabilities in one click. Our default policies help protect credit card information in your tenant through default MIP labels based on Microsoft industry recommendations, auto-labeling policies, and data loss prevention policies that protect devices and Teams messages. E5 customers and Compliance E5 Trial customers can interact with the banner on the Information Protection Overview tab. The set of features recommended on the banner is based on your existing setup, so we help you set up the features you haven’t tried yet. Following is an example of the banner.

Learn about the default labels and policies for Microsoft Information Protection - Microsoft 365 Compliance | Microsoft Docs

If you’re not an E5 customer, you can sign up for our Compliance trial to gain access to our default policies or leverage our default MIP label schema by manually creating the labels following our documentation above. Learn about the Microsoft 365 compliance trial - Microsoft 365 Compliance | Microsoft Docs

Hero Scenarios

To help customers classify their sensitive data accurately without creating custom policies from scratch, we offer several out-of-the-box policy templates tailored to different industry and geographical regulations.

Some of our most popular templates include:

GDPR Enhanced
US Financial Data
US HIPAA Enhanced

Our enhanced policy templates extend several of the original templates by also detecting named entities (such as full names and physical addresses). Just look for the templates labeled 'Enhanced' to start protecting even more personal data.

Recent additions to auto-labeling allow for the inclusion of advanced classifiers such as named entities. Add named entities (All Full Names and All Physical Addresses) to your custom policies to reduce false positives and detect identifiable sensitive information. Learn about named entities (preview) - Microsoft 365 Compliance | Microsoft Docs
Many of our healthcare customers leverage Exact Data Match to specifically define and detect the patient or customer data to auto-label. Learn more about Implementing Microsoft Exact Data Match (EDM) Part 1 - Microsoft Tech Community
For more information on specific scenarios, refer to the auto-labeling use cases section in the service-side auto-labeling playbook.
