Remote work has accelerated the growth in volume and variety of data in organizations. Microsoft provides you intelligent solutions to meet the resulting challenges in governing this data.
Foundational to these solutions are capabilities to detect and classify data accurately. Only after you accurately classify your data can you start to govern your data by deciding what to protect, what to retain or delete. At Microsoft, we know that the future of accurate data classification at scale requires machine learning. Through our trainable classifiers, you can leverage the power of machine learning to identify more categories of data with increased accuracy. These classifiers use natural language processing and statistical algorithms to identify critical information.
You can deploy machine learning models with ease through our built-in classifiers, which have been trained at Microsoft and are ready to use in the Microsoft 365 compliance center. Built-in classifiers are readily available for your use to detect and classify popular data categories, for example resumes and source code. Your organization will also have unique data that you can classify by creating a custom trainable classifier. Our customers across a range of industries are taking advantage of this unique opportunity to easily build and deploy trainable classifiers without needing any expertise in machine learning. For instance, a custom classifier can be built to classify loan contracts, invoices, and project documents. Together, both built-in and build-your-own trainable classifiers provide classification support for a breadth of categories important to your enterprise.
Today we are excited to announce the general availability of machine learning based trainable classifiers. This GA includes two new features to improve the accuracy of trainable classifiers. Built-in classifiers are available now in English, with support for Spanish, Japanese, French, German, Portuguese, Italian, and Chinese (simplified) coming in the second half of 2021.
Figure: List of ‘built-in’ and ‘build-your-own’ trainable classifiers in the Microsoft 365 compliance center
Figure: Example of a custom classifier built for detecting contract documents
Use trainable classifiers to automatically apply retention policies.
Many organizations rely on employee judgment and manual classification when it comes to managing records and retention schedules. This method is prone to errors and inaccuracies. Additionally, most organizations have unmanaged data repositories that need governance but don’t have a way to classify data at scale.
With trainable classifiers, you can apply retention schedules and records policies at scale for business-critical information. For example, a compliance administrator and a records manager can work together to train a new classifier to recognize procurement documents and auto-apply a retention policy.
Microsoft’s legal team is one of many customers who use trainable classifiers to manage records in place and at scale.
“My hands-on experience in creating a trainable classifier demonstrated how automatic detection and classification of critical records help in accurately executing in-place records management across a large enterprise.”
-Jorge Garcia, Business Operations Associate, Corporate, External, and Legal Affairs at Microsoft
In Content Explorer, which is your primary tool for classified and labeled data discovery, you will see documents and emails that are a match for trainable classifiers. We are now offering you two new features to improve the accuracy of both built-in and build-your-own trainable classifiers. You can now evaluate the matched documents and provide feedback that will retrain the classifier and improve its accuracy. You can also view analytics on the degree of accuracy improvement to decide when to republish your classifier.
Figure: Ability to view matched documents for built-in ‘resume’ classifier and provide feedback to improve classifier accuracy
Figure: Analytics around accuracy improvement for the ‘resume’ classifier post-feedback
Machine learning based trainable classifiers are a powerful capability that enable you to detect and classify data unique to your organization at enterprise scale. We will continue to innovate and bring you new value here. Using trainable classifiers to automatically apply data protection policies in Microsoft 365 applications like Word, Excel, PowerPoint will be generally available in the first half of 2021.