Azure AI Document Intelligence new capabilities including classification are now generally available

Microsoft

Aug 02, 2023

Azure AI Document Intelligence, formerly known as Form Recognizer, now has a new set of capabilities generally available! Documents are core to any business process, from Intelligent Document Processing (IDP) solutions like invoice processing to knowledge extraction like tax filing, financial reporting and audits. Azure AI Document Intelligence provides the AI components you need to build your document processing workflows.

The v3.1 release of Document Intelligence brings new features and updated capabilities. Try using any of these updates in the Studio or the REST API version 2023-07-31.

Document Classification and Splitting

Processing files with multiple documents or routing user uploads to the right model for extraction is now much simpler with the new document classification and splitting model. Train a classifier in a couple of minutes to classify the different document types you need to process. The classification and splitting API identifies the individual instances of each document type in the input file, which makes these common scenarios easy to process:

A single file containing multiple documents - Loan origination and other workflows require you to process several documents scanned into a single file.
A single file containing multiple instances of a document - Solutions like invoice processing require you to process multiple invoices in a single file.
User uploads - Your users may upload one of many document types needed, where your application needs to identify the document type to determine how to process each document.

Classification models are easy to train in the Document Intelligence Studio. Get started today with training and testing a classification model in less than 5 minutes.
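
As a rough illustration of the routing scenario, the sketch below takes a classification result and decides which extraction model each detected document should go to. It assumes the result has already been fetched and parsed into a dictionary, and that each entry under "documents" carries a "docType", a "confidence" and "boundingRegions" with page numbers. The sample payload, the document type names and the ROUTES table are hypothetical; your classifier defines its own document types.

```python
# Hypothetical mapping from classifier document types to extraction model IDs.
# The document type names are whatever you chose when training the classifier.
ROUTES = {
    "invoice": "prebuilt-invoice",
    "w2": "prebuilt-tax.us.w2",
}


def route_documents(analyze_result):
    """Return (doc_type, model_id, pages) for each document instance found.

    model_id is None when no route is configured for the document type.
    The field names used here are assumptions about the response shape.
    """
    routed = []
    for doc in analyze_result.get("documents", []):
        # Collect the distinct pages this document instance spans.
        pages = sorted({r["pageNumber"] for r in doc.get("boundingRegions", [])})
        doc_type = doc["docType"]
        routed.append((doc_type, ROUTES.get(doc_type), pages))
    return routed


# Hypothetical result for a three-page file holding one invoice and one W-2.
sample_result = {
    "documents": [
        {
            "docType": "invoice",
            "confidence": 0.97,
            "boundingRegions": [{"pageNumber": 1}, {"pageNumber": 2}],
        },
        {
            "docType": "w2",
            "confidence": 0.93,
            "boundingRegions": [{"pageNumber": 3}],
        },
    ]
}

for doc_type, model_id, pages in route_documents(sample_result):
    print(doc_type, model_id, pages)
```

Each tuple tells your application which pages to send to which model, so a multi-document file can be split and processed in one pass.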

New and updated prebuilt models

Prebuilt models offer a simple API to extract a defined schema from known document types. The v3.1 release adds new prebuilt models for contract processing and prebuilt models for processing variants of the 1098 tax form for US tax processing scenarios.

Contract prebuilt model

The contract model extracts a predefined schema of contract metadata. The contract prebuilt is ideal for Contract Lifecycle Management (CLM) scenarios where you are looking to extract common metadata from contracts, including the parties and dates. Try the contract prebuilt in the Document Intelligence Studio or the REST API today.

US Tax 1098 prebuilt models

Tax processing in the US is now significantly easier with the addition of the 1098 models to the existing W-2 prebuilt model. The 1098 models support a few variations: the 1098, 1098-E and 1098-T. Like all other prebuilt models, the 1098 models extract a defined schema.

Try the new 1098 models in the Document Intelligence Studio.

Updates to prebuilt models

In addition to AI quality improvements across all prebuilt models and language expansion, the invoice model now supports 20 new currency codes.

Add-on features

Fonts

Detecting and identifying fonts enables recreating the document with higher fidelity based on the Document Intelligence response. It also extends the semantic segmentation of documents that started with the paragraphs and paragraph roles introduced in the previous release.

High resolution images

Some documents, like engineering drawings, require a higher resolution input to accurately extract text and identify features. With the added support for high resolution images, these documents can now be processed at high resolution.

Barcodes

Documents like healthcare forms can contain more than just printed or handwritten text and can include objects like barcodes for patient or drug identification.

Documents containing barcodes can now be processed more effectively with barcodes being supported by all APIs as an optional feature. The different types of barcodes recognized are:

QR Code
Code 39
Code 93
Code 128
UPC (UPC-A & UPC-E)
PDF417
EAN-8
EAN-13
Codabar
Databar
Databar Expanded
ITF
Data Matrix

Try the new barcode extraction in the Document Intelligence Studio by turning on the barcodes option in the analyze options settings.
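
Outside the Studio, the extracted barcodes can be read from the analyze response. The sketch below is a minimal example that assumes the response has been parsed into a dictionary in which each page lists its barcodes with a "kind" and a "value". The field names, the kind strings and the sample payload are assumptions made for illustration, so check them against the response you actually receive.

```python
def list_barcodes(analyze_result):
    """Return (page_number, kind, value) for every barcode in the result.

    Assumes barcodes are reported per page under a "barcodes" key.
    """
    found = []
    for page in analyze_result.get("pages", []):
        for barcode in page.get("barcodes", []):
            found.append(
                (page.get("pageNumber"), barcode.get("kind"), barcode.get("value"))
            )
    return found


# Hypothetical result for a two-page healthcare form.
sample_barcode_result = {
    "pages": [
        {
            "pageNumber": 1,
            "barcodes": [
                {"kind": "QRCode", "value": "PATIENT-0001", "confidence": 0.98},
                {"kind": "Code128", "value": "RX-445566", "confidence": 0.95},
            ],
        },
        {"pageNumber": 2},  # a page without barcodes
    ]
}

for page_number, kind, value in list_barcodes(sample_barcode_result):
    print(page_number, kind, value)
```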

Custom models

- Custom neural models provide the flexibility of handling various document templates. The languages supported by custom neural models have expanded significantly, making neural models the default option for training custom models for all your documents.
- This release also makes training custom models easier by lowering the number of labeled documents needed to train a custom neural model to a single document, making it easy to get started. Training of custom neural models is expanding to 10 additional regions in August, so custom neural models should now support most of your custom model needs. See the what's new page for the updated list of regions and languages.
- Custom template models also continue to be enhanced with an improved signature detection capability.
Follow the quick start to build a custom model in the Studio today for any form or document you process.

Document Intelligence Studio

Test any of the Document Intelligence models or features in the Studio, from testing a prebuilt model and validating the results to training and testing custom models. Updates to the Studio include all the new and updated models and productivity features to train custom models.

Human In the Loop

The new Human In The Loop (HITL) feature in the Studio builds on the pre-label capability to create an intuitive experience for rapidly improving custom model accuracy. You can now use the test page for your custom model to selectively add documents for which the model produces low confidence or erroneous results. This ensures that the model continues to improve with documents that are specifically different or challenging. Pre-labeling speeds up the labeling process, so you only update the fields where the model prediction needs to be improved.
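
The Studio handles this selection for you on the test page. If you want to apply the same idea in your own pipeline, the sketch below shows one way to flag fields worth a human review. It assumes an extracted document is a dictionary whose "fields" map each field name to an object with a "confidence" value. The 0.8 threshold, the field names and the sample document are hypothetical.

```python
def fields_needing_review(document, threshold=0.8):
    """Return the names of fields with missing or below-threshold confidence.

    The threshold is an arbitrary starting point, not a recommended value.
    """
    flagged = []
    for name, field in document.get("fields", {}).items():
        confidence = field.get("confidence")
        if confidence is None or confidence < threshold:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical custom model output for one invoice.
sample_document = {
    "docType": "my-custom-model",
    "fields": {
        "InvoiceId": {"content": "INV-100", "confidence": 0.95},
        "Total": {"content": "1,240.00", "confidence": 0.42},
        "DueDate": {},  # no confidence reported
    },
}

print(fields_needing_review(sample_document))
```

Documents with any flagged field are good candidates to add to the training dataset and correct by hand.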

Productivity improvements for custom models

Pre-label: Generate additional labeled data by using the output from an existing custom model or prebuilt model to label your data. Pre-labeling speeds up the labeling process because you only need to fix the specific fields where the model needs to improve.

Auto-label: Tables can be cumbersome to label. With the auto-labeling feature, you can use the table structure identified by layout to label the table and speed up your labeling workflow.

Dataset management: Improvements to the custom model dataset management capabilities to find specific documents, sort and filter documents and identify documents that need labeling updates.

Try the new HITL capability in the Studio while building your next custom model.

Language expansion

The new 3.1 API expands the languages covered across most models. The Read, Layout and custom template APIs now support 145 additional languages, including commonly requested languages like Thai, Tamil and Hebrew, bringing the total number of languages supported to 309.

Language coverage for prebuilt models has also expanded: the prebuilt receipt model now covers over 100 languages and the prebuilt invoice model covers 29 languages.

Custom neural models now support a wide variety of languages enabling customers to train neural models for most scenarios.

Updates to the model outputs

As Document Intelligence continues to grow, the APIs are now streamlined, and you can choose to add the specific features you need to your document workflows. Need barcodes extracted with custom models? Just add the features=barcodes query string to your request and you have barcodes extracted along with your custom model results.
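
To make the query string concrete, here is a small sketch that builds an analyze request URL with optional features. The api-version value and the features=barcodes parameter come from this post. The URL path layout, the endpoint and the model ID are assumptions for illustration, so confirm the exact route in the REST API reference before relying on it.

```python
from urllib.parse import urlencode

API_VERSION = "2023-07-31"


def build_analyze_url(endpoint, model_id, features=()):
    """Build an analyze URL, adding a features query string when requested.

    The "/formrecognizer/documentModels/{id}:analyze" path is an assumption.
    """
    base = f"{endpoint.rstrip('/')}/formrecognizer/documentModels/{model_id}:analyze"
    params = {"api-version": API_VERSION}
    if features:
        # Multiple features are sent as one comma-separated value.
        params["features"] = ",".join(features)
    return f"{base}?{urlencode(params, safe=',')}"


# Hypothetical endpoint and custom model ID.
url = build_analyze_url(
    "https://example.cognitiveservices.azure.com/",
    "my-custom-model",
    features=["barcodes"],
)
print(url)
```

Keeping features as a list means a workflow can opt in to an add-on for one request and leave it off for the next without changing the rest of the call.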

Document Intelligence continues to simplify document processing workflows and enables you to automate challenging document centric tasks.