General availability of Azure Form Recognizer v3.0

Microsoft

Sep 14, 2022

Form Recognizer is an Applied AI service for all your document understanding needs. With Form Recognizer you can extract content from a variety of file formats with the Read API. Use document type agnostic models like Layout and General Document to extract content, fields and structure including tables, paragraphs and selection marks from any document. Prebuilt models targeting specific document types like invoices, W-2 forms or Health Insurance Cards extract a defined set of fields in addition to the structure. For a truly custom experience, train a custom model to extract the specific fields needed from any document type. Today, you can start using the new generally available API that enhances your document process automation or document understanding capabilities.

Over the course of the last year, starting with the preview release of the General Document API, we have continued to innovate and add capabilities that extend Form Recognizer in different dimensions. The custom neural model announced in February 2022 added unstructured document analysis, and that release also continued the language expansion across the service. The last preview release in June 2022 included layout enhancements to support finer-grained document decomposition into paragraphs and sections. Each of these preview releases included several new features, updates to models, new languages and locales, and AI quality improvements. We’ve listened to your feedback during the previews, tuning models and enhancing the API and SDK to improve the accuracy of the results and ease of use. With all these updates, the version 3.0 API is now generally available.

What’s new since the last version?

Enhanced user experience

With the Form Recognizer Studio, you can now explore all Form Recognizer capabilities with sample documents or validate results on your own documents. A simple and flexible interface for creating projects to label and train a custom model, together with model management functions like copying custom models, gives you the tools to explore Form Recognizer and integrate it into your apps and workflows.

Form Recognizer Studio

Neural models for unstructured documents and structured forms

As growth in unstructured documents continues to outpace the growth in structured forms in most organizations, training models to extract fields from unstructured documents is critical to completing most scenarios. Custom neural models expand key value extraction from structured forms to unstructured documents. Based on customer feedback, extracting repeating groups of information into tables was prioritized, and tables in neural models are cross page by default. Neural models currently support only English language documents. Get started with neural models for all English structured or unstructured documents.

Label and train a custom neural model

New and updated prebuilt models

There are several updates to prebuilt models, including:

New W-2 prebuilt model for US tax and income verification scenarios
New Health insurance card model to streamline patient check ins
Enhanced ID card service which now supports all US IDs and worldwide passports
Updates to prebuilt models to support additional document types and fields
Support of additional languages across prebuilt models
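
To call one of these prebuilt models over REST, the request URL names the model to run. The following is a minimal sketch of building that URL, not a definitive client. The API version string, the URL path, and the model IDs shown are assumptions that do not appear in this article, so confirm them against the REST API reference before relying on them. The helper name build_analyze_url and the short keys in the lookup table are hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumption: version string for the generally available v3.0 REST API.
API_VERSION = "2022-08-31"

# Assumption: model IDs for some of the prebuilt models listed above.
# The short keys on the left are hypothetical labels for this sketch only.
PREBUILT_MODEL_IDS = {
    "w2": "prebuilt-tax.us.w2",
    "health_insurance_card": "prebuilt-healthInsuranceCard.us",
    "id_document": "prebuilt-idDocument",
}


def build_analyze_url(endpoint: str, model_key: str) -> str:
    """Return the analyze URL for a prebuilt model on a given resource endpoint."""
    model_id = PREBUILT_MODEL_IDS[model_key]  # raises KeyError for unknown keys
    base = endpoint.rstrip("/")               # tolerate a trailing slash
    path = f"/formrecognizer/documentModels/{quote(model_id, safe='')}:analyze"
    return f"{base}{path}?{urlencode({'api-version': API_VERSION})}"


if __name__ == "__main__":
    # Placeholder endpoint; substitute the endpoint of your own resource.
    print(build_analyze_url("https://example.cognitiveservices.azure.com/", "w2"))
```

The sketch only builds the URL. Sending the document, authenticating with your resource key, and polling for the result are left out.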

Prebuilt W-2 model

General document model for standard forms

The introduction of the general document model provides a pretrained, document type agnostic model to extract key value pairs or form fields, selection marks, tables and text from a variety of forms or documents. The model output includes key value pairs and extracted document structure like paragraphs, tables and selection marks. For most general forms, try the General Document model before deciding to train a custom model. Try the General document API in the Studio or integrate it in your application workflows with the REST API or the SDK.
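
Once the general document model has returned a result, the key value pairs can be flattened into a simple lookup. The sketch below works on an already parsed JSON result and makes no service calls. The field names it reads (keyValuePairs, key, value, content, confidence) are an assumption about the shape of the result, and the function name, the sample data and the confidence threshold are hypothetical.

```python
from typing import Any, Dict, Optional


def extract_key_value_pairs(
    analyze_result: Dict[str, Any], min_confidence: float = 0.5
) -> Dict[str, Optional[str]]:
    """Flatten key value pairs into {key text: value text or None}."""
    pairs: Dict[str, Optional[str]] = {}
    for pair in analyze_result.get("keyValuePairs", []):
        # Skip pairs the model is not confident about.
        if pair.get("confidence", 0.0) < min_confidence:
            continue
        value = pair.get("value")  # a key can be detected without a value
        pairs[pair["key"]["content"]] = value["content"] if value else None
    return pairs


# Hypothetical sample shaped like the assumed result structure.
SAMPLE_RESULT = {
    "keyValuePairs": [
        {"key": {"content": "Name"}, "value": {"content": "Jane Doe"}, "confidence": 0.93},
        {"key": {"content": "Signature"}, "confidence": 0.81},
        {"key": {"content": "Fax"}, "value": {"content": "n/a"}, "confidence": 0.12},
    ]
}

if __name__ == "__main__":
    print(extract_key_value_pairs(SAMPLE_RESULT))
```

If the same key text appears more than once in a document, later pairs overwrite earlier ones in this sketch. A list of pairs would preserve duplicates.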

General document results

Enhanced structure with layout

The Layout API now identifies paragraphs and assigns specific roles to them to improve semantic document segmentation. The supported roles are title, sectionHeading, footnote, pageHeader, pageFooter, and pageNumber.
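
One way to use these roles is to separate headers, footers and headings from the main text. The sketch below groups paragraphs by role, assuming each paragraph in the parsed result is a dictionary with a content field and an optional role field. That shape is an assumption, and the "body" label for paragraphs without a role is a hypothetical name chosen for this example.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Roles listed in this article.
KNOWN_ROLES = {
    "title", "sectionHeading", "footnote",
    "pageHeader", "pageFooter", "pageNumber",
}


def group_paragraphs_by_role(paragraphs: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Group paragraph text by role; paragraphs without a known role go under 'body'."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for paragraph in paragraphs:
        role = paragraph.get("role")
        bucket = role if role in KNOWN_ROLES else "body"  # 'body' is our own label
        grouped[bucket].append(paragraph["content"])
    return dict(grouped)


# Hypothetical sample shaped like the assumed paragraph structure.
SAMPLE_PARAGRAPHS = [
    {"role": "pageHeader", "content": "Contoso Ltd."},
    {"role": "title", "content": "Annual Report"},
    {"content": "Revenue grew in every region."},
    {"role": "sectionHeading", "content": "Outlook"},
    {"content": "We expect continued growth."},
    {"role": "pageNumber", "content": "1"},
]

if __name__ == "__main__":
    for role, texts in group_paragraphs_by_role(SAMPLE_PARAGRAPHS).items():
        print(role, texts)
```

Reading order is preserved within each group, so joining the "body" entries yields the main text with headers, footers and page numbers removed.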

Layout results

Other AI Quality improvements

The generally available August release has many AI quality improvements. Here is a list of the top 10 most requested features that were added. For a complete list of updates, check out the what’s new page for the release notes from the last three preview releases.

Support for multi-page tables in custom models
Language and locale expansion across prebuilt and custom models
Additional fields on the US driver’s license
Data extraction support for US state ID, social security, and green cards. Support for passport visa information.
Locale expansion for invoices and receipts, now supporting French (fr-FR), Spanish (es-ES), Portuguese (pt-PT), Italian (it-IT) and German (de-DE).
Dense table extraction improvements
Address parsing for prebuilt models
Read API support for 164 different languages
Tables and text extraction enhancements for Layout
Language detection in Read API

With these updates, Form Recognizer now enables you to build a document understanding or document process automation solution for most document types.