What's new in Form Recognizer: Semantic document segmentation, cross-page tables, and lots more

Microsoft

Jun 07, 2022

Form Recognizer is an Applied AI service with prebuilt and custom models for your document understanding scenarios. Today, you can take advantage of a new set of preview capabilities that enhance document process automation and document understanding. This release is packed with new features and updates.

What’s New in Form Recognizer

Layout extracts paragraphs and roles

The layout API extracts text and structure from documents. Starting this month, in addition to text, tables, and selection marks, the layout API extracts text blocks as paragraphs. This capability is primarily meant for unstructured documents. Layout also identifies and assigns specific roles to paragraphs that help in semantic understanding of the documents. The supported roles are title, sectionHeading, footnote, pageHeader, pageFooter, and pageNumber.
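As a rough illustration of how the paragraph roles might be consumed, the sketch below groups paragraph text by role. It assumes the layout result has already been parsed into a dictionary with a `paragraphs` list whose entries carry `content` and an optional `role`; that shape, the `group_paragraphs_by_role` helper, and the sample data are assumptions made for this example, not the documented response contract.

```python
from collections import defaultdict

# Roles named in the announcement. Paragraphs without one of these roles
# are treated as ordinary body text in this sketch.
KNOWN_ROLES = {
    "title", "sectionHeading", "footnote",
    "pageHeader", "pageFooter", "pageNumber",
}


def group_paragraphs_by_role(analyze_result):
    """Return {role: [paragraph text, ...]} for an assumed layout result dict."""
    grouped = defaultdict(list)
    for paragraph in analyze_result.get("paragraphs", []):
        role = paragraph.get("role")
        key = role if role in KNOWN_ROLES else "body"
        grouped[key].append(paragraph.get("content", ""))
    return dict(grouped)


# Hypothetical sample data, shaped like the assumed response.
sample_result = {
    "paragraphs": [
        {"role": "title", "content": "Quarterly Report"},
        {"role": "sectionHeading", "content": "1. Overview"},
        {"content": "Revenue grew in the third quarter."},
        {"role": "pageNumber", "content": "1"},
    ]
}

grouped = group_paragraphs_by_role(sample_result)
print(grouped["title"])  # ['Quarterly Report']
print(grouped["body"])   # ['Revenue grew in the third quarter.']
```

Grouping this way makes it easy to drop page headers, footers, and page numbers before passing the body text to a downstream step such as search indexing.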

Cross-page tables in custom models

Tabular fields capture repeating structure in a document as tables. Starting with the June 2022 API version, custom neural models add support for tabular fields.

With this release, tabular field extraction in custom template models is also updated to extract tables that span multiple pages. Support for tables spanning multiple pages has been a frequent ask, and the custom neural models will also support cross-page tables by default. If you have a dataset labeled with tables, train a model with the current API to start seeing multi-page tables in the response.
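One way to check whether a trained model is returning multi-page tables is to look at the pages each extracted field covers. The sketch below assumes each field in the parsed response carries a `boundingRegions` list whose entries include a `pageNumber`; that shape, the helper names, and the sample field names are assumptions for illustration only.

```python
def pages_covered(field):
    """Sorted, de-duplicated page numbers from a field's assumed boundingRegions."""
    regions = field.get("boundingRegions", [])
    return sorted({region["pageNumber"] for region in regions})


def fields_spanning_pages(fields):
    """Return {field name: [pages]} for fields that cover more than one page."""
    spanning = {}
    for name, field in fields.items():
        pages = pages_covered(field)
        if len(pages) > 1:
            spanning[name] = pages
    return spanning


# Hypothetical extracted fields from a custom model response.
sample_fields = {
    "LineItems": {"boundingRegions": [{"pageNumber": 1}, {"pageNumber": 2}]},
    "Summary": {"boundingRegions": [{"pageNumber": 3}]},
}

multi_page = fields_spanning_pages(sample_fields)
print(multi_page)  # {'LineItems': [1, 2]}
```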

Invoice model language expansion

The invoice model now supports additional languages! In addition to English and Spanish, the languages now supported include German (de-DE), French (fr-FR), Italian (it-IT), Portuguese (pt-PT) and Dutch (nl-NL). This opens up the procurement scenarios to invoices in many different languages.

For times when you need a few additional fields that are not in the invoice schema, the invoice output now includes the key-value pairs produced by the General Document model. This pre-trained model extracts key-value pairs from documents, and the invoice output will include all of the pairs it identifies. The additional output is included with no pricing changes.
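The sketch below shows one way to pull those extra key-value pairs out of an invoice result once it has been parsed into a dictionary. It assumes a top-level `keyValuePairs` list whose entries have a `key`, an optional `value`, and a `confidence`; that shape, the `extra_key_values` helper, the 0.5 threshold, and the sample values are all assumptions for this example.

```python
def extra_key_values(analyze_result, min_confidence=0.5):
    """Collect key-value pairs above a confidence threshold.

    Keys are trimmed and a trailing colon is removed. A key that was found
    without a value maps to None.
    """
    extras = {}
    for pair in analyze_result.get("keyValuePairs", []):
        if pair.get("confidence", 0.0) < min_confidence:
            continue
        key = pair.get("key", {}).get("content", "").strip().rstrip(":").strip()
        value = (pair.get("value") or {}).get("content")
        if key:
            extras[key] = value
    return extras


# Hypothetical invoice result, shaped like the assumed response.
sample_invoice = {
    "keyValuePairs": [
        {"key": {"content": "Cost Center:"}, "value": {"content": "CC-1042"}, "confidence": 0.92},
        {"key": {"content": "Approved by"}, "confidence": 0.81},
        {"key": {"content": "Ref"}, "value": {"content": "??"}, "confidence": 0.2},
    ]
}

extras = extra_key_values(sample_invoice)
print(extras)  # {'Cost Center': 'CC-1042', 'Approved by': None}
```

Filtering on confidence keeps low-quality pairs out of downstream systems; the right threshold depends on your documents and should be tuned against labeled samples.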

Read support for Microsoft Office and HTML documents

The Read API now supports extracting text from Microsoft Word, Excel, PowerPoint, and HTML documents, including text from any images embedded in Microsoft Office files. This enables customers to centralize their document ingestion with Form Recognizer and have fewer integrations to maintain over time.
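When sending mixed file types through a single ingestion path, a small lookup from file extension to media type can help set the request content type. The sketch below uses the standard registered media types for the Office Open XML and HTML formats; the `content_type_for` helper is hypothetical, and whether the service expects exactly these values for each format is an assumption you should confirm against the API reference.

```python
from pathlib import Path

# Standard media types for the newly supported formats.
CONTENT_TYPES = {
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".html": "text/html",
    ".htm": "text/html",
}


def content_type_for(filename):
    """Return the media type for a supported file name, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"Unsupported file type in this sketch: {suffix or filename!r}")
    return CONTENT_TYPES[suffix]


print(content_type_for("report.HTML"))  # text/html
```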

Business card model updates

Form Recognizer models continue to extend the languages they support. Japanese is a newly supported language for the business card model. Try the updated business card model in the Form Recognizer Studio.

ID model updates

The ID prebuilt model now extracts DateOfIssue, Height, Weight, EyeColor, HairColor, and DocumentDiscriminator from US driver’s licenses. Try the updated ID model in the Form Recognizer Studio.
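To see which of the newly added fields a given license actually returned, you can filter the extracted fields by name. The sketch below assumes an analyzed document parsed into a dictionary with a `fields` mapping from field name to an object that has a `content` string; that shape, the `new_license_fields` helper, and the sample values are assumptions for illustration.

```python
# Field names listed in the announcement for US driver's licenses.
NEW_LICENSE_FIELDS = (
    "DateOfIssue", "Height", "Weight",
    "EyeColor", "HairColor", "DocumentDiscriminator",
)


def new_license_fields(document):
    """Return {field name: content} for the newly added fields that are present."""
    fields = document.get("fields", {})
    return {
        name: fields[name].get("content")
        for name in NEW_LICENSE_FIELDS
        if name in fields
    }


# Hypothetical analyzed document; values are made up for the example.
sample_document = {
    "fields": {
        "FirstName": {"content": "ALEX"},
        "DateOfIssue": {"content": "01/15/2020"},
        "EyeColor": {"content": "BRN"},
    }
}

present = new_license_fields(sample_document)
print(present)  # {'DateOfIssue': '01/15/2020', 'EyeColor': 'BRN'}
```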

A few examples of how customers are generating value with Form Recognizer

Air Canada was tasked with verifying the COVID-19 vaccination status of thousands of worldwide employees in only two months. See how Air Canada was able to design and build a solution in weeks using Form Recognizer, meeting the government mandate on time and saving thousands of hours of manual work.

Fujitsu, the world leader in document scanning technology with more than 50 percent of global market share, uses Form Recognizer to improve the performance and accuracy of its cloud scanning solution. In only a few months, it has boosted character recognition rates to as high as 99.9 percent. Learn more about how Fujitsu delivers market-leading innovation and gives its customers powerful and flexible tools for end-to-end document management.