This blog has been authored by Neta Haiby (Principal PM, Azure AI) and Prachi Jain (PMM, Azure AI)
Documents are everywhere and often contain information that is essential to driving business outcomes; however, extracting that data quickly and accurately is a challenge for many organizations. Manual extraction lengthens processing cycles and introduces errors and inefficiencies. Extracting text and structure from documents with Form Recognizer helps tackle these challenges and boost productivity.
We are excited to announce the general availability (GA) release of Form Recognizer. You can now extract text, tables, and key-value pairs quickly and accurately from documents. It supports multi-page documents (images, PDFs, and TIFF files) and returns a structured representation of the document and its contents.
Form Recognizer comprises the following:
1. Layout
Detects and extracts text and table structure from documents.
How to use and get started
You can use Form Recognizer Layout to recognize tables, text lines and words in documents, without needing to train a model. To get started you can use the following:
- Extract the layout of a document, using the StartRecognizeContentFromUri method in the Form Recognizer client library.
- Follow this QuickStart to extract the layout of a document using the REST API.
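Once a Layout request completes, the result still has to be turned into something your application can use. The following is a minimal Python sketch of that step, not an official sample. It assumes a result shaped like the REST API's JSON output, with text lines under `readResults` and table cells under `pageResults`; those field names, the helper names `tables_to_grids` and `page_lines`, and the sample data are assumptions for illustration, so check them against the QuickStart before relying on them.

```python
from typing import Any, Dict, List


def tables_to_grids(analyze_result: Dict[str, Any]) -> List[List[List[str]]]:
    """Turn each table in a Layout result into a row-major grid of cell text."""
    grids = []
    for page in analyze_result.get("pageResults", []):
        for table in page.get("tables", []):
            # Start from an empty grid sized by the table's declared shape.
            grid = [[""] * table["columns"] for _ in range(table["rows"])]
            for cell in table["cells"]:
                grid[cell["rowIndex"]][cell["columnIndex"]] = cell["text"]
            grids.append(grid)
    return grids


def page_lines(analyze_result: Dict[str, Any]) -> List[List[str]]:
    """Return the recognized text lines, grouped by page."""
    return [
        [line["text"] for line in page.get("lines", [])]
        for page in analyze_result.get("readResults", [])
    ]


# Hand-written sample for illustration only; not real service output.
sample_layout = {
    "readResults": [
        {"page": 1, "lines": [{"text": "Invoice 1001"}, {"text": "Item Qty"}]}
    ],
    "pageResults": [
        {
            "page": 1,
            "tables": [
                {
                    "rows": 2,
                    "columns": 2,
                    "cells": [
                        {"rowIndex": 0, "columnIndex": 0, "text": "Item"},
                        {"rowIndex": 0, "columnIndex": 1, "text": "Qty"},
                        {"rowIndex": 1, "columnIndex": 0, "text": "Widget"},
                        {"rowIndex": 1, "columnIndex": 1, "text": "3"},
                    ],
                }
            ],
        }
    ],
}

for grid in tables_to_grids(sample_layout):
    for row in grid:
        print(" | ".join(row))
```

Building the grid from row and column indexes keeps the code independent of the order in which cells are returned.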
2. Pre-built
These are pre-trained models for common scenarios that extract values of interest from documents. The pre-built receipt model, which extracts data from receipts, is generally available today.
How to use and get started
You can use Form Recognizer to extract common fields from receipts, using a pre-trained receipt model. To get started you can use the following:
- Extract data from receipts using the StartRecognizeReceiptsFromUri method in the Form Recognizer client library.
- Follow this QuickStart to extract data from receipts using the REST API.
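Receipt fields typically come back with a confidence score, so a common next step is to keep only the fields you trust. The sketch below is one possible way to do that in Python and is not an official sample. It assumes a result with a `documentResults` list whose `fields` entries carry a typed `value...` property, the raw `text`, and a `confidence`; those names, the helper `receipt_fields`, and the sample data are assumptions for illustration and should be verified against the QuickStart.

```python
from typing import Any, Dict, List


def receipt_fields(
    analyze_result: Dict[str, Any], min_confidence: float = 0.0
) -> List[Dict[str, Any]]:
    """Flatten each recognized receipt into {field name: value}.

    Fields whose confidence is below min_confidence are dropped.
    """
    receipts = []
    for doc in analyze_result.get("documentResults", []):
        flat = {}
        for name, field in doc.get("fields", {}).items():
            if not field or field.get("confidence", 0.0) < min_confidence:
                continue
            # Prefer the typed value (e.g. valueString, valueNumber) if present,
            # otherwise fall back to the raw recognized text.
            typed = [v for k, v in field.items() if k.startswith("value")]
            flat[name] = typed[0] if typed else field.get("text")
        receipts.append(flat)
    return receipts


# Hand-written sample for illustration only; not real service output.
sample_receipt = {
    "documentResults": [
        {
            "fields": {
                "MerchantName": {
                    "type": "string",
                    "valueString": "Contoso",
                    "text": "Contoso",
                    "confidence": 0.98,
                },
                "Total": {
                    "type": "number",
                    "valueNumber": 14.5,
                    "text": "$14.50",
                    "confidence": 0.91,
                },
                "Tip": {
                    "type": "number",
                    "valueNumber": 2.0,
                    "text": "2.00",
                    "confidence": 0.30,
                },
            }
        }
    ]
}

print(receipt_fields(sample_receipt, min_confidence=0.5))
```

The right threshold depends on your documents; fields that fall below it can be routed to manual review instead of being discarded.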
3. Custom
Custom models let you train on your own data so that the service learns the structure of your documents, using either unsupervised or supervised learning.
How to use and get started
You can train custom models tailored to your own documents. A trained model can output structured data that includes the text, tables, and key-value pair relationships in the original form document. After you train the model, you can test it and then use it to reliably extract data from more forms according to your needs. To get started you can use the following:
- Train a custom model without labels and analyze your data using the Custom Form Model methods in the Form Recognizer client library, or using the REST API.
- To train a custom model with labels and analyze your data, follow this QuickStart and try out the Form Recognizer sample labeling tool located here.
Code examples
These code snippets show you how to do the following tasks with the Form Recognizer client library for .NET:
- Extract text and tables from documents using Layout
- Analyze forms with a custom model
How partners have built solutions with Form Recognizer
“Automation Anywhere has expanded the capabilities of IQ Bot to include Form Recognizer with an easy-to-use ‘IQ Bot Forms’ solution, which combines the power of Microsoft Cognitive Services with IQ Bot and RPA to accelerate the end-to-end processing of complex documents. Use cases supported include Driver's Licenses, Insurance Claims and Tax Forms. This highly secure solution comprises Automation Anywhere RPA with native Intelligent Document Processing (IDP) and Azure Cognitive Services Computer Vision API and Form Recognizer.” Shobhana Viswanathan, Director of Business Development, Automation Anywhere
“Blue Prism has always been committed to creating a connected-RPA platform that makes it easy for our customers to consume the best in AI and machine learning technologies. As part of this commitment to innovation, we recently released a Form Recognizer API skill that gives our customers the power to quickly add deep-learning algorithms, advanced machine learning, and key value pair extraction to any Blue Prism process.” Colin Redbond - Senior Vice President – Emerging Technologies at Blue Prism
“Icertis’ suite of AI technologies use machine learning to help understand the contract, its obligations and its environment better. Form Recognizer is an important tool in that arsenal that helps identify structured data in forms quickly and accurately. With a very simple training interface, it empowers the Icertis Contract Management platform users to effectively incorporate AI in their day-to-day processes while ensuring that their data is safe and protected – important steps in our vision of making contracting simple, yet powerful.” Monish Darda, CTO and Co-founder, Icertis
“With the power of Form Recognizer, Neudesic was able to create a simple interface for business users to extract data from multiple unique document sets, each containing complex data structures and dozens of data points, including tables. Users simply provide sample documents and label their data - no need to understand how Form Recognizer, or the other powerful Cognitive Services involved, should be applied, dramatically simplifying how AI can be applied to their processes.” Ken Kuzdas, Artificial Intelligence and Process Automation Lead, Neudesic
“UiPath remains committed to an open Platform and building integrations with partner AI services so you can automate document processing using your service of choice. The UiPath Activity Pack for Microsoft Azure Form Recognizer makes it easy to automate tasks that involve document data – for example, reading invoices, timesheets, tables, and reports. By combining AI-powered document extraction services from Microsoft with the industry-leading UiPath Enterprise RPA Platform, you can extract data from any document using the service of your choice – and easily leverage this data in your automated processes.” Brandon Brown, Director Integrations and Solutions Delivery, UiPath
Independent benchmark testing results
Cazton, a leader in IT and software consulting, training, and recruiting services across the United States, Canada, and Europe, performed an independent study comparing available cloud offerings for recognizing form data and concluded that Azure Form Recognizer does a fantastic job of creating a viable solution with just five sample documents. It performs end-to-end optical character recognition (OCR) on handwritten as well as digital documents with an amazing accuracy score and in just three seconds.
Chander Dhall, CEO of Cazton, said: “I am impressed with Microsoft's focus on creating artificial intelligence powered solutions that have practical uses in the enterprise.”
New in this release
In this release we are introducing the following new features:
1. Enhanced security features
- Bring your own key
 Form Recognizer automatically encrypts your data when persisting it to the cloud. This encryption protects your data and helps you meet your organizational security and compliance commitments. By default, your subscription uses Microsoft-managed encryption keys. However, you can now also manage your subscription with your own encryption keys. Customer-managed keys (CMK), also known as bring your own key (BYOK), offer greater flexibility to create, rotate, disable, and revoke access controls. You can also audit the encryption keys used to protect your data. Learn more here.
- Private endpoints
 Private endpoints let clients on a virtual network (VNet) securely access data over a Private Link.
2. Better Accuracy
- Table enhancements and Extraction enhancements
 This release includes extraction enhancements, accuracy improvements, and table extraction enhancements; specifically, the ability to learn table headers and structures when training a custom model without labels.
- Currency support
 Detects and extracts global currency symbols.
3. Extended Availability
- Azure Gov
 Form Recognizer is available in 22 commercial regions and also in Azure Government.
Get started
- To get started, create a Form Recognizer resource in the Azure portal and follow one of our QuickStarts to extract data from your documents.
- To learn more about Form Recognizer and the rest of the Azure AI ecosystem, please visit our website and read the documentation.
- For additional questions, please reach out to us at formrecog_contact@microsoft.com