Solution for OCR Forms Processing and Entity Extraction

Vinay Bhatia · ‎Dec 07 2017

I need to propose a solution that ingests scanned copies of forms and extracts entities and fields.

For example, extract the Invoice Number, Date, Customer Name, Product Names, Raw Material Names, etc. from an invoice form.

Many third-party companies offer to digitize your data. I was wondering what a good solution architecture would be if we were to implement this using Azure.

Should we consider Azure Computer Vision, Custom Vision, Entity Linking Intelligence Service API, Named Entity Extraction, LUIS, ML, etc.? Any suggestions on how we can approach this?

Goofr · ‎Mar 12 2019

Hi,

Any outcomes on your research?

Thanks in advance!

Frank

Vinay Bhatia · ‎Jun 12 2019

@Vinay Bhatia I was wondering if you found anything on this issue. I have a similar goal and am trying to use the Computer Vision API from Cognitive Services.

Thank You.

VinayB · ‎Jun 12 2019

@Deleted It's been more than a year since we created the solution. From what I can recollect:
We used an image classifier to determine whether the scanned image was of type Form 1 or Form 2.
We then used the Azure Computer Vision API to extract the text within the image.
Finally, we used a combination of LUIS and some regex string manipulation to extract the field values.
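
To illustrate the last step, here is a minimal sketch of the regex part only. It is not the code from the original solution: it assumes the OCR text has already been returned as a plain string, it does not call Azure or LUIS, and the `FIELD_PATTERNS` table, the `extract_fields` helper, and the label formats are all hypothetical and would need adjusting for each form type.

```python
import re

# Hypothetical label patterns for one form type. Real forms will use
# different labels and value formats, so treat these as placeholders.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*([A-Z0-9\-]+)", re.IGNORECASE
    ),
    "date": re.compile(
        r"Date\s*[:\-]?\s*(\d{1,2}[/\-.]\d{1,2}[/\-.]\d{2,4})", re.IGNORECASE
    ),
    "customer_name": re.compile(
        r"Customer\s*(?:Name)?\s*[:\-]?\s*([^\n]+)", re.IGNORECASE
    ),
}


def extract_fields(ocr_text):
    """Return a dict of field name -> extracted value (None if not found)."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        # Keep only the captured value, without the label or separator.
        fields[name] = match.group(1).strip() if match else None
    return fields


if __name__ == "__main__":
    # Made-up OCR output, used only to show the call.
    sample = "INVOICE\nInvoice No: INV-1042\nDate: 07/12/2017\nCustomer Name: Contoso Ltd\n"
    print(extract_fields(sample))
```

If the classifier distinguishes Form 1 from Form 2, one pattern table per form type could be kept and selected by the predicted class. Fields that do not follow a fixed label (free-text product names, for instance) are where something like LUIS would come in instead of regex.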
