Dec 07 2017 12:00 PM - edited Apr 21 2018 11:24 AM
I need to propose a solution that will ingest scanned copies of forms and extract entities and fields.
For example, extract the Invoice Number, Date, Customer Name, Product Names, Raw Material Names, etc. from an invoice form.
There are many third-party companies that offer the ability to digitize your data. I was wondering what a good solution architecture would be if we were to implement this using Azure.
Should we consider Azure Computer Vision, Custom Vision, the Entity Linking Intelligence Service API, Named Entity Extraction, LUIS, ML, etc.? Any suggestions on how we can approach this?
Mar 12 2019 02:11 AM
Hi,
Any outcomes on your research?
Thanks in advance!
Frank
Jun 12 2019 12:53 PM
@Vinay Bhatia I was wondering if you found anything on this issue. I have a similar goal and am trying to use the Computer Vision API from Cognitive Services.
Thank You.
Jun 12 2019 01:25 PM
@Deleted It's been more than a year since we created the solution. From what I can recollect,
we first used an image classifier to determine whether a scanned image was of type Form 1 or Form 2.
We then used the Azure Computer Vision API to extract the text within the image.
Finally, we used a combination of LUIS and some regex string manipulation to extract the field values.
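
For anyone trying the same approach, here is a minimal sketch of only the last step (pulling field values out of OCR text with regular expressions). It is not the code we actually used. It assumes the OCR step has already returned the page as plain text, and the field names, label wording, and patterns are hypothetical; each form type would need its own set. The classifier, the Computer Vision call, and LUIS are left out.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns for an invoice-like form. The labels ("Invoice No",
# "Date", "Customer Name") are assumptions; adjust them to the real form.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*([A-Z0-9\-]+)", re.IGNORECASE
    ),
    "date": re.compile(
        r"Date\s*[:\-]?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})", re.IGNORECASE
    ),
    "customer_name": re.compile(
        r"Customer(?:\s*Name)?\s*[:\-]?\s*(.+)", re.IGNORECASE
    ),
}


def extract_fields(ocr_text: str) -> Dict[str, Optional[str]]:
    """Return the first match for each field, or None if the label is absent."""
    fields: Dict[str, Optional[str]] = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1).strip() if match else None
    return fields


if __name__ == "__main__":
    # Made-up sample text standing in for the OCR output of a scanned invoice.
    sample = "INVOICE\nInvoice No: INV-1042\nDate: 07/12/2017\nCustomer Name: Contoso Ltd\n"
    print(extract_fields(sample))
```

Regexes like these only work when the label sits right next to its value in the OCR output. For free-form fields such as product or raw material names, that is where something like LUIS or a trained entity extractor would come in.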