Shopping malls face strong competition, and effective loyalty programs boost customer retention. The primary goal of the loyalty scheme is to promote loyalty to the mall and increase footfall while building an understanding of shopping habits. With a large number of stores and varied receipt formats in a mall, the manual process that was in place for checking and verifying submitted data did enable rewards to be issued, but it proved slow, expensive, inconsistent, and non-scalable. It also did not capture the valuable line-item and product information the mall needed to understand shopping habits. Therefore, one of the largest shopping malls used Azure Form Recognizer to automate receipt scanning and data extraction and to feed the data as reward points into the customer’s loyalty program, which greatly improved the customer shopping experience.
In this project, the key fields that need to be extracted to feed into reward systems are:
Given that there are more than 1000 mixed-language receipt formats, no clearly identifiable pattern for where retailers place line items on receipts, and some customized fields that need to be extracted, a combination of the Form Recognizer prebuilt Receipts model and Custom Form models is used to achieve 90%+ overall accuracy.
The solution architecture below was built to meet the customer's requirements.
The solution follows a serverless PaaS architecture using Azure Functions.
For ingestion, a trigger function pulls the data into Azure Blob storage.
For each store, master details such as store name and unit number are stored as additional metadata in Table storage. The IDs of all trained models are also tagged to each store in Table storage.
For each receipt format, the Form Recognizer Sample Labeling Tool is used to prepare the labeled data for training Custom Form models. Form Recognizer is called from Azure Functions so that, once the data has been extracted, business functions can format it before it is provided to the client.
Azure Event Grid also plays a vital role in the architecture: because the functions have a timeout of 10 minutes, each subsection is split into event-triggered functions that run as a batch process. The prebuilt Receipts model and the Custom Form models are called to generate the final output.
Event Grid has an auto failover mechanism in the event of any failures within the function.
Azure Storage (Blob/Table) is used for storing training data, store metadata, request and response logs, and custom model information.
How the solution works
Step 1-2: The client (mobile) application sends a request to an HTTP function. The function reads all the data from the request body, publishes it as an event, and returns an acknowledgement message.
Step 3-5: The prebuilt Receipts API is called from an Event Grid-triggered function to extract information such as Time and Date. Fields such as Store Name, Total Amount and Receipt Number are extracted with Custom models. To identify which model to call for a receipt, a metadata set of synonyms is kept that uniquely identifies each store.
Step 6: Once Form Recognizer returns the fields, business logic is used to clean and format the data. The data is then returned to the client's webhook.
Step 7: Timer triggers in the flow collect data such as the active/inactive stores on the customer side.
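The acknowledge-first, process-later pattern in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: an in-memory queue stands in for Azure Event Grid, and handle_http_request, process_next_event, and the payload fields are hypothetical names.

```python
import json
import queue
import uuid

# Stand-in for Azure Event Grid (assumption: the real solution publishes
# events to Event Grid and event-triggered functions consume them).
event_bus = queue.Queue()


def handle_http_request(body: str) -> dict:
    """Steps 1-2: read the request body, publish it as an event, acknowledge."""
    payload = json.loads(body)
    event = {"id": str(uuid.uuid4()), "data": payload}
    event_bus.put(event)
    # Return immediately; extraction happens later in a separate function.
    return {"status": "accepted", "event_id": event["id"]}


def process_next_event(extract, deliver) -> None:
    """Steps 3-6: consume one event, extract fields, send them to a webhook.

    `extract` and `deliver` are hypothetical callables standing in for the
    Form Recognizer calls and the client's webhook.
    """
    event = event_bus.get_nowait()
    fields = extract(event["data"])
    deliver({"event_id": event["id"], "fields": fields})
```

Keeping the HTTP function this small means the client gets its acknowledgement quickly, while the slower extraction work runs in separate functions that each stay within the timeout.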
The following pre-processing techniques were employed to clean the incoming receipts.
A custom cleaning step to reduce the receipt to 5 inches (this standardized the processing, which helped).
A filter to exclude any files larger than 50 MB, which is a requirement of Form Recognizer itself.
Date format conversion into a common format was key logic for increasing date field accuracy across different templates. A mapping table is maintained to record the format used by each store, because some stores use dd/mm/yyyy and others use mm/dd/yyyy.
Amount data clean-up: removing currency, decimal, and alphanumeric prefix and suffix characters; comparing with the prebuilt model output if the custom model output is empty; checking whether the number of characters in the total amount exceeds a limit; using regex for pattern detection and recognition; and so on.
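A few of the clean-up rules above can be sketched in Python. This is an illustrative sketch only: the store IDs, the contents of the mapping table, the ISO output format, the 10-character limit, and the treatment of commas as thousands separators are all assumptions, and the function names are hypothetical.

```python
import re
from datetime import datetime
from typing import Optional

MAX_FILE_BYTES = 50 * 1024 * 1024  # the 50 MB limit mentioned above

# Hypothetical mapping table: store ID -> date format used on its receipts.
STORE_DATE_FORMATS = {
    "store_a": "%d/%m/%Y",
    "store_b": "%m/%d/%Y",
}

# A run of digits with an optional one- or two-digit decimal part.
AMOUNT_PATTERN = re.compile(r"\d+(?:\.\d{1,2})?")


def within_size_limit(size_bytes: int) -> bool:
    """Reject files that exceed the service's size limit."""
    return size_bytes <= MAX_FILE_BYTES


def normalize_date(raw: str, store_id: str) -> Optional[str]:
    """Convert a store-specific date string to YYYY-MM-DD (assumed target)."""
    fmt = STORE_DATE_FORMATS.get(store_id)
    if fmt is None:
        return None
    try:
        return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
    except ValueError:
        return None  # the text does not fit the store's known format


def clean_amount(custom_value: Optional[str],
                 prebuilt_value: Optional[str] = None,
                 max_chars: int = 10) -> Optional[str]:
    """Strip currency and stray characters; fall back to the prebuilt value."""
    for candidate in (custom_value, prebuilt_value):
        if not candidate:
            continue
        text = candidate.replace(",", "")  # assumption: comma = thousands separator
        match = AMOUNT_PATTERN.search(text)
        if match and len(match.group()) <= max_chars:
            return match.group()
    return None
```

For example, normalize_date("03/04/2022", "store_a") gives "2022-04-03", while the same text for "store_b" gives "2022-03-04", which is why a per-store mapping matters for ambiguous dates.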
A two-layer approach is employed to identify the correct model. In the first layer, the output from the prebuilt Receipts model is used to correlate the store name with the entries in the master synonym database.
Synonyms are key to making sure that the correct model is identified, after which the relevant store model is called to extract the final set of fields. If the synonym approach fails, the second layer runs a keyword match against the plain OCR text.
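The two-layer lookup can be sketched as follows. This is a minimal sketch under assumptions: the synonym data and model IDs are made up, exact matching after lowercasing is assumed for the first layer, and select_model is a hypothetical name rather than the project's actual function.

```python
from typing import Optional

# Hypothetical synonym metadata: custom model ID -> known store name variants.
STORE_SYNONYMS = {
    "model-coffee-01": ["acme coffee", "acme cafe"],
    "model-books-02": ["city books", "citybooks pte"],
}


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so comparisons are consistent."""
    return " ".join(text.lower().split())


def select_model(prebuilt_store_name: Optional[str], ocr_text: str) -> Optional[str]:
    """Return the custom model ID for a receipt, or None if no store matches."""
    # Layer 1: match the store name returned by the prebuilt Receipts model
    # against the synonym list.
    if prebuilt_store_name:
        name = _normalize(prebuilt_store_name)
        for model_id, synonyms in STORE_SYNONYMS.items():
            if name in synonyms:
                return model_id
    # Layer 2: fall back to a keyword search over the plain OCR text.
    text = _normalize(ocr_text)
    for model_id, synonyms in STORE_SYNONYMS.items():
        if any(synonym in text for synonym in synonyms):
            return model_id
    return None
```

Returning None when both layers fail leaves the caller free to decide what to do with unmatched receipts, for example logging them for review.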
With 5000+ receipts analyzed daily from 1200+ stores, the solution achieves an overall accuracy of 90%+. The accuracy can be improved iteratively by addressing edge cases in production and by upgrading to Form Recognizer v3.0.