Introduction
As the COVID-19 pandemic continues, more and more organizations and venues are mandating vaccinations for individuals who choose to gather in large groups. In this context, one of the largest commercial real-estate companies approached me to ask whether I could help them scan individuals' vaccination cards using Azure AI services. This set me on a journey to explore Azure Form Recognizer, part of Azure Applied AI Services, which uses AI to identify and extract key-value pairs and layout information, such as tables and selection marks, from documents. My goal was to train a model to scan and validate vaccination cards with just a few samples in less than an hour. Form Recognizer is a great tool for Robotic Process Automation (RPA) use cases like this, where we need to automate the reading of physical documents without writing a lot of code or needing extensive data science expertise. Below are the steps I followed to build such a system.
1. Setup Process
In my effort to build a vaccination card reader, my first order of business was to procure some training data, i.e., pictures of vaccination cards. Thanks to my colleagues, friends, and the internet, I was able to collect a few different vaccination cards from different states and with different brands of vaccines. For good measure, I also took multiple pictures of some cards from different angles (including upside down) to make the model more robust. Once I had about ten pictures, I uploaded them into an Azure Blob Storage container to label them and train the model.
Pro-tip: To ensure we test the model with pictures the model hasn’t seen during the training process, do not upload all your pictures into the blob storage location for labeling and training, but set aside a few pictures on your local disk for testing purposes.
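As a minimal sketch of this tip, the snippet below splits a list of pictures into a set to upload and a set to hold back for testing. The file names, the helper name split_holdout, and the holdout count of two are assumptions for illustration, not part of my original setup.

```python
import random
from pathlib import Path


def split_holdout(image_paths, holdout_count=2, seed=42):
    """Return (training, holdout) lists; holdout images stay on local disk."""
    if holdout_count >= len(image_paths):
        raise ValueError("holdout_count must be smaller than the number of images")
    shuffled = sorted(image_paths)          # sort first so the split is reproducible
    random.Random(seed).shuffle(shuffled)   # fixed seed gives the same split every run
    return shuffled[holdout_count:], shuffled[:holdout_count]


# Hypothetical file names; replace with your own pictures.
cards = [Path(f"card_{i:02d}.jpg") for i in range(1, 11)]
train_set, test_set = split_holdout(cards, holdout_count=2)
print(f"{len(train_set)} to upload, {len(test_set)} held back")
```

Only the files in train_set would go to the blob container; the ones in test_set are used later to check the trained model.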
My next task was to stand up a Form Recognizer service in Azure, which was a pretty straightforward process once I followed the instructions in the documentation. I chose the Free (F0) pricing tier, as this was just a test. Apart from the service name, location, and resource group, I left all settings at their defaults.
Once the service was created, I saved the endpoint and key, which I needed to connect to the service and train the model.
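One way to keep the endpoint and key out of source code is to read them from environment variables. The sketch below assumes two variable names of my own choosing (FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY) and returns the header that the Form Recognizer REST API uses for key authentication.

```python
import os


def form_recognizer_config(env=os.environ):
    """Read the endpoint and key from environment variables (the names are assumptions)."""
    endpoint = env.get("FORM_RECOGNIZER_ENDPOINT", "").rstrip("/")
    key = env.get("FORM_RECOGNIZER_KEY", "")
    if not endpoint.startswith("https://") or not key:
        raise RuntimeError("Set FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY first")
    # The REST API expects the resource key in this request header.
    return endpoint, {"Ocp-Apim-Subscription-Key": key}
```

Calling form_recognizer_config() with no arguments reads the real environment; passing a plain dictionary instead makes the function easy to try out without touching your shell settings.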
2. Labeling and Training
Now that I had the pictures and the AI service ready, I proceeded to label the pictures to show the system which portions of the card it needs to read and comprehend. For this I used the OCR (Optical Character Recognition) Form Labeling Tool, an open-source tool built specifically for this purpose and available on GitHub. I found that the best way to use this tool locally on a Windows machine is to install Docker and run the container that hosts the application, as described in the documentation. If you already have Docker installed and running, you just need to run the two commands below, and you can then access the tool at http://localhost:3000.
docker pull mcr.microsoft.com/azure-cognitive-services/custom-form/labeltool:latest-2.1
docker run -it -p 3000:80 mcr.microsoft.com/azure-cognitive-services/custom-form/labeltool:latest-2.1 eula=accept
The labeling tool is also available at https://fott-2-1.azurewebsites.net/
Once I had the OCR tool up and running, I proceeded to create a project. To do so, I had to provide the following:
- The Form Recognizer service endpoint and key
- A connection to the Azure Blob Storage container
While the first item was easy using the endpoint and key, the second was more complex. First, I had to ensure that the cross-origin resource sharing (CORS) setting was configured correctly on the blob storage account. Then I had to make sure I generated the SAS URL at the container level and not at the storage account level.
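Since the container-level versus account-level distinction is easy to get wrong, here is a rough sketch of a check that inspects a SAS URL before you paste it into the tool. It relies on the standard SAS query parameters (sr=c marks a container, while ss and srt appear on account-level tokens). The helper name and the default permission set of read, write, delete, and list are my assumptions; adjust them to whatever the labeling tool documentation asks for.

```python
from urllib.parse import parse_qs, urlsplit


def check_container_sas(sas_url, required_permissions="rwdl"):
    """Return a list of problems; an empty list means the URL looks container-level."""
    parts = urlsplit(sas_url)
    query = parse_qs(parts.query)
    problems = []

    # A container URL has exactly one path segment: the container name.
    if len([s for s in parts.path.split("/") if s]) != 1:
        problems.append("path should be exactly /<container-name>")
    # Account-level SAS tokens carry ss (services) and srt (resource types).
    if "ss" in query or "srt" in query:
        problems.append("account-level SAS detected (ss/srt parameters present)")
    elif query.get("sr") != ["c"]:
        problems.append("signed resource (sr) should be 'c' for a container")
    if "sig" not in query:
        problems.append("signature (sig) is missing")

    # required_permissions is an assumption of this sketch, not a verified rule.
    granted = query.get("sp", [""])[0]
    missing = [p for p in required_permissions if p not in granted]
    if missing:
        problems.append("missing permissions: " + "".join(missing))
    return problems
```

This only looks at the shape of the URL. It does not contact Azure, so it cannot tell you whether the token has expired or whether CORS is configured correctly.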
Azure Blob Container Connection Settings were set as below
Once the connection to the Form Recognizer service and the Blob Storage container was established, the documents automatically showed up in the labeling area. The tool was also smart enough to create bounding boxes and highlight all text in yellow. All I had to do was create a tag in the top-right corner, click on the relevant text segment, and click on the tag again to assign that tag to the text segment. The tagging process was very simple, and it took less than five minutes to tag all the pictures.
Once I had five documents labeled, I clicked on the Train icon and a model was created almost immediately. Because I had only a few images to train on, training took just a few seconds.
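For readers who prefer to script this step, the sketch below builds what I understand to be the roughly equivalent call to the Form Recognizer v2.1 REST API for training with the label files the tool saved in the container. It only constructs the request and does not send it; the helper name and the placeholder values in the usage note are hypothetical.

```python
import json
import urllib.request


def build_train_request(endpoint, key, container_sas_url):
    """Build (but do not send) a v2.1 'train custom model' request that uses label files."""
    body = {
        "source": container_sas_url,  # container-level SAS URL holding images and labels
        "useLabelFile": True,         # train on the labels created in the labeling tool
    }
    return urllib.request.Request(
        url=f"{endpoint.rstrip('/')}/formrecognizer/v2.1/custom/models",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would submit the training job. On success, the service should return the location of the new model in the Location response header, which you can then poll until training finishes.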
3. Testing and Deployment
Now it was time to test my new model to see whether it recognized other vaccination cards. I had held back a couple of images from the training process to validate the model. The OCR tool also helps with testing the model hosted on the Azure Form Recognizer service. All I had to do was click on the ‘Analyze’ menu item and upload an image from my local disk. The model performed remarkably well and was able to pick out the vaccine brand name even though parts of the card were hidden by a finger and the vaccine brand was handwritten instead of printed.
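If you call the service directly instead of using the tool, the analysis result comes back as JSON. The sketch below pulls the labeled fields and their confidence scores out of a v2.1-style response. The sample response is hand-written for illustration, and the tag names VaccineBrand and CdcLogoText are hypothetical; yours will match whatever tags you created while labeling.

```python
def extract_fields(analyze_response):
    """Map each labeled field to (text, confidence); fields not found map to (None, 0.0)."""
    results = analyze_response.get("analyzeResult", {}).get("documentResults", [])
    extracted = {}
    for document in results:
        for name, field in document.get("fields", {}).items():
            if field is None:
                # The service reports a tag it could not find as null.
                extracted[name] = (None, 0.0)
            else:
                extracted[name] = (field.get("text"), field.get("confidence", 0.0))
    return extracted


# Hand-written sample in the shape of a v2.1 analyze result (not real output).
sample_response = {
    "status": "succeeded",
    "analyzeResult": {
        "documentResults": [
            {
                "fields": {
                    "VaccineBrand": {"type": "string", "text": "Pfizer", "confidence": 0.93},
                    "CdcLogoText": None,
                }
            }
        ]
    },
}

for tag, (text, confidence) in extract_fields(sample_response).items():
    print(tag, text, confidence)
```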
The model was spot-on so far, but I was curious to see whether it could identify a fake vaccination card, such as the one in a recently reported case of a traveler to Hawaii. I put the model to the test, and the results were mixed. The model was not able to pick up the CDC logo text and the vaccine brand name correctly, because this card looks characteristically different from a standard vaccination card. In that sense, though, it did flag that the card looks different and needs to be examined further to confirm its veracity.
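This behavior suggests a simple routing rule: when expected fields are missing or come back with low confidence, send the card to a person for review. The sketch below is one way to express that. The field names and the 0.8 threshold are assumptions for illustration, and the check does not detect fakes by itself; it only decides which cards deserve a closer look.

```python
def needs_manual_review(fields, required=("CdcLogoText", "VaccineBrand"), min_confidence=0.8):
    """Return the reasons a card should be reviewed by a person (empty list if none).

    `fields` maps a tag name to a (text, confidence) pair. The tag names and the
    threshold are hypothetical defaults, not values from the trained model.
    """
    reasons = []
    for name in required:
        text, confidence = fields.get(name, (None, 0.0))
        if not text:
            reasons.append(f"{name}: not found")
        elif confidence < min_confidence:
            reasons.append(f"{name}: low confidence ({confidence:.2f})")
    return reasons


# Made-up extraction results for two cards.
standard_card = {"CdcLogoText": ("CDC", 0.97), "VaccineBrand": ("Moderna", 0.91)}
unusual_card = {"CdcLogoText": (None, 0.0), "VaccineBrand": ("Modema", 0.41)}

print(needs_manual_review(standard_card))
print(needs_manual_review(unusual_card))
```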
Conclusion
The model performed incredibly well even though we built it with fewer than ten images. One way to enhance its performance would be to train separate models for different kinds of vaccination cards (handwritten, printed stickers, non-USA, etc.) and compose a hybrid model to handle all types of cards globally. Going back to the commercial real-estate company that wanted to implement this solution: they tested Azure Form Recognizer and found it to be better than an off-the-shelf RPA product they also tested. They are currently working on building a system to read their employees' vaccination cards and record their vaccination status. Overall, this was a quick and easy way to build a model to help organizations responsible for verifying vaccination status.
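As a rough illustration of the composition idea, the sketch below builds the JSON body for the v2.1 REST operation that composes several labeled custom models into one (a POST to /formrecognizer/v2.1/custom/models/compose under your endpoint). I did not build a composed model for this post; the helper name, the model IDs, and the model name are all hypothetical.

```python
import json


def build_compose_body(model_ids, model_name):
    """Build the JSON body for a v2.1 'compose custom models' request."""
    unique_ids = list(dict.fromkeys(model_ids))  # drop duplicates, keep order
    # Sanity check of this sketch (not a documented rule): composing one model is pointless.
    if len(unique_ids) < 2:
        raise ValueError("composing needs at least two distinct model IDs")
    return json.dumps({"modelIds": unique_ids, "modelName": model_name})


# Hypothetical model IDs, one per card type.
body = build_compose_body(
    ["handwritten-model-id", "sticker-model-id", "non-usa-model-id"],
    "vaccination-cards-all",
)
print(body)
```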