Retail Self-checkout Object Detection Solution using Azure Percept

Jan 27, 2022

There is high demand for self-checkouts in grocery stores. They offer several advantages, such as a faster process (and hence shorter waiting lines) and, during a pandemic such as COVID-19, safer interaction, since fewer people need to touch the products. Computer vision can help with these tasks by automatically detecting objects and counting items. This is especially useful for fruit detection, where the self-checkout kiosk would already have information on the fruits collected in the basket.

To implement the Retail Self-checkout Object Detection solution using Azure Percept, we can choose between a no code approach, an approach requiring some code (low code), and the option of customizing every small detail (pure code). This flexibility allows us to work on a vast range of projects and timeframes, for example accelerating POCs and MVPs. The platform has scalability at its core, enabling us to push the ML system to any number of edge devices.

To explore the capabilities of Azure Percept, a team from Cognizant enrolled in the Microsoft Azure Percept Bootcamp. Using the knowledge from the bootcamp, we developed a Retail Self-checkout Object Detection solution, outlined below, and deployed it to Azure Percept DK. The solution and the approaches we used are detailed in this article.

Overview of the Solution

We implemented the Retail Self-checkout Object Detection solution on Azure Percept using three different approaches, No Code, Low Code and Pure Code, all on the same fruit detection use case. Each successive approach requires more customization and allows more flexibility. Each approach is described in detail in its own section below.

A brief description of the solution can be found in the following YouTube video:

The overall architecture for the solution is shown below. The solution features an integrated framework built on key Azure services: Azure Percept Studio, Azure Machine Learning Studio, Azure Custom Vision, IoT Hub and IoT Edge.

General Pre-requisites – What is required to get started

Azure Percept DK: https://docs.microsoft.com/azure/azure-percept/overview-azure-percept-dk
Azure Percept Vision module: https://docs.microsoft.com/azure/azure-percept/azureeyemodule-overview
Microsoft Azure subscription (able to provision all required resources): https://azure.microsoft.com/free/
Set up the Azure Percept DK:

https://docs.microsoft.com/azure/azure-percept/quickstart-percept-dk-unboxing
https://docs.microsoft.com/azure/azure-percept/quickstart-percept-dk-set-up
Completing the two guides above should have provisioned an IoT Hub and given you access to Azure Percept Studio.

Fresh fruits (bananas, apples, oranges)

Solution Implementation

Retail Self-checkout Object Detection Solution: No Code Implementation

Summary

The first approach deploys an object detection model with Azure Percept without any coding. The user acquires the dataset manually with Azure Percept Vision and labels each image individually. This puts the user in direct control of what the model is trained on and gives them visibility into the performance metrics of each training iteration. Finally, we manually choose the best-performing model to deploy to Azure Percept DK.

You will need to:

Capture data with the Custom Vision service.
Label the data with the Custom Vision service.
Train an object detection model with the Custom Vision service.
Publish the model and download the solution module to Azure Percept DK with Azure Percept Studio.

Pre-requisites

“Custom Vision” resource: you will create this by following the steps below (the same steps as in the Microsoft tutorial).

Data

In this approach we use the Custom Vision service that Azure offers. This service allows us to connect a Custom Vision project to Azure Percept Studio and capture images with Azure Percept Vision, which automatically makes the images available in the Custom Vision service. After we have captured the necessary images of the fruits we are interested in, either as single snapshots or on a timer, we create bounding box labels for them. (As the Microsoft tutorial is very detailed in its steps, and will be updated as the service changes, we omit the specific steps here.)
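To make the idea of a bounding box label concrete, here is a minimal sketch of how a box drawn in pixels can be expressed relative to the image size. It assumes a normalized (left, top, width, height) convention, which is common for object detection labels; the function name `to_normalized_region` is hypothetical and is not part of any Azure SDK. In the No Code approach the labelling tool handles this for you.

```python
from typing import Tuple


def to_normalized_region(
    x_min: float,
    y_min: float,
    x_max: float,
    y_max: float,
    image_width: int,
    image_height: int,
) -> Tuple[float, float, float, float]:
    """Convert a pixel box (x_min, y_min, x_max, y_max) into a
    normalized (left, top, width, height) tuple with values in [0, 1].

    The output convention is an assumption made for this illustration.
    """
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    if not (0 <= x_min < x_max <= image_width):
        raise ValueError("x coordinates must satisfy 0 <= x_min < x_max <= width")
    if not (0 <= y_min < y_max <= image_height):
        raise ValueError("y coordinates must satisfy 0 <= y_min < y_max <= height")

    left = x_min / image_width
    top = y_min / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return left, top, width, height


# Example: a box around an apple in a 640x480 snapshot.
region = to_normalized_region(160, 120, 480, 360, 640, 480)
print(region)
```

Normalized coordinates stay valid if the image is later resized, which is one reason labelling tools tend to prefer them over raw pixels.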

After these steps of labelling our images of fresh fruits, we end up with something like the screenshot below.

Model

This part, using the Custom Vision service, is pretty straightforward. Just like I love having a limited number of coffee options to make it easier to pick one in the end, Microsoft, maybe not by choice, presents a limited number of model options to pick from. The basic idea behind the options below is probably to offer quick starting points and fairly general transfer learning options.

You can read more on the domains and model footprint here https://docs.microsoft.com/azure/cognitive-services/custom-vision-service/select-domain.

Naturally, we chose the object detection project type, and for the domain we picked General (compact), because the model needs to be pushed to Azure Percept DK and provide real-time inferencing.

Training and testing

This section is even easier than the model decision step. We only have to hit the green train button and choose between a quick training option and an “advanced” one, where advanced means specifying how many hours we want to train our model.

No need to get familiar with numerous hyperparameters and model optimization options. Start the training, come back after the coffee break, and voilà: a finished, trained model for your use case.

You can run the training multiple times; each run adds a new iteration tag to your list, indicating a new model. You can adjust the threshold values and see how your model’s performance changes. Note that these metrics are based on your training data, so they are not recommended as final model KPIs; use a separate test set for that.
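As an illustration of evaluating on a separate test set, the following is a minimal sketch of computing precision and recall for one held-out image at a chosen probability threshold. It assumes boxes are given as normalized (left, top, width, height) tuples and that a prediction counts as correct when its label matches a ground-truth box and their overlap (IoU) is at least 0.5. The function names and data layout are hypothetical, and this is not how the Custom Vision service itself computes its metrics.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, width, height)
Prediction = Tuple[str, float, Box]      # (label, probability, box)
Truth = Tuple[str, Box]                  # (label, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (left, top, width, height) boxes."""
    inter_w = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    predictions: List[Prediction],
    ground_truth: List[Truth],
    prob_threshold: float = 0.5,
    iou_threshold: float = 0.5,
) -> Tuple[float, float]:
    """Greedy matching: the most confident predictions claim boxes first."""
    kept = sorted(
        (p for p in predictions if p[1] >= prob_threshold),
        key=lambda p: p[1],
        reverse=True,
    )
    matched = set()  # indices of ground-truth boxes already claimed
    true_positives = 0
    for label, _, box in kept:
        best_idx, best_overlap = None, iou_threshold
        for idx, (gt_label, gt_box) in enumerate(ground_truth):
            if idx in matched or gt_label != label:
                continue
            overlap = iou(box, gt_box)
            if overlap >= best_overlap:
                best_idx, best_overlap = idx, overlap
        if best_idx is not None:
            matched.add(best_idx)
            true_positives += 1
    # With nothing kept or nothing labelled, report 0.0 rather than divide by zero.
    precision = true_positives / len(kept) if kept else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Made-up example data for one test image.
truth = [("apple", (0.1, 0.1, 0.2, 0.2)), ("banana", (0.5, 0.5, 0.3, 0.2))]
preds = [
    ("apple", 0.9, (0.1, 0.1, 0.2, 0.2)),
    ("banana", 0.4, (0.5, 0.5, 0.3, 0.2)),
    ("orange", 0.8, (0.7, 0.1, 0.1, 0.1)),
]
print(precision_recall(preds, truth, prob_threshold=0.5))
print(precision_recall(preds, truth, prob_threshold=0.3))
```

Lowering the threshold in this made-up example keeps the low-confidence banana prediction, which raises recall; on real data it typically also lets in more false positives, which is the trade-off the threshold slider exposes.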

Model deployment

To finish off the No Code approach, we simply follow the step-by-step instructions to select which model we want deployed and to which device (Azure Percept DK). After a few point-and-click steps, you can see the video stream of your device, with your custom labels and objects of interest.
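For a self-checkout kiosk, the labels seen on the video stream ultimately need to become item counts. The sketch below shows one way this could look, assuming the detections for a frame are available as a list of dictionaries with `label` and `confidence` keys. That data shape and the function name `count_items` are hypothetical and chosen for illustration; check the actual output format of your device before relying on it.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Union


def count_items(
    detections: Iterable[Mapping[str, Union[str, float]]],
    min_confidence: float = 0.5,
) -> Dict[str, int]:
    """Count detected objects per label, ignoring low-confidence detections.

    Each detection is assumed to look like
    {"label": "apple", "confidence": 0.81} (hypothetical format).
    """
    counts = Counter(
        str(d["label"])
        for d in detections
        # float() also accepts confidences that arrive as strings.
        if float(d["confidence"]) >= min_confidence
    )
    return dict(counts)


# Made-up detections for a single frame of the basket.
frame = [
    {"label": "banana", "confidence": 0.92},
    {"label": "apple", "confidence": 0.81},
    {"label": "apple", "confidence": 0.77},
    {"label": "orange", "confidence": 0.30},
]
print(count_items(frame))
```

In practice you would likely aggregate over several frames before trusting a count, since single-frame detections can flicker.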

Final comments

The No Code capability of Azure Percept is really easy to learn and follow along with. It will help you get an MVP IoT solution with ease!

The following image is captured from the video feed of Azure Percept Vision.