Internet of Things Blog


Bringing your Vision AI project at the edge to production is now insanely simple with VisionOnEdge

Former Employee

Apr 07, 2021

Customizing and deploying pre-existing Machine Learning (ML) models is one of the major challenges for ML customers. Our own manufacturing facility for Surface devices needed different models to detect display defects and ensure employee safety. While working with them, we learned that we cannot build models for all our customers' current and future needs.

We collaborated with the Custom Vision, Live Video Analytics (LVA), ONNX Runtime, and Azure IoT Edge teams to empower our manufacturing team to build these models themselves with no code or ML experience. We have named this solution template VisionOnEdge (VoE).

VoE empowers experts in different verticals (manufacturing, retail, etc.) to build and scale their models to existing camera feeds and locations. When the model at different locations does not meet performance benchmarks, experts are empowered to collect more data, retrain, and redeploy better models.

Simple deployment to a wide range of edge devices

VoE can be easily deployed to a wide range of edge devices, from Azure Stack Hub/HCI to Azure Stack Edge, Azure Stack Mini R, or even your own device powered by Azure IoT Edge, like the Intel AI devkit or the NVIDIA Jetson Nano.

Before deployment, you need to ensure the following:

Azure IoT Edge is running on your Linux-based edge device, or on your Windows-based device using EFLOW (see more about this below).
You have Custom Vision credentials.

You can deploy VoE on your edge device in one of three ways: shell.azure.com, an Azure Resource Manager (ARM) template, or Visual Studio Code.

Architecture

Here is a snapshot of a typical end-to-end video analytics solution leveraging VisionOnEdge.

After you send the deployment, the IoT Edge agent running on your edge device pulls the new containers from the container registry and starts them on your edge device.

Web Module: This module is the web application that the user interacts with. For example, when you add a camera, the Web Module sets the graph with all your camera settings on the Live Video Analytics module.
Live Video Analytics (LVA): This module parses frames from all the cameras and sends them to the Inference Orchestrator.
Inference Orchestrator: This module sends frames to the Predict Module and gets results. It also overlays results on the camera feed, sends an HTTP video stream to the Web Module, and sends ML results to Azure IoT Hub.
ML Predict Module: This module runs Custom Vision-trained models using ONNX Runtime. It takes frames over HTTP or gRPC and returns JSON results. In the Bring Your Own Model flow, this is replaced by a user-provided module.
Based on user settings, the Web Module can capture images automatically and send them for retraining using the customvision.ai application programming interface (API).
With LVA, users can also allow the Web Module to store videos based on inference results and push them to the Azure Media Services account they provide.
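
The frame path described in this list can be sketched in a few lines of Python. This is only an illustration of the idea, not VoE's actual code: the function names, the JSON result shape ("predictions", "label", "confidence", "box"), and the 0.5 confidence threshold are all assumptions made for the example.

```python
import json
from typing import Callable, Dict, List

# Hypothetical result shape; the real Predict Module's JSON schema may differ.
Detection = Dict[str, object]


def orchestrate_frame(
    frame: bytes,
    camera_id: str,
    predict: Callable[[bytes], str],
    min_confidence: float = 0.5,
) -> Dict[str, object]:
    """Send one frame to a predictor and package the confident detections."""
    raw = predict(frame)  # JSON text returned by the predict step
    detections: List[Detection] = json.loads(raw)["predictions"]
    kept = [d for d in detections if d["confidence"] >= min_confidence]
    return {"camera_id": camera_id, "detections": kept}


def fake_predict(frame: bytes) -> str:
    """Stand-in for the ML Predict Module; returns canned JSON."""
    return json.dumps({"predictions": [
        {"label": "person", "confidence": 0.91, "box": [0.1, 0.2, 0.4, 0.8]},
        {"label": "helmet", "confidence": 0.32, "box": [0.2, 0.1, 0.3, 0.2]},
    ]})


# One frame from a hypothetical camera "cam-1" flows through the orchestrator.
message = orchestrate_frame(b"\x00" * 16, "cam-1", fake_predict)
print(json.dumps(message))
```

Passing the predictor in as a function mirrors the Bring Your Own Model flow, where the predict step is swapped for a user-provided module while the rest of the pipeline stays the same.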

Do awesome demos with VoE

Out of the box, you will find six scenarios that can help you get started and explore the possibilities of what you can build.

Move from Demo to Pilot on actual production hardware with the same project

As you move from demo to pilot stage, you may need to run ML on your own cameras, train models for your own objects, and deploy at scale on enterprise-grade devices. Here you can use the same VoE application to build zero to hero pilots on Azure Stack devices.

VoE makes it super easy to add your own existing cameras, collect images, label your objects, and deploy. When you deploy a new task, VoE detects newly tagged images that are not part of the current ML model, automatically starts a new training job, exports the model to ONNX format, and then runs it accelerated on your edge device using ONNX Runtime.
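
The "detect newly tagged images" step boils down to a set difference. Here is a minimal sketch, assuming images are tracked by ID; the function names and IDs are hypothetical and not taken from the VoE code base.

```python
from typing import Iterable, Set


def images_needing_training(tagged: Iterable[str], trained: Iterable[str]) -> Set[str]:
    """Return IDs of tagged images that the current model was not trained on."""
    return set(tagged) - set(trained)


def should_retrain(tagged: Iterable[str], trained: Iterable[str]) -> bool:
    """A new training job is warranted when at least one tagged image is new."""
    return bool(images_needing_training(tagged, trained))


tagged_ids = ["img-001", "img-002", "img-003"]  # hypothetical image IDs
trained_ids = ["img-001", "img-002"]

print(sorted(images_needing_training(tagged_ids, trained_ids)))
print(should_retrain(tagged_ids, trained_ids))
```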

Taking images from a camera and manually tagging them can be tedious. To ease this pain, VoE captures and tags images automatically. Under the advanced task settings, users can set VoE to collect a specified number of images for retraining. VoE captures and stores these images so that users can validate the auto-labeled images on the images tab. On the next deployment, VoE automatically triggers a training job with the newly added images. This way, VoE provides an easy way to improve model performance without the pain of manually capturing and tagging images.
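
To make the capture quota concrete, here is a small sketch of a collector that stores auto-labeled frames until a user-specified count is reached and tracks which ones still await validation. The class and field names are assumptions for illustration; VoE's own implementation may be organized quite differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CapturedImage:
    frame: bytes
    label: str                # label auto-assigned from the model's prediction
    confidence: float
    validated: bool = False   # set to True once a user confirms the label


@dataclass
class RetrainCollector:
    max_images: int
    images: List[CapturedImage] = field(default_factory=list)

    def offer(self, frame: bytes, label: str, confidence: float) -> bool:
        """Store the frame with its predicted label until the quota is reached."""
        if len(self.images) >= self.max_images:
            return False
        self.images.append(CapturedImage(frame, label, confidence))
        return True

    def pending_validation(self) -> List[CapturedImage]:
        """Images the user has not yet reviewed."""
        return [img for img in self.images if not img.validated]


# With a quota of two, the third offered frame is rejected.
collector = RetrainCollector(max_images=2)
results = [collector.offer(b"f%d" % i, "box", 0.6) for i in range(3)]
print(results, len(collector.pending_validation()))
```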

Users may want to deploy open-source models, Microsoft Cognitive Services models, or their own models trained in Azure Machine Learning (AML). VoE allows users to bring their own models in three simple steps. First, users deploy their model as a web application (here is a sample deployment using AML). Second, on the VoE model tab, users update the endpoint information and upload the model's labels.txt, which lists the objects the model can identify. Finally, on the deployment tab, they can select their own model as an option. Deploy an open-source YOLO model on VoE using the tutorial here.
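
As a rough sketch of what a bring-your-own-model endpoint has to do with labels.txt, the snippet below maps raw model scores to label names and returns JSON. It assumes labels.txt holds one label per line and that the response uses a "predictions" list; both are assumptions for this example, so check the tutorial for the exact contract VoE expects.

```python
import json
from typing import List, Sequence


def parse_labels(text: str) -> List[str]:
    """Read one label per line, skipping blank lines (assumed file format)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def to_json_response(scores: Sequence[float], labels: Sequence[str], top_k: int = 1) -> str:
    """Pair each score with its label and return the top_k results as JSON."""
    if len(scores) != len(labels):
        raise ValueError("model output size does not match labels.txt")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    predictions = [{"label": name, "confidence": score} for name, score in ranked[:top_k]]
    return json.dumps({"predictions": predictions})


# Hypothetical labels file content and model scores.
labels = parse_labels("person\nforklift\n\nhelmet\n")
response = to_json_response([0.1, 0.7, 0.2], labels)
print(response)
```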

Once users have a working model, they often want to deploy it on more than one camera. On VoE, users can do just that by choosing more than one camera in the camera option during deployment. VoE uses LVA to parse frames from multiple cameras and auto-adjusts the frame rate per camera to ensure that frames from every camera get processed. Users can see results on each camera stream by toggling the deployment view to the camera of their choice. Users can also define an area of interest in the camera feed if they want only a subsection of the camera's field of view to be processed.
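
The two ideas in this paragraph, sharing an inference budget across cameras and restricting processing to an area of interest, can be sketched as follows. The even-split policy and the rectangular area with a center-point test are simplifying assumptions for illustration; the post does not describe VoE's actual adjustment logic.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), normalized to 0..1


def per_camera_fps(total_inference_fps: float, camera_fps: Dict[str, float]) -> Dict[str, float]:
    """Split an inference budget evenly, never exceeding a camera's native rate."""
    if not camera_fps:
        return {}
    share = total_inference_fps / len(camera_fps)
    return {cam: min(native, share) for cam, native in camera_fps.items()}


def in_area_of_interest(box: Box, area: Box) -> bool:
    """True when the center of the detection box lies inside the area rectangle."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    return area[0] <= cx <= area[2] and area[1] <= cy <= area[3]


# Three hypothetical cameras sharing a budget of 30 inferences per second.
rates = per_camera_fps(30.0, {"cam-1": 30.0, "cam-2": 30.0, "cam-3": 5.0})
print(rates)
print(in_area_of_interest((0.1, 0.1, 0.3, 0.3), (0.0, 0.0, 0.5, 0.5)))
```

Note that this simple split does not hand the budget left unused by a slow camera to the faster ones; a production scheduler would likely redistribute it.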

In the advanced settings, users can enable sending messages to Azure IoT Hub. They can set how many messages they want to send and when. Once the messages reach Azure IoT Hub, users can combine IoT Hub with Logic Apps to get notifications on Teams (more details here) and view analytics in Azure Time Series Insights (more details here).
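
One way to think about "how many messages and when" is a throttle that caps messages per time window and only fires above a confidence threshold. The sketch below is an assumption about how such a setting could behave, not a description of VoE's implementation; the class name and parameters are hypothetical, and the clock is passed in explicitly so the logic is easy to test.

```python
from typing import Optional


class MessageThrottle:
    """Allow at most max_messages per window_seconds, above a confidence threshold."""

    def __init__(self, max_messages: int, window_seconds: float, min_confidence: float):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self.min_confidence = min_confidence
        self._window_start: Optional[float] = None
        self._sent = 0

    def should_send(self, confidence: float, now: float) -> bool:
        if confidence < self.min_confidence:
            return False
        # Open a fresh window when none exists or the current one has expired.
        if self._window_start is None or now - self._window_start >= self.window_seconds:
            self._window_start = now
            self._sent = 0
        if self._sent >= self.max_messages:
            return False
        self._sent += 1
        return True


# (confidence, timestamp in seconds) pairs for a stream of detections.
throttle = MessageThrottle(max_messages=2, window_seconds=60.0, min_confidence=0.5)
events = [(0.9, 0.0), (0.3, 1.0), (0.8, 2.0), (0.95, 3.0), (0.7, 61.0)]
decisions = [throttle.should_send(c, t) for c, t in events]
print(decisions)
```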

VoE also lets users store videos in the cloud using LVA, communicate between containers over gRPC with shared memory, and disable the video output with overlaid results.

From Pilot to Production

With VoE, going from pilot to production is simple and smooth. VoE is already tested on all Azure Stack production-scale devices, such as Azure Stack Edge and Azure Stack HCI. To bring your pilot work to production, you can choose any of the three paths below:

Computer Vision spatial analysis capabilities provide a production-scale end-to-end experience. Please get in touch with us if you want Microsoft to help you go to production.
We have certified ISVs, such as Linker Networks, Accenture, Lumachain, Uncanny Vision, and Vulcanai, who have already used VoE code and taken customers to production. Please get in touch with our ISVs and they can help you get to production using Azure Stack devices.
All our code is open source and available for everyone to reuse under the MIT license. Look at our APIs and reuse our code to build your own production-scale solution with Azure.

Next Steps

There is no need to clone or build any code. Just use this tutorial to deploy VoE on any Linux-based device (Azure Stack Edge, Azure VM, NVIDIA Jetson Nano, Intel AI devkit) running IoT Edge, or on Windows using WSL2 and EFLOW, in 30 minutes or less.
Watch these video tutorials to learn all the features of our web app and how to use them.
Check out our repo to get all our code and tweak it to your own needs.
Get in touch by filling out this form.

Updated Apr 07, 2021

Version 2.0
