Santa Claus has been kidnapped!
The Christmas elves have called upon you to save Santa Claus by developing an intelligent AI app. You will build an object detection system that detects Santa Claus in images taken from live cameras mounted all over the Christmas village.
Hi, I am Foteini Savvidou, a Beta Microsoft Learn Student Ambassador!
I am an undergraduate Electrical and Computer Engineering student at Aristotle University of Thessaloniki (Greece) interested in AI, cloud technologies and biomedical engineering. Always passionate about teaching and learning new things, I love helping people expand their technical skills through organizing workshops and sharing articles on my blog. My goal is to use technology to promote accessibility, digital and social inclusion.
Azure Custom Vision is an Azure Cognitive Services service that lets you build and deploy your own image classification and object detection models. Image classification models apply labels to an image, while object detection models return the bounding box coordinates in the image where the applied labels can be found.
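To make the difference concrete, here is a sketch of what a single detection prediction looks like. The dictionary below is a hypothetical stand-in for a real API result; note that Custom Vision returns bounding-box coordinates as fractions of the image size (values between 0 and 1), which you scale by the image dimensions to get pixels:

```python
# Hypothetical detection result: Custom Vision returns normalized
# bounding-box coordinates (fractions of the image size, 0-1).
prediction = {
    "tag": "Santa Claus",
    "probability": 0.92,
    "bounding_box": {"left": 0.25, "top": 0.10, "width": 0.30, "height": 0.55},
}

img_width, img_height = 640, 480

# Scale the normalized box to pixel coordinates
left = prediction["bounding_box"]["left"] * img_width      # 160.0
top = prediction["bounding_box"]["top"] * img_height       # 48.0
width = prediction["bounding_box"]["width"] * img_width    # 192.0
height = prediction["bounding_box"]["height"] * img_height # 264.0
```

An image classification model, by contrast, would return only the tag and probability, with no bounding box.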
Do you want to learn more about Azure Custom Vision? You can read my previous articles about creating a Custom Vision model for flower classification and an object detection model for grocery checkout.
In this article, we will build and deploy a festive object detection model to help the Christmas elves find Santa Claus. You will learn how to:
To complete the exercise, you will need an Azure subscription. If you don’t have one, you can sign up for an Azure free account. If you are a student, you can apply for an Azure for Students subscription.
To build and train our machine learning model, I created an image dataset consisting of 50 images of Santa Claus. You can download the dataset from my GitHub repository.
To use the Custom Vision service, you can either create a Custom Vision resource or a Cognitive Services resource. If you plan to use Custom Vision along with other cognitive services, you can create a Cognitive Services resource.
In this exercise, you will create a Custom Vision resource.
You can build and train your model by using the web portal or the Custom Vision SDKs and your preferred programming language. In this article, I will show you how to build an object detection model using the Custom Vision web portal.
Let’s test the model and see how it performs on new data. We will use the images in the Test folder you extracted previously.
The Smart Labeler enables you to quickly tag a large number of images. The service uses the latest iteration of the trained model to predict the label of the untagged images. You can then confirm or decline the suggested tag.
You can learn more about the Smart Labeler at the Custom Vision Service Documentation.
You can add more images to your model to improve the performance metrics. Learn more about how to improve your object detection model at the Custom Vision Service Documentation.
Before publishing our model, let’s test it and see how it performs on new data.
Once your model is performing at a satisfactory level, you can deploy it.
In the Custom Vision portal, click the settings icon (⚙) at the top toolbar to view the project settings. Then, under General, copy the Project ID.
Navigate to the Custom Vision portal homepage and select the settings icon (⚙) at the top right. Expand your prediction resource and save the Key and the Endpoint.
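Rather than hard-coding the Project ID, key, and endpoint into your scripts, one option is to read them from environment variables. This is a sketch; the variable names (CUSTOM_VISION_PROJECT_ID, CUSTOM_VISION_KEY, CUSTOM_VISION_ENDPOINT) are my own choices, not a convention required by the SDK:

```python
import os

# Hypothetical variable names -- set these in your shell before running, e.g.:
#   export CUSTOM_VISION_PROJECT_ID=...
#   export CUSTOM_VISION_KEY=...
#   export CUSTOM_VISION_ENDPOINT=...
# The placeholders are used as fallbacks if a variable is not set.
project_id = os.environ.get("CUSTOM_VISION_PROJECT_ID", "<YOUR_PROJECT_ID>")
prediction_key = os.environ.get("CUSTOM_VISION_KEY", "<YOUR_KEY>")
endpoint = os.environ.get("CUSTOM_VISION_ENDPOINT", "<YOUR_ENDPOINT>")
```

This keeps the key out of your source code, which matters if you push the script to a public repository.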
To create an object detection app with Custom Vision for Python, you'll need to install the Custom Vision client library. Install the Azure Cognitive Services Custom Vision SDK for Python package with pip:
pip install azure-cognitiveservices-vision-customvision
Then, create a new Python script (test.py) and open it in Visual Studio Code or in your preferred editor.
Want to view the whole Python script at once? You can find it on GitHub.
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from msrest.authentication import ApiKeyCredentials
from PIL import Image, ImageDraw, ImageFont
import numpy as np
import os
Replace <YOUR_PROJECT_ID>, <YOUR_KEY> and <YOUR_ENDPOINT> with the ID of your project, the key and the endpoint of your prediction resource, respectively.
# Create variables for your project
publish_iteration_name = "Iteration4"
project_id = "<YOUR_PROJECT_ID>"
# Create variables for your prediction resource
prediction_key = "<YOUR_KEY>"
endpoint = "<YOUR_ENDPOINT>"
prediction_credentials = ApiKeyCredentials(in_headers={"Prediction-key": prediction_key})
predictor = CustomVisionPredictionClient(endpoint, prediction_credentials)
# Detect objects in the test image
img_file = os.path.join('Images', 'Test', 'SantaClaus (1).jpg')
with open(img_file, mode="rb") as test_img:
    results = predictor.detect_image(project_id, publish_iteration_name, test_img)
# Load a test image and get its dimensions
img = Image.open(img_file)
img_height, img_width, img_ch = np.array(img).shape
# Display the image
draw = ImageDraw.Draw(img)
# Select line width and color for the bounding box
lineWidth = int(img_width/100)
color = (0,255,0)
# Display the results
for prediction in results.predictions:
    if prediction.probability > 0.5:
        left = prediction.bounding_box.left * img_width
        top = prediction.bounding_box.top * img_height
        height = prediction.bounding_box.height * img_height
        width = prediction.bounding_box.width * img_width
        # Create a rectangle
        draw.rectangle((left, top, left+width, top+height), outline=color, width=lineWidth)
        # Display probabilities
        font = ImageFont.truetype("arial.ttf", 18)
        draw.text((left, top-20), f"{prediction.probability * 100 :.2f}%", fill=color, font=font)
img.save("result.jpg")
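The threshold check and coordinate scaling above are the same in every Custom Vision detection script, so you may want to factor them into a small helper. This is a sketch; `box_to_pixels` is a hypothetical helper of my own, not part of the SDK:

```python
def box_to_pixels(bounding_box, img_width, img_height):
    """Convert a normalized Custom Vision bounding box to pixel coordinates.

    `bounding_box` only needs `left`, `top`, `width`, and `height`
    attributes in the 0-1 range. Returns (left, top, width, height)
    as integer pixel values.
    """
    left = int(bounding_box.left * img_width)
    top = int(bounding_box.top * img_height)
    width = int(bounding_box.width * img_width)
    height = int(bounding_box.height * img_height)
    return left, top, width, height


# Quick check with a stand-in object instead of a real prediction
from types import SimpleNamespace

box = SimpleNamespace(left=0.5, top=0.25, width=0.2, height=0.5)
print(box_to_pixels(box, 640, 480))  # (320, 120, 128, 240)
```

The same helper works unchanged in the camera script below, since OpenCV also draws rectangles from integer pixel coordinates.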
First, install OpenCV using the following command:
pip install opencv-python
We will use OpenCV to get an image from the camera, then we will analyze the image using our Custom Vision model and display a bounding box around every detected object.
Create a new Python script (test-camera.py) and import the following libraries.
import cv2
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from msrest.authentication import ApiKeyCredentials
# Create variables for your project
publish_iteration_name = "Iteration4"
project_id = "<YOUR_PROJECT_ID>"
# Create variables for your prediction resource
prediction_key = "<YOUR_KEY>"
endpoint = "<YOUR_ENDPOINT>"
prediction_credentials = ApiKeyCredentials(in_headers={"Prediction-key": prediction_key})
predictor = CustomVisionPredictionClient(endpoint, prediction_credentials)
camera = cv2.VideoCapture(0, cv2.CAP_DSHOW)
camera.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
camera.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
ret, image = camera.read()
cv2.imwrite('capture.png', image)
with open("capture.png", mode="rb") as captured_image:
    results = predictor.detect_image(project_id, publish_iteration_name, captured_image)
# Select color for the bounding box
color = (0,255,0)
# Display the results
result_image = image
for prediction in results.predictions:
    if prediction.probability > 0.5:
        left = prediction.bounding_box.left * 640
        top = prediction.bounding_box.top * 480
        height = prediction.bounding_box.height * 480
        width = prediction.bounding_box.width * 640
        result_image = cv2.rectangle(result_image, (int(left), int(top)), (int(left + width), int(top + height)), color, 3)
        cv2.putText(result_image, f"{prediction.probability * 100 :.2f}%", (int(left), int(top)-10), fontFace = cv2.FONT_HERSHEY_SIMPLEX, fontScale = 0.7, color = color, thickness = 2)
cv2.imwrite('result.png', result_image)
camera.release()
In this article, you learned how to create an object detection model in Azure Custom Vision and use a Custom Vision model in a Python app. If you are interested in learning more about Azure Custom Vision, check out these Microsoft Learn modules:
Share your awesome Custom Vision projects and feel free to reach out to me on LinkedIn or Twitter.
If you have finished learning, you can delete the resource group from your Azure subscription: