Microsoft Foundry Blog

How to use Cognitive Services and containers

Microsoft

Feb 08, 2021

In this blog post we take a look at how to run a selection of Cognitive Services in Docker-compatible containers. This option can come in handy if your application cannot connect to the cloud all the time or if you need more control over your data.

What are Cognitive Services

Azure Cognitive Services are cloud-based services that expose AI models through a REST API. These services enable you to add cognitive features, like object detection and speech recognition, to your applications without having data science skills. By using the provided SDKs in the programming language of your choice you can create applications that can see (Computer Vision), hear (Speech), speak (Speech), understand (Language), and even make decisions (Decision).

Cognitive Services in containers

Azure Cognitive Services in containers give developers flexibility in where to deploy and host the services. The services ship as Docker containers and keep the same API experience as when they are hosted in Azure.

Using these containers gives you the flexibility to bring Cognitive Services closer to your data for compliance, security or other operational reasons.

What are containers

Containerization is an approach to software distribution in which an application or service, including its dependencies and configuration, is packaged together as a container image. With little or no modification, a container image can be deployed on a container host. Containers are isolated from each other and the underlying operating system, with a smaller footprint than a virtual machine. Containers can be instantiated from container images for short-term tasks, and removed when no longer needed.

When to use Cognitive Services in containers?

Running Cognitive Services in containers can be the solution for you if you have specific requirements or constraints that make it impossible to run Cognitive Services in Azure. The most common scenarios involve connectivity and control over the data. If you run Cognitive Services in Azure, all the infrastructure is taken care of; running them in containers moves the infrastructure responsibility, like performance and updating the container, to you.

You might choose a container if your connection to Azure is not stable enough. For instance, suppose you have thousands of documents on-premises and you want to run OCR on them. If you use the Computer Vision OCR endpoint in the cloud, you need to send all the documents to the endpoint in Azure, whereas if you run the container locally you only need to send the billing information to Azure every 15 minutes.

Features and benefits

Immutable infrastructure: Enable DevOps teams to leverage a consistent and reliable set of known system parameters, while being able to adapt to change. Containers provide the flexibility to pivot within a predictable ecosystem and avoid configuration drift.

Control over data: Choose where your data gets processed by Cognitive Services. This can be essential if you can't send data to the cloud but need access to Cognitive Services APIs. Support consistency in hybrid environments – across data, management, identity, and security.

Control over model updates: Flexibility in versioning and updating of the models deployed in your solutions.

Portable architecture: Enables the creation of a portable application architecture that can be deployed on Azure, on-premises and the edge. Containers can be deployed directly to Azure Kubernetes Service, Azure Container Instances, or to a Kubernetes cluster deployed to Azure Stack. For more information, see Deploy Kubernetes to Azure Stack.

High throughput / low latency: Provide customers the ability to scale for high throughput and low latency requirements by enabling Cognitive Services to run physically close to their application logic and data. Containers do not cap transactions per second (TPS) and can be made to scale both up and out to handle demand if you provide the necessary hardware resources.

Scalability: With the ever-growing popularity of containerization and container orchestration software such as Kubernetes, scalability is at the forefront of technological advancements. Building on a scalable cluster foundation, application development caters to high availability.

Which services are available

Container support is currently available for a subset of Azure Cognitive Services, including parts of:

Group	Service	Documentation
Anomaly Detector	Anomaly Detector	Documentation
Computer Vision	Read OCR (Optical Character Recognition)	Documentation
	Spatial Analysis	Documentation
Form Recognizer	Form Recognizer	Documentation
Language Understanding	Language Understanding	Documentation
Speech	Custom Speech-to-text	Documentation
	Custom Text-to-speech	Documentation
	Speech-to-text	Documentation
	Text-to-speech	Documentation
	Neural Text-to-speech	Documentation
	Speech language detection	Documentation
Text Analytics	Key Phrase Extraction	Documentation
	Text language detection	Documentation
	Sentiment analysis	Documentation
Face	Face	Documentation

How to use Cognitive Services in containers

You use the services in containers exactly the same way as you would use them in Azure. The deployment of the container is the part that takes a bit of planning and research. The services are shipped in Docker containers, which means they can be deployed to any Docker-compatible platform. This can be your local machine running Docker Desktop or a fully scalable Kubernetes installation in your on-premises data center.

Generic workflow

Create the resource in Azure
Get the endpoint
Retrieve the API Key
Find the container for the service
Deploy the container
Use the container endpoint as you would use the API resource

Optionally, you can mount your own storage and connect Application Insights.
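As a rough illustration of the "Deploy the container" step on a local Docker host, the sketch below wraps a `docker run` call in a shell function. The function name `run_speech_container` is hypothetical, the image is the one used later in this post, and the memory and CPU values are simply copied from the tutorial's ACI command; they are assumptions and may be too low for real workloads.

```shell
# Minimal sketch: run the Text to Speech container on a local Docker host.
# $1: billing endpoint of the Azure resource, $2: API key of the Azure resource
# Assumption: 2 GB of memory and 1 CPU mirror the ACI example in this post.
run_speech_container() {
    docker run --rm --detach \
        --publish 5000:5000 \
        --memory 2g --cpus 1 \
        mcr.microsoft.com/azure-cognitive-services/speechservices/text-to-speech:latest \
        Eula=accept \
        Billing="$1" \
        ApiKey="$2"
}

# Usage (placeholder values):
# run_speech_container "<insert endpoint>" "<insert apikey>"
```

The three arguments after the image name are passed to the container itself, which is how it learns which Azure resource to bill.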

Tutorial: Run a Text to Speech container in an Azure Container Instance.

In this tutorial we are going to run a Cognitive Service Speech container in an Azure Container Instance and use the REST API to convert text into speech.

To run the code below you need an Azure subscription and the Azure command-line interface (CLI). If you don't have an Azure subscription, you can get $200 credit for the first month. If you don't have the Azure CLI installed, follow this tutorial.

1. Create a resource group

Everything in Azure starts with creating a resource group. A resource group is a container that holds related resources for an Azure solution.

To create a resource group using the CLI you have to specify 2 parameters, the name of the group and the location where this group is deployed.

az group create --name demo_rg --location westeurope

2. Create Cognitive Service resource

The next resource that needs to be created is a Cognitive Services resource. To create it we need to specify a few parameters. Besides the name and resource group, you need to specify the kind of Cognitive Service you want to create. For our tutorial we are creating a 'SpeechServices' service.

az cognitiveservices account create \
    --name speech-resource \
    --resource-group demo_rg \
    --kind SpeechServices \
    --sku F0 \
    --location westeurope \
    --yes

3. Get the endpoint & API Key

Once steps 1 and 2 have completed successfully, we can extract the properties we need to run the container in the next step. The two properties we need are the endpoint URL and the API key. The Speech service in the container uses these properties to connect to Azure every 15 minutes and send the billing information.

To retrieve the endpoint:

az cognitiveservices account show --name speech-resource --resource-group demo_rg  --query properties.endpoint -o json

To retrieve the API keys:

az cognitiveservices account keys list --name speech-resource --resource-group demo_rg
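If you would rather not copy the values by hand, the two commands above can be wrapped so their output lands in shell variables. This is only a sketch: the function names are hypothetical, `-o tsv` is used to get the raw value without JSON quotes, and `key1` assumes you want the first of the two keys that the keys list command returns.

```shell
# Read the billing endpoint of the Speech resource created in step 2.
get_speech_endpoint() {
    az cognitiveservices account show \
        --name speech-resource --resource-group demo_rg \
        --query properties.endpoint -o tsv
}

# Read the first API key of the same resource (assumption: key1 is the one to use).
get_speech_key() {
    az cognitiveservices account keys list \
        --name speech-resource --resource-group demo_rg \
        --query key1 -o tsv
}

# Usage, once you are logged in with `az login`:
# ENDPOINT="$(get_speech_endpoint)"
# API_KEY="$(get_speech_key)"
```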

4. Deploy the container in an ACI

One of the easiest ways to run a container is to use Azure Container Instances (ACI). With one command in the Azure CLI you can deploy a container and make it accessible to everyone.

Creating an ACI takes a few parameters. If you want your ACI to be accessible from the internet you need to specify the parameter '--dns-name-label'. The URL for the ACI will look like this: http://{dns-name-label}.{region}.azurecontainer.io. The dns-name-label value needs to be unique within the Azure region.

# Replace the <...> placeholders first; every line except the last ends in a backslash.
az container create \
    --resource-group demo_rg \
    --name speechcontainer \
    --dns-name-label <insert unique name> \
    --memory 2 --cpu 1 \
    --ports 5000 \
    --image mcr.microsoft.com/azure-cognitive-services/speechservices/text-to-speech:latest \
    --environment-variables \
        Eula=accept \
        Billing=<insert endpoint> \
        ApiKey=<insert apikey>

The deployment of the container takes a few minutes.
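Rather than guessing when those few minutes are over, you could poll the state of the container group. The sketch below is one possible approach, not part of the original tutorial: `wait_for_container` and `POLL_SECONDS` are hypothetical names, and it assumes that `az container show --query instanceView.state` reports `Running` once the container is up.

```shell
# Poll the container group until it reports "Running" or the attempts run out.
# $1: container name, $2: resource group, $3: max attempts (default 30)
wait_for_container() {
    attempts="${3:-30}"
    state=""
    while [ "$attempts" -gt 0 ]; do
        state="$(az container show --name "$1" --resource-group "$2" \
            --query instanceView.state -o tsv)"
        if [ "$state" = "Running" ]; then
            echo "Running"
            return 0
        fi
        attempts=$((attempts - 1))
        # Wait between polls; POLL_SECONDS is a hypothetical override knob.
        sleep "${POLL_SECONDS:-10}"
    done
    echo "Timed out; last state: $state" >&2
    return 1
}

# Usage:
# wait_for_container speechcontainer demo_rg
```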

5. Validate that a container is running

The easiest way to validate that the container is running is to open the container home page in a browser. To do this you first need to retrieve the URL for the container. This can be done using the Azure CLI with the following command.

az container show --name speechcontainer --resource-group demo_rg --query ipAddress.fqdn -o json

Navigate to the URL on port 5000. The URL should look like this: http://{dns-name-label}.{region}.azurecontainer.io:5000/

If everything went well, the container's home page loads and confirms that the container is running.
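The same check can be done from a terminal. Cognitive Services containers also expose `/ready` and `/status` paths next to the home page, so a small `curl` loop can probe them. The helper names below are hypothetical and the host name is a placeholder for your own ACI address; treat this as a sketch, not an official health check.

```shell
# Build the base URL of the container.
# $1: FQDN returned by `az container show`, $2: port (defaults to 5000)
container_base_url() {
    printf 'http://%s:%s' "$1" "${2:-5000}"
}

# Probe the /ready and /status paths and report the result of each.
check_container() {
    base="$(container_base_url "$1" "${2:-5000}")"
    for probe in ready status; do
        if curl --silent --fail --max-time 10 --output /dev/null "$base/$probe"; then
            echo "$probe: ok"
        else
            echo "$probe: failed"
            return 1
        fi
    done
}

# Usage (placeholder host name):
# check_container "<dns-name-label>.<region>.azurecontainer.io"
```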

6. Submit your first task

The Text to Speech service in the container is a REST endpoint. To use it we need to send a POST request. There are many ways to send a POST request. For our tutorial we are going to use Visual Studio Code to do this.

Requirements:

Download and Install Visual Studio Code
Install a plugin called REST Client

Once you have Visual Studio Code with the REST Client installed, create a file called rest.http and copy and paste the code below into the file.

POST http://<dns-name-label>.<region>.azurecontainer.io:5000/speech/synthesize/cognitiveservices/v1  HTTP/1.1
Content-Type: application/ssml+xml
X-Microsoft-OutputFormat: riff-24khz-16bit-mono-pcm
Accept: audio/*

<speak version="1.0" xml:lang="en-US">
    <voice name="en-US-AriaRUS">
        The future we invent is a choice we make. 
        Not something that just happens.
    </voice>
</speak>

Change the URL to the URL of your ACI.
Next, click the Send Request link (just above the URL).

On the right side of VS Code you should see the response of the API. In the top right corner you see "Save Response Body"; click the button and save the response as a .wav file. Now you can use any media player to play the response.
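If you prefer the command line over VS Code, the same request can be sent with `curl`. The sketch below reuses the headers, path and SSML from the request above; the function names `write_ssml` and `synthesize` and the file names in the usage lines are hypothetical.

```shell
# Write the SSML document from the tutorial to the file given as $1.
write_ssml() {
    cat > "$1" <<'EOF'
<speak version="1.0" xml:lang="en-US">
    <voice name="en-US-AriaRUS">
        The future we invent is a choice we make.
        Not something that just happens.
    </voice>
</speak>
EOF
}

# POST the SSML to the container and save the audio response.
# $1: container host name, $2: SSML file, $3: output .wav file
synthesize() {
    curl --silent --show-error --fail \
        --request POST \
        --header 'Content-Type: application/ssml+xml' \
        --header 'X-Microsoft-OutputFormat: riff-24khz-16bit-mono-pcm' \
        --header 'Accept: audio/*' \
        --data-binary "@$2" \
        --output "$3" \
        "http://$1:5000/speech/synthesize/cognitiveservices/v1"
}

# Usage (placeholder host name):
# write_ssml request.ssml
# synthesize "<dns-name-label>.<region>.azurecontainer.io" request.ssml speech.wav
```

With `--fail`, curl exits with a non-zero status on an HTTP error instead of writing the error body into the .wav file.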