5 Ways Azure Cognitive Services Scale
Published Aug 23 2021 02:37 AM

Azure Cognitive Services: the AI service that keeps on giving. Although they are a popular choice for quick development and proofs of concept, don’t let that overshadow the fact that Azure Cognitive Services are built for innovative production workloads. In this blog post I want to outline 5 ways Azure Cognitive Services can scale to support your mission-critical solutions.


The purpose of this blog post is to share how you can access the same AI services that power enterprise products such as Teams and Xbox. For more information, or to get started using Azure AI, check out the Artificial intelligence for developers page on Microsoft Azure.


1. Scaling is dealt with for you


Azure Cognitive Services scales for you. But don’t stop reading now: there is much more these services can do to help you scale. First, let me explain how they scale for you.


The word scalability is often used with cloud technologies and is one of the key advantages of building solutions in the cloud. Scalability is the ability to grow your business or service in a pay-as-you-go format, meaning there are no high, up-front investments for machines, and there’s more flexibility on what you can build by choosing the types of technology you need for a specific solution.


The cloud also helps you scale through dedicated scaling features: scaling vertically, for example by increasing the size of a virtual machine, and scaling horizontally, for example by increasing the number of concurrent machines serving a workload.


So how does Azure Cognitive Services scale for you? Horizontally? Vertically?
It is done in a way that reminds me of serverless compute: if I have an Azure Function and run it 10 times or 1000 times, I still write the same code.


This is the same for Azure Cognitive Services. For example, if I call the Computer Vision API to describe 10 images, I can also ask it to describe 1000 images without having to change any code or options – it’s already dealt with by the service.
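To make that concrete, here is a minimal Python sketch of the idea, not the code from the demo. It assumes the Computer Vision REST path `/vision/v3.2/describe` and the `Ocp-Apim-Subscription-Key` header; the endpoint, key, image URLs and the helper names `build_describe_requests` and `describe_images` are hypothetical. The example only prepares the requests, so it runs without a subscription.

```python
import json
import urllib.request

# Assumed REST path for the "describe" operation; check the API version
# that your own Computer Vision resource supports.
DESCRIBE_PATH = "/vision/v3.2/describe"


def build_describe_requests(endpoint, key, image_urls):
    """Prepare one POST request per image URL (nothing is sent here)."""
    prepared = []
    for image_url in image_urls:
        body = json.dumps({"url": image_url}).encode("utf-8")
        prepared.append(
            urllib.request.Request(
                url=endpoint.rstrip("/") + DESCRIBE_PATH,
                data=body,
                headers={
                    "Ocp-Apim-Subscription-Key": key,
                    "Content-Type": "application/json",
                },
                method="POST",
            )
        )
    return prepared


def describe_images(endpoint, key, image_urls):
    """Send every prepared request and collect the JSON replies (needs network)."""
    results = []
    for request in build_describe_requests(endpoint, key, image_urls):
        with urllib.request.urlopen(request) as response:
            results.append(json.load(response))
    return results


# Hypothetical values, for illustration only.
ENDPOINT = "https://my-vision-resource.cognitiveservices.azure.com"
KEY = "<your-key>"

ten = build_describe_requests(
    ENDPOINT, KEY, [f"https://example.com/{i}.jpg" for i in range(10)]
)
thousand = build_describe_requests(
    ENDPOINT, KEY, [f"https://example.com/{i}.jpg" for i in range(1000)]
)
print(len(ten), len(thousand))  # prints "10 1000"
```

The only thing that changes between the two runs is the length of the input list; the calling code and its options stay the same.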


Find out more about scaling and serverless computing in the Azure Fundamentals learning path on Microsoft Learn.


Have a look at this demo I created to illustrate this point:

  • Whether you submit one image or 200 images, you do not need to make edits to infrastructure or code
  • In this demo I created a Power Automate Flow that triggers when images land in Azure Blob Storage and analyses them
  • Once analysed, the results are sent back to Azure Blob Storage for use


2. Scaling to support growth


Here is a use case where the automatic scaling of Azure Cognitive Services supports you. Suppose you suddenly see an influx of usage of a feature in your application that uses Cognitive Services. This is great news! You will need to scale the front end and back end of your application, but Azure Cognitive Services takes care of the scale of the API, so you do not need to worry about this part of your architecture. You only pay for what you use, and you can choose a grouped API key that covers the vision and language services.


Understand how usage of your production application relates to costs: Azure Cognitive Services pricing


3. Scaling your pace of innovation


On to a different type of scaling: supporting your pace of innovation.


Azure Cognitive Services models are trained by Microsoft on incredibly large datasets. You can leverage that training and work rather than building it yourself, which lets you focus on the full solution rather than on the machine learning model alone. A good example is speech and text translation. You need rich datasets for many different languages to create quality language and translation models from scratch. Instead, using Azure Cognitive Services, you can simply leverage these models through API calls and scale out your applications around the world faster.


One story I think really illustrates this point is the work done with Vodafone. Vodafone is a global telecommunications company that is actively expanding its digital strategy. Vodafone used Microsoft Azure services to develop a personable digital assistant named TOBi to support good customer service interactions.


They needed a flexible, scalable technology solution that could match their growth ambitions. It was essential that tools would be easy to use for the conversational designers in their business and built upon modular technologies to support their ambition to make TOBi the biggest chatbot in the world.


They built the bot using the Microsoft Bot Framework and the Language Understanding capabilities from Cognitive Services, and were also able to easily connect to other non-Microsoft technologies within their business.


Vodafone first deployed TOBi in Italy and has since rolled out language-specific versions of its bot to 15 other markets. To offer the bot in a country’s local language, Vodafone adopted Translator, part of Azure Cognitive Services, for real-time, AI-based text translation. The company currently makes TOBi available in 15 languages, with even more planned. Scaling from 1 to 15 supported languages was accelerated by using the Translator APIs.


The general scale of these APIs also speaks for itself: in December 2020, Vodafone reported that TOBi now holds 25 to 30 million conversations a month with customers and handles 60 percent of their customer interactions.


Have a look at this demo I created to illustrate this point:

  • Azure Cognitive Services can help you scale your innovation and solutions
  • Using Translator APIs, I can convert simple text and full documents
  • I can also translate into multiple languages at once using 2 HTTP requests
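As a rough illustration of the multi-language point, the sketch below prepares a single Translator text request that targets several languages by repeating the `to` query parameter. It is not the code from the demo. It assumes the Translator v3.0 global endpoint and the `Ocp-Apim-Subscription-Key` and `Ocp-Apim-Subscription-Region` headers; the key, region, sample text and the helper name `build_translate_request` are hypothetical. The request is only built, not sent, so it runs without a subscription.

```python
import json
import urllib.parse
import urllib.request

# Assumed global endpoint for the Translator text API (v3.0).
TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com"


def build_translate_request(key, region, text, target_languages):
    """Prepare one POST request that translates `text` into every target language."""
    # Each target language becomes its own "to" query parameter.
    query = urllib.parse.urlencode(
        [("api-version", "3.0")] + [("to", lang) for lang in target_languages]
    )
    body = json.dumps([{"Text": text}]).encode("utf-8")
    return urllib.request.Request(
        url=f"{TRANSLATOR_ENDPOINT}/translate?{query}",
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Ocp-Apim-Subscription-Region": region,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical key, region and text, for illustration only.
request = build_translate_request(
    "<your-key>", "westeurope", "Hello, how can I help?", ["it", "de", "es"]
)
print(request.full_url)
# prints https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&to=it&to=de&to=es
```

Passing the prepared request to `urllib.request.urlopen` would send it. Adding another language is then a one-item change to the list rather than a new model to train.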


4. Scaling you may not know you're using


Speaking of scaling in production use cases, there are likely a few technologies you may be using across the Microsoft ecosystem that have Azure Cognitive Services built in. For example, Microsoft Teams, Microsoft PowerPoint, or even other Azure services such as Metrics Advisor.


Part of the Azure Applied AI Services, Metrics Advisor uses AI to perform data monitoring and anomaly detection in time series data. You collect your time series data and apply Metrics Advisor to detect anomalies. Then comes the most important part: you act on the detections and analyze root causes to add value to your IoT project, business process or application.




These APIs are, by design, used for enterprise production workloads and built into our first-party services for you to use.


5. Scaling through design considerations


Last but certainly not least, there are a couple of design points to consider when it comes to different uses of the Azure Cognitive Services, for example scaling outside of the cloud or scaling to use Big Data.


  1. Running Cognitive Services outside the cloud. For some projects, calling cloud APIs may not be ideal or even possible. There are 16 Cognitive Services available in containers that run outside the cloud. One reason you may not want to call the cloud is a high-throughput or low-latency requirement, where you would want to run Cognitive Services physically closer to your application logic and data. You can do this by using containers. Containers do not cap transactions per second (TPS) and can be made to scale both up and out to handle demand if you provide the necessary hardware resources.

  2. Scaling Cognitive Services applications in Big Data. Azure Cognitive Services for Big Data lets users channel terabytes of data through Cognitive Services using Apache Spark. With only a few lines of code you can integrate Cognitive Services using the PySpark API in the Microsoft Machine Learning Spark namespace (mmlspark.cognitive). Scala and Java are supported too.
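For the container option in point 1, one way to picture the change is that the request stays the same and only the base URL moves from the cloud resource to the container host. The sketch below is an illustration under assumptions, not a definitive setup. It assumes a sentiment container listening on `http://localhost:5000` and exposing the path `/text/analytics/v3.0/sentiment`; the cloud resource name, the sample text and the helper name `build_sentiment_request` are hypothetical. The requests are only built, not sent.

```python
import json
import urllib.request

# Assumed sentiment path; the version segment depends on the service or
# container image you actually run.
SENTIMENT_PATH = "/text/analytics/v3.0/sentiment"


def build_sentiment_request(base_url, text, key=None):
    """Prepare a sentiment request against either the cloud or a container host."""
    headers = {"Content-Type": "application/json"}
    if key is not None:
        # The cloud endpoint expects a subscription key on every call.
        headers["Ocp-Apim-Subscription-Key"] = key
    body = json.dumps(
        {"documents": [{"id": "1", "language": "en", "text": text}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + SENTIMENT_PATH,
        data=body,
        headers=headers,
        method="POST",
    )


# Hypothetical cloud resource and assumed local container address.
CLOUD_BASE = "https://my-language-resource.cognitiveservices.azure.com"
CONTAINER_BASE = "http://localhost:5000"

cloud_request = build_sentiment_request(CLOUD_BASE, "Great service!", key="<your-key>")
local_request = build_sentiment_request(CONTAINER_BASE, "Great service!")
print(cloud_request.full_url)
print(local_request.full_url)
```

Keeping the base URL in configuration means the same application code can target the cloud service or a container running next to your data.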


Have a look at this demo I created to illustrate this point:

  • If you are using Big Data technologies such as Spark, Cognitive Services has a library for that: mmlspark
  • Using Azure Databricks notebooks, I was able to analyse text and images using the mmlspark library


What's Next?


In this blog post we covered 5 ways Azure Cognitive Services can help you scale your production workloads, from automatically scaling for you, to supporting your pace of innovation, to solutions for edge and big data use cases.



Last update: Aug 23 2021 02:36 AM