Model deployment is one of the most critical components in machine learning systems, and model deployment in Azure Machine Learning (AzureML) is evolving. Until now, AzureML has supported Azure Container Instances (ACI) and Azure Kubernetes Service (AKS) as its traditional deployment targets for models.
At Build 2022, we released managed online endpoints, which provide a unified, turnkey interface to invoke and manage model deployments on Microsoft-managed compute. You can take advantage of scalable and reliable endpoints without having to manage the infrastructure. Several of our customers and partners are already using this inference capability to automate model deployments for production use.
The developer experience in AzureML is also evolving. The AzureML CLI v2 and Python SDK v2 do not support legacy ACI web services. We highly recommend upgrading to v2 to take full advantage of its consistency and new features, which accelerate the machine learning lifecycle in production environments.
In this blog, we summarize the benefits of managed online endpoints, compare their cost with ACI, and explain how you can transition your existing ACI workloads to managed online endpoints.
*As of September 2022, ACI web services are in maintenance mode and will not receive new features.
*AzureML CLI v1 will be retired on 30 Sep 2025; see CLI & SDK v2 for details.
What managed online endpoints bring you
Managed online endpoints handle serving, scaling, securing, and monitoring of your machine learning models, so you don't need to manage the underlying infrastructure. In particular, ACI web services were recommended for dev/test environments, while managed online endpoints are designed for use in production environments.
Here are some benefits of using managed online endpoints:
- Optimize cost
  - A wider choice of VM SKUs, GPU-optimized inference with Triton, and better scalability than ACI
  - Schedule-based autoscaling, metrics-based autoscaling, or a combination of both
  - View costs at the endpoint and deployment level
- Streamline development
  - Declarative deployment with YAML, easy to use for GitOps
  - Locally debug deployment code & dependencies with VS Code
  - Multiple deployments with different traffic settings
- Streamline operations
  - Managed infrastructure and security enhancements, including network isolation
  - Safe rollout of new deployments and controlled rollout of in-place updates
  - Logs, application diagnostics & advanced performance monitoring
Upgrade guidance
There are two approaches to upgrade:
- Deploy to managed online endpoints yourself, using the model and environment you deployed to ACI.
  You can use the AzureML CLI v2, the Python SDK v2, or the REST API to deploy your models to managed online endpoints. This is highly recommended for customers who regularly create and delete ACI services.
- Use the upgrade tool.
  We provide documents and scripts to support the upgrade. The tool automatically creates a new online endpoint, and your original services are not affected. You can safely route the traffic to the new endpoint and then delete the old one.
There are a few things to note when upgrading from ACI web service:
- The scoring URL will change. For example, the scoring URL for an ACI web service looked like http://aaaaaa-bbbbb-1111.westus.azurecontainer.io/score, but for a managed online endpoint it will look like https://endpoint-name.westus.inference.ml.azure.com/score
- AzureML CLI v1 and Python SDK v1 do not support managed online endpoint operations, so please use the CLI/SDK v2 or the REST APIs.
- For legacy ACI model deployment, you can specify CPU/Memory requirements. For managed online endpoints, you can only specify VM SKUs to be used.
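To illustrate the scoring URL change, here is a minimal Python sketch of how a client might build a request for the new endpoint. It assumes key-based authentication, where the endpoint key is sent as a bearer token. The function name `build_scoring_request`, the placeholder key, and the payload shape are hypothetical; the actual payload depends on your scoring script.

```python
import json
import urllib.request


def build_scoring_request(scoring_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for an online endpoint.

    Assumption: the endpoint uses key-based auth and expects a JSON body.
    """
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(scoring_url, data=body, headers=headers, method="POST")


# Hypothetical values: replace with your own endpoint URL, key, and input data.
request = build_scoring_request(
    "https://endpoint-name.westus.inference.ml.azure.com/score",
    "<your-endpoint-key>",
    {"data": [[1.0, 2.0, 3.0]]},
)
# To actually call the endpoint: urllib.request.urlopen(request).read()
```

Building the request separately from sending it makes it easy to point the same client code at the old and new URLs while you shift traffic.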
The following table gives an example mapping from ACI CPU/memory requirements to corresponding VM SKUs.
Table 1. Suggested VM SKU for different resource requirements of ACI web services.
| ACI CPU cores | ACI memory (GB) | Suggested SKU |
|---|---|---|
| (0, 1] | (0, 1.2] | DS1 V2 |
| (1, 2] | (1.2, 1.7] | F2s V2 |
| (1, 2] | (1.7, 4.7] | DS2 V2 |
| (1, 2] | (4.7, 13.7] | E2s V3 |
| (2, 4] | (0, 5.7] | F4s V2 |
| (2, 4] | (5.7, 11.7] | DS3 V2 |
| (2, 4] | (11.7, 16] | E4s V3 |
* "(" means greater than and "]" means less than or equal to. For example, “(0, 1]” means “greater than 0 and less than or equal to 1”.
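If you have many ACI services to review, the lookup in Table 1 can be sketched as a small Python helper. This is only an illustration of the table, not part of the upgrade tool: the function name `suggest_sku` is hypothetical, the SKU names are copied as written in the table, and combinations that the table does not list return `None`.

```python
from typing import Optional

# Rows transcribed from Table 1: (cpu_low, cpu_high, mem_low, mem_high, sku).
# Each interval excludes the lower bound and includes the upper bound.
TABLE_1 = [
    (0, 1, 0, 1.2, "DS1 V2"),
    (1, 2, 1.2, 1.7, "F2s V2"),
    (1, 2, 1.7, 4.7, "DS2 V2"),
    (1, 2, 4.7, 13.7, "E2s V3"),
    (2, 4, 0, 5.7, "F4s V2"),
    (2, 4, 5.7, 11.7, "DS3 V2"),
    (2, 4, 11.7, 16, "E4s V3"),
]


def suggest_sku(cpu_cores: float, memory_gb: float) -> Optional[str]:
    """Return the suggested SKU from Table 1, or None if no row matches."""
    for cpu_low, cpu_high, mem_low, mem_high, sku in TABLE_1:
        if cpu_low < cpu_cores <= cpu_high and mem_low < memory_gb <= mem_high:
            return sku
    return None


print(suggest_sku(1, 1.0))   # DS1 V2
print(suggest_sku(2, 4.0))   # DS2 V2
print(suggest_sku(0.5, 3))   # None: this combination is not listed in Table 1
```

For combinations that fall outside the table, choose a SKU based on your own load testing.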
Cost comparison
When upgrading from ACI, it's important to note that there will be some changes in how you'll be charged. Please use the information here to help you choose the right VM SKUs for your workload.
You can also take advantage of Reserved Instances to reduce costs if you anticipate steady usage over a period of time (one-year or three-year).
Table 2. Approximate cost comparison of ACI web services and managed online endpoints (example for East US 2 region, USD$).
| ACI CPU cores | ACI memory (GB) | ACI cost range / month (USD$) | Suggested SKU | SKU pay as you go / month (USD$) | SKU 1 year reserved / month (USD$) | SKU 3 years reserved / month (USD$) |
|---|---|---|---|---|---|---|
| (0, 1] | (0, 1.2] | ($29.565, $33.463] | DS1 V2 | $41.610 | $27.003 | $17.696 |
| (1, 2] | (1.2, 1.7] | ($63.028, $64.652] | F2s V2 | $61.758 | $36.500 | $22.638 |
| (1, 2] | (1.7, 4.7] | ($64.652, $74.398] | DS2 V2 | $83.220 | $54.086 | $35.391 |
| (1, 2] | (4.7, 13.7] | ($74.398, $103.634] | E2s V3 | $97.090 | $57.086 | $36.500 |
| (2, 4] | (0, 5.7] | ($88.695, $107.211] for 3 cores; ($118.26, $136.776] for 4 cores | F4s V2 | $123.37 | $73.000 | $45.275 |
| (2, 4] | (5.7, 11.7] | ($107.211, $126.702] for 3 cores; ($136.776, $156.267] for 4 cores | DS3 V2 | $167.170 | $108.165 | $70.781 |
| (2, 4] | (11.7, 16] | ($126.702, $140.671] for 3 cores; ($156.267, $170.236] for 4 cores | E4s V3 | $194.180 | $114.165 | $73.000 |
* Azure costs differ by region and may change, so please refer to the latest pricing.
* ACI cost is calculated as 29.5650 * X + 3.2485 * Y, where X is the CPU core request rounded up to the nearest whole number and Y is the memory request in GB rounded up to the nearest tenth.
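The ACI cost formula in the note above can be written as a short Python sketch, which you can use to estimate the current monthly cost of a service before picking a SKU. The function name `aci_monthly_cost` is hypothetical, and the two rates are the example East US 2 figures from the note, so check the latest pricing for your region. `Decimal` is used so that rounding up to the nearest tenth is not distorted by floating-point error.

```python
from decimal import Decimal, ROUND_CEILING

# Example monthly rates (USD) taken from the note above; they vary by region.
CPU_RATE = Decimal("29.5650")  # per CPU core
MEM_RATE = Decimal("3.2485")   # per GB of memory


def aci_monthly_cost(cpu_cores: float, memory_gb: float) -> float:
    """Approximate monthly ACI cost: 29.5650 * X + 3.2485 * Y."""
    # X: CPU request rounded up to the nearest whole number.
    x = Decimal(str(cpu_cores)).quantize(Decimal("1"), rounding=ROUND_CEILING)
    # Y: memory request rounded up to the nearest tenth of a GB.
    y = Decimal(str(memory_gb)).quantize(Decimal("0.1"), rounding=ROUND_CEILING)
    return float(CPU_RATE * x + MEM_RATE * y)


print(round(aci_monthly_cost(1, 1.2), 3))   # 33.463
print(round(aci_monthly_cost(2, 13.7), 3))  # 103.634
```

These two results match the upper bounds of the first and fourth rows of Table 2.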
Getting started today
- Learn more about managed online endpoints and the differences between v1 and v2 assets in Azure Machine Learning.
- Try our upgrade tool to easily upgrade your existing workloads and take full advantage of managed online endpoints.
If you have any questions or feedback, please post a comment on this article or create an issue.
Updated Oct 25, 2022
Version 11.0
Shohei_Nagata
Microsoft
AI - Machine Learning Blog