We launched the public preview of managed online endpoints in May 2021 and have continued to release new features based on customer feedback. Today we are thrilled to announce general availability, along with new capabilities in the areas of MLOps and security. We are ready for your production workloads!
To recap, managed online endpoints handle the serving, scaling, securing, and monitoring of your machine learning (ML) models, freeing you from the overhead of setting up and managing the underlying infrastructure. In this blog, we will review the feature benefits, see what our users have to say, and learn how you can get started today.
1. Safely roll out a new version of a model with mirror traffic support (public preview)
With our initial release, we supported native blue-green deployments by providing a way to shift traffic gradually to a new deployment. Now we are releasing mirror traffic support, which lets you copy (or "mirror") a percentage of live traffic to a new deployment. Mirroring doesn't change the results returned to clients: requests still flow 100% to the production deployment(s), while the mirrored percentage is copied and submitted to the new deployment so you can gather metrics and logs without impacting your clients. This is useful for validating a new deployment before routing real traffic to it, for example to check that latency is within acceptable bounds and that there are no HTTP errors. Learn about the concept here and try a hands-on example here.
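As a sketch of what this looks like with the Azure ML CLI (v2), assuming an existing endpoint named `my-endpoint` with a live `blue` deployment and a newly created `green` deployment (all names here are illustrative):

```shell
# Keep 100% of live traffic on the current production deployment.
az ml online-endpoint update --name my-endpoint --traffic "blue=100"

# Mirror 10% of incoming requests to the new deployment. Clients still
# receive responses only from "blue"; "green" just sees copies of the
# requests, so you can inspect its metrics and logs safely.
az ml online-endpoint update --name my-endpoint --mirror-traffic "green=10"

# Once validated, stop mirroring and shift real traffic gradually.
az ml online-endpoint update --name my-endpoint --mirror-traffic "green=0"
az ml online-endpoint update --name my-endpoint --traffic "blue=90 green=10"
```

These commands require an Azure subscription, workspace, and the `ml` CLI extension to be configured; see the linked hands-on example for the full walkthrough.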
2. Secure endpoint communication with network isolation support (public preview)
When deploying a machine learning model to a managed online endpoint, you can secure communication with the endpoint by using private endpoints. With our declarative APIs, you can secure network communication for both ingress and egress of your endpoint and deployment. Learn more about it and try a hands-on example here.
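A minimal sketch of the declarative flow, assuming hypothetical `endpoint.yml` and `deployment.yml` files (the key network-isolation fields are shown in the comments; file and resource names are illustrative):

```shell
# endpoint.yml includes:
#   public_network_access: disabled        # ingress only via private endpoint
# deployment.yml includes:
#   egress_public_network_access: disabled # egress kept off the public internet

# Create the secured endpoint, then the deployment behind it.
az ml online-endpoint create --file endpoint.yml
az ml online-deployment create --file deployment.yml --all-traffic
```

With `public_network_access` disabled, clients score the endpoint through a private endpoint in your virtual network rather than over the public internet; see the hands-on guide for the full network setup.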
3. We are ready for your high scale production workloads
Have high-scale production workloads? Check out the demo below on how to scale easily using our platform. The demo showcases scaling to 100,000 requests per second in 7 minutes.
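One way to scale a deployment out, sketched with the Azure ML CLI (v2) and illustrative names; the generic `--set` update shown here is an assumption, and you can equally update the deployment from a YAML file:

```shell
# Raise the instance count of the "blue" deployment on "my-endpoint".
# Azure ML provisions the additional instances and load-balances across them.
az ml online-deployment update \
  --name blue \
  --endpoint-name my-endpoint \
  --set instance_count=10
```

For production workloads you would typically pair this with autoscale rules rather than fixed counts, as shown in the demo.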
“We make it our mission to try new ideas and go beyond to differentiate AXA UK from other insurers. We see managed endpoints in Azure Machine Learning as a key enabler for our digital ambition.” - Nic Bourven, Chief Information Officer, AXA Insurance UK
Read the full case study here
“At Trapeze, we have been able to predict travel time on bus routes in large transit systems, improving the customer experience for bus riders with the help of Azure Machine Learning. We love the turnkey solution Managed Online Endpoints offers for highly scalable ML model deployments, along with MLOps capabilities like controlled rollout, monitoring, and MLOps-friendly interfaces. This has simplified our AI deployment, reduced technical complexity, and optimized cost.” - Farrokh Mansouri, Lead, Data Science, Trapeze Group Americas
“We’re already using Azure Machine Learning to make predictions on the packages. We look forward to using managed endpoints to deploy our model and inference at scale, as it will decrease the time taken to manage infrastructure, allowing us to focus on the business problem.” - Eric Brosch, Data Scientist Principal, FedEx Services
Managed online endpoints are now generally available. We are ready for your production workloads!
Take managed online endpoints for a spin with this end-to-end tutorial. You can test-drive the safe rollout experience using mirror traffic and the network isolation support. You can also brush up on the concepts.