Continuously Monitor the Performance of your AzureML Models in Production
Published May 23, 2023

We are thrilled to announce the public preview of Azure Machine Learning model monitoring, allowing you to effortlessly monitor the overall health of your deployed models. Model monitoring is an essential part of the cyclical machine learning lifecycle, encompassing both data science and operational aspects of tracking model performance in production. Changes in data and consumer behavior can influence your model, causing your AI systems to become outdated. This may result in reduced model performance in production, adversely affecting business outcomes and potentially leading to compliance concerns in highly regulated environments. With AzureML model monitoring, you can receive timely alerts about critical issues, analyze results for model enhancement, and minimize the numerous inherent risks associated with deploying ML models. 

 

Capabilities of AzureML model monitoring 

AzureML model monitoring provides the following capabilities: 

  • Simple model monitoring configuration with AzureML online endpoints. If you deploy your model to production with AzureML online endpoints, AzureML collects production inference data automatically and uses it for continuous model monitoring, providing you with an easy configuration process. 
  • Pre-configured and customizable monitoring signals. Model monitoring supports a variety of configurable monitoring signals for tabular datasets, including data drift, prediction drift, data quality, and feature attribution drift. You can choose your preferred metric(s) and adjust alert thresholds for each signal. If the pre-configured signals don't suit your needs, create a custom monitoring signal component tailored to your business scenario. 
  • Use of recent past production data or training data as the comparison baseline dataset. For monitoring signals and their metrics, AzureML lets you set either of these datasets as the baseline for comparison, enabling you to monitor for both drift and skew. 
  • Monitoring of data drift or data quality based on feature importance explanations. If you use training data as your comparison baseline dataset, you can define data drift or data quality signals and monitor only the most important features for your predictions, saving costs. 
  • Analyze monitoring metrics from a comprehensive UI. View changes in drift metrics over time, see which features are violating defined thresholds, and analyze your baseline and production feature distributions side-by-side, all within a comprehensive monitoring UI. 

AzureML model monitoring signals 

Evaluating the performance of a production ML system requires examining various signals, including data drift, model prediction drift, data quality, and feature attribution drift. Such shifts can lead to outdated models: by identifying these shifts, organizations can proactively implement measures like model retraining to maintain optimal model performance and minimize risks associated with outdated or mismatched data. 

  • Data drift: Monitoring data drift is vital for maintaining the accuracy and performance of machine learning models in production. AzureML allows you to detect changes in data distributions, mitigating risks associated with outdated or mismatched data. 
  • Prediction drift: Significant changes in a model's prediction distribution may indicate prediction drift, which can result from shifts in data or code. AzureML’s proactive monitoring of model outputs aids you in identifying issues within the model as it responds to these data shifts. 
  • Data quality: Maintaining data quality is essential, as errors in upstream data processing can lead to unexpected model behavior. Changes in data sources, schemas, logging, or upstream features generated by other ML models can impact your model significantly. AzureML detects data issues such as null values, range violations, or type mismatches, ensuring optimal performance and enabling you to proactively fix issues.
  • Feature attribution drift: Changes in feature importance distributions between training and production may signify feature attribution drift, potentially indicating unexpected model behavior. AzureML evaluates each feature's influence on predictions by tracking its contribution over time, helping you detect shifts in feature importance and their potential impact on model accuracy. 

For a complete overview of AzureML model monitoring signals and metrics, take a look at the AzureML model monitoring documentation. 
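
To make these signals concrete, here is a minimal sketch of how data drift and data quality signals with explicit metric thresholds can be declared using the azure-ai-ml Python SDK (v2). The asset path, target column name, and threshold values are illustrative assumptions, not recommendations, and the preview API surface may shift between SDK versions:

```python
# A sketch using the azure-ai-ml (v2) SDK, in public preview at the time of writing.
from azure.ai.ml import Input
from azure.ai.ml.constants import MonitorDatasetContext
from azure.ai.ml.entities import (
    CategoricalDriftMetrics,
    DataDriftMetricThreshold,
    DataDriftSignal,
    DataQualityMetricsCategorical,
    DataQualityMetricsNumerical,
    DataQualityMetricThreshold,
    DataQualitySignal,
    NumericalDriftMetrics,
    ReferenceData,
)

# Training data, registered as an MLTable asset, serves as the comparison baseline.
# The asset name and target column below are placeholders.
reference_data = ReferenceData(
    input_data=Input(type="mltable", path="azureml:credit-default-train:1"),
    target_column_name="DEFAULT_NEXT_MONTH",
    data_context=MonitorDatasetContext.TRAINING,
)

# Data drift: alert when production feature distributions move past these thresholds.
data_drift = DataDriftSignal(
    reference_data=reference_data,
    metric_thresholds=DataDriftMetricThreshold(
        numerical=NumericalDriftMetrics(jensen_shannon_distance=0.01),
        categorical=CategoricalDriftMetrics(pearsons_chi_squared_test=0.02),
    ),
)

# Data quality: alert on rising rates of null and out-of-range values.
data_quality = DataQualitySignal(
    reference_data=reference_data,
    metric_thresholds=DataQualityMetricThreshold(
        numerical=DataQualityMetricsNumerical(null_value_rate=0.01),
        categorical=DataQualityMetricsCategorical(out_of_bounds_rate=0.02),
    ),
)

# Passed later to MonitorDefinition(monitoring_signals={...}); see the setup steps below.
monitoring_signals = {"data_drift": data_drift, "data_quality": data_quality}
```

Each signal pairs a baseline dataset with one or more metric thresholds, which is exactly what the scheduled monitoring job evaluates on every run.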

 

How to enable AzureML model monitoring 

Take the following steps to enable model monitoring in AzureML: 

  1. Enable production inference data collection. If you deploy a model to an AzureML online endpoint, you can enable production inference data collection by using the AzureML Model Data Collector (see the first sketch after this list). If you deploy your model to an AzureML batch endpoint or outside of AzureML, you're responsible for collecting your own production inference data, which can then be used for AzureML model monitoring. 
  2. Configure model monitoring. You can use AzureML’s SDK, CLI, or the Studio UI to easily set up model monitoring (see the second sketch after this list). During setup, you can specify your preferred monitoring signals, configure your desired metrics, and set the alert threshold for each metric. 
  3. View and analyze model monitoring results. Once model monitoring is configured, a monitoring job is scheduled, which calculates and evaluates metrics for all selected monitoring signals, and triggers alert notifications whenever a specified threshold is exceeded. You can follow the link in the alert notification to your AzureML workspace to view and analyze monitoring results. 
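
For step 1, the following sketch enables the Model Data Collector on a managed online deployment through the azure-ai-ml Python SDK. The subscription, workspace, endpoint, and model identifiers are placeholders, and a real deployment would also need its environment and scoring configuration:

```python
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    DataCollector,
    DeploymentCollection,
    ManagedOnlineDeployment,
)

# Connect to the workspace (identifiers are placeholders).
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE>",
)

# Log production inputs and outputs for downstream monitoring.
collector = DataCollector(
    collections={
        "model_inputs": DeploymentCollection(enabled="true"),
        "model_outputs": DeploymentCollection(enabled="true"),
    }
)

deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="credit-default",           # placeholder endpoint
    model="azureml:credit-default-model:1",   # placeholder model asset
    instance_type="Standard_DS3_v2",
    instance_count=1,
    data_collector=collector,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()
```

For step 2, this sketch schedules a daily monitor against that deployment. The monitor name, deployment ID, and alert email are again assumptions:

```python
from azure.ai.ml.entities import (
    AlertNotification,
    MonitorDefinition,
    MonitoringTarget,
    MonitorSchedule,
    RecurrencePattern,
    RecurrenceTrigger,
    ServerlessSparkCompute,
)

# Serverless Spark compute that runs the scheduled monitoring jobs.
spark_compute = ServerlessSparkCompute(
    instance_type="standard_e4s_v3", runtime_version="3.3"
)

# Point the monitor at the deployment that collects production data.
monitoring_target = MonitoringTarget(
    ml_task="classification",
    endpoint_deployment_id="azureml:credit-default:blue",  # placeholder
)

# With no explicit signals, AzureML applies its recommended signal set;
# pass monitoring_signals={...} (see the earlier sketch) to customize.
monitor_definition = MonitorDefinition(
    compute=spark_compute,
    monitoring_target=monitoring_target,
    alert_notification=AlertNotification(emails=["ml-team@example.com"]),
)

# Evaluate the signals once a day at 03:15.
trigger = RecurrenceTrigger(
    frequency="day", interval=1, schedule=RecurrencePattern(hours=3, minutes=15)
)

monitor = MonitorSchedule(
    name="credit_default_monitor", trigger=trigger, create_monitor=monitor_definition
)
ml_client.schedules.begin_create_or_update(monitor).result()
```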

Step 1: During AzureML Model Monitoring set-up, users can configure the signals and metrics to monitor the performance of their model in production.

Step 2: After model monitoring is configured, users can view a comprehensive overview of signals, metrics, and alerts in AzureML’s Monitoring UI.

Step 3: For a specific drift signal, users can view the metric change over time in addition to a histogram displaying the baseline distribution compared to the production distribution.

 

AzureML model monitoring best practices

Each machine learning model and its use cases are unique, so model monitoring needs to be tailored to each situation. The following is a list of recommended best practices for model monitoring: 

  • Start monitoring your model as soon as it is deployed to production. The sooner you begin monitoring your production model, the sooner you will be able to identify issues and resolve them. 
  • Work with data scientists who are familiar with the model to set up model monitoring. These data scientists have insight into the model and its use cases, so they are best positioned to recommend monitoring signals, metrics, and alert thresholds, thereby reducing alert fatigue. 
  • Include multiple monitoring signals in your monitoring setup. With multiple monitoring signals, you get both a broad view of your model’s health and granular insights into model performance. For example, you can combine data drift and feature attribution drift signals to get an early warning about a model performance issue.
  • Use model training data as the baseline dataset. As the comparison baseline, AzureML allows you to use recent past production data or historical data (such as training or validation data). For a meaningful comparison, we recommend using the training data as the baseline for data drift and data quality, and the validation data as the baseline for prediction drift. 
  • Specify the monitoring frequency based on how your production data will change over time. For example, if your production model has a large amount of daily traffic and the daily data accumulation is sufficient to monitor, configure your model monitor to run on a daily basis. Otherwise, consider a weekly or monthly monitoring frequency, based on how your production data grows over time. 
  • Monitor the top N important features or a subset of features. By default, if you use training data as your comparison baseline, AzureML monitors data drift or data quality for the top 10 most important features. For models with a large number of features, consider monitoring a subset of those features to reduce both computation costs and monitoring noise (see the sketch after this list). 
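
Two of these practices map directly onto configuration. The sketch below, which reuses the hypothetical training-data baseline from the earlier signals example, restricts a data drift signal to the top 10 most important features and sets a weekly cadence for a lower-traffic model:

```python
from azure.ai.ml.entities import (
    DataDriftMetricThreshold,
    DataDriftSignal,
    MonitorFeatureFilter,
    NumericalDriftMetrics,
    RecurrenceTrigger,
)

# Compute drift only for the 10 features that matter most to the model,
# reducing both computation cost and monitoring noise.
top_features = MonitorFeatureFilter(top_n_feature_importance=10)

data_drift = DataDriftSignal(
    reference_data=reference_data,  # training-data baseline from the earlier sketch
    features=top_features,
    metric_thresholds=DataDriftMetricThreshold(
        numerical=NumericalDriftMetrics(jensen_shannon_distance=0.01)
    ),
)

# A weekly cadence for models whose daily traffic is too thin to monitor daily.
weekly_trigger = RecurrenceTrigger(frequency="week", interval=1)
```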

Get started with AzureML model monitoring today 

Get started with AzureML model monitoring today! To learn more, explore the AzureML model monitoring documentation and watch the Microsoft Build 2023 breakout sessions covering model monitoring. 
