Azure Machine Learning service expands support for MLflow (Public Preview)
Many data scientists start their machine learning projects using Jupyter notebooks or editors like Visual Studio Code. To ensure models can be used in production, it is essential to systematically track all aspects of an ML workflow, such as the data, environment, code, and models produced. Reproducibility can become difficult in a hybrid cloud environment, but these challenges are mitigated when both environments conform to open standards.
AzureML’s support for MLflow
Azure ML now supports managing the end-to-end machine learning lifecycle using open MLflow standards, enabling existing workloads to move seamlessly from local execution to the intelligent cloud and edge. Azure Machine Learning has expanded support for running machine learning workflows to train, register, and deploy models through native integration (API compatibility) with MLflow.
Let’s walk through some of the latest enhancements to the Azure ML and MLflow interoperability.
MLflow Projects provide a way to organize and describe your code so that other data scientists or automated tools can run it. Any local directory or Git repository can be treated as an MLflow project. You can enable MLflow's tracking URI and logging API, collectively known as MLflow Tracking, to connect your MLflow experiments to Azure Machine Learning. You can submit your MLflow experiments locally or remotely using MLflow Projects with full tracking support in Azure ML by setting the project backend to "azureml".
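As a sketch of the tracking setup, the workspace's tracking URI can be handed to MLflow before any runs are submitted. This assumes the azureml-mlflow package is installed and a workspace config.json is available locally; the experiment name is illustrative.

```python
def connect_to_azureml_tracking():
    """Point MLflow tracking at an Azure ML workspace (sketch)."""
    import mlflow
    from azureml.core import Workspace

    # Loads workspace details from a local config.json
    ws = Workspace.from_config()
    # All subsequent MLflow logging now goes to the workspace
    mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
    # Hypothetical experiment name
    mlflow.set_experiment("my-experiment")
```

Once the tracking URI is set, existing MLflow logging code works unchanged against the workspace.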
A project includes the following:
Conda environment specification (conda.yaml)
Entry points: any .py or .sh file in the project can be an entry point, with no parameters explicitly declared. When you run the command with a set of parameters, MLflow passes each parameter on the command line using --key <value> syntax.
You specify more options by adding an MLproject file, which is a text file in YAML syntax. An example MLproject file looks like this:
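A minimal MLproject file might look like the following; the project name, parameter, and script name here are illustrative.

```yaml
name: diabetes-regression          # illustrative project name
conda_env: conda.yaml              # the Conda spec mentioned above
entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.01}
    command: "python train.py --alpha {alpha}"
```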
Here's an example setup for a local run. I've set the backend to "azureml" to get all the tracking support and error logging from Azure ML. The backend config object stores the necessary information, such as the compute target and whether to use a local managed environment or a system-managed environment.
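A submission along these lines can be sketched as below. It assumes azureml-mlflow is installed, the MLflow tracking URI already points at the workspace, and a compute target exists; the backend config keys and the "cpu-cluster" name are assumptions for illustration.

```python
def submit_project():
    """Submit the current directory as an MLflow project to the
    "azureml" backend (sketch)."""
    import mlflow

    backend_config = {
        "COMPUTE": "cpu-cluster",   # assumed compute target name; omit for a local run
        "USE_CONDA": True,          # build the environment from conda.yaml
    }
    return mlflow.projects.run(
        uri=".",                    # local directory treated as a project
        parameters={"alpha": 0.01},
        backend="azureml",
        backend_config=backend_config,
    )

if __name__ == "__main__":
    submitted_run = submit_project()
```

Dropping the compute target from the backend config keeps the run local while still streaming tracking data to Azure ML.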
In the image below you can see that Azure ML automatically tags the run with MLflow related metadata for visibility and logs the git info.
You can then log and visualize your run metrics in Azure Machine Learning studio or the MLflow Experimentation UI; the same metrics appear in both.
MLflow Model Registry and Deployment
With the new support for the MLflow model format, it becomes even easier to track and deploy models on Azure ML. You can register models from local files or from a run and use them to make predictions online or in batch mode. By deploying models as a web service, you can apply the Azure Machine Learning monitoring and data drift detection functionalities to your production models. Let's look at an example:
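Registering a model from a completed run can be sketched with MLflow's standard registry API; the run ID, artifact path, and registered model name below are placeholders.

```python
def register_from_run(run_id: str):
    """Register an MLflow-format model logged by a run (sketch)."""
    import mlflow

    # "model" is the artifact path the run logged the model under
    model_uri = f"runs:/{run_id}/model"
    # Creates the registered model (or a new version of it)
    return mlflow.register_model(model_uri, "my-registered-model")
```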
From the MLflow project run, you can see that the output model is registered following the MLflow model schema.
The MLmodel file contains all the model details and metadata.
If you want to register, containerize, and deploy the model, you can now do that in one step. Using the mlflow.azureml.deploy() Python SDK method, Azure ML registers the model, builds the Docker container, and deploys it to the chosen target. The deployed service also retains the MLflow metadata as tags, as shown in the image below.
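A sketch of that one-step path is below, assuming MLflow 1.x with azureml-mlflow and azureml-core installed and an ACI target; the workspace config, run ID, and service name are illustrative.

```python
def deploy_model(run_id: str):
    """Register, containerize, and deploy an MLflow model to Azure
    Container Instances in one call (sketch)."""
    import mlflow.azureml
    from azureml.core import Workspace
    from azureml.core.webservice import AciWebservice

    ws = Workspace.from_config()
    # Small ACI deployment configuration for illustration
    aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
    # Registers the model, builds the image, and deploys the service
    webservice, azure_model = mlflow.azureml.deploy(
        model_uri=f"runs:/{run_id}/model",
        workspace=ws,
        deployment_config=aci_config,
        service_name="mlflow-demo-service",  # illustrative name
    )
    return webservice
```

Swapping the ACI configuration for an AKS one targets a production-grade cluster with the same call.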
With continued support for MLflow, Azure ML is committed to interoperability with open-source standards, giving users the flexibility to work on-premises or in the cloud. For more details about the MLflow and Azure ML integration, check out the following links: