Using GitHub Actions & Azure Machine Learning for MLOps
Published May 26 2020 02:35 PM
Microsoft

Background

Like GitOps, Machine Learning Operations (MLOps) can significantly accelerate how data scientists address organizational needs. A well-implemented MLOps process not only shortens the time from code to production, but also provides ownership, lineage, and historical information, all of which are critical for understanding the performance of any machine learning model. Central to this process is a CI/CD system that understands the elements of ML natively and stays in sync with any code or data changes, no matter what platform organizations need them to run on.

 

Unfortunately, many data scientists are still forced to implement MLOps manually. CI/CD platforms are often powerful but quite generic, so any “ML awareness” has to be added through custom code. Worse, these platforms often require separating the actions from the code, which makes debugging difficult and leads to hard-to-reproduce caching issues. Our goal is to give these data scientists tools that are easy both to implement and to use.

 

GitHub Actions and Azure Machine Learning To The Rescue

Today, we’re proud to announce a series of GitHub Actions that let you implement MLOps with just a few configuration settings, yet are flexible enough to support even complicated workflows. Now, by just checking in your code or opening a pull request, you can kick off an entire ML pipeline that records all information about the process and updates the model from the actions.

 

The first five actions we have published are:

  • aml-workspace - Login action that connects to Azure Machine Learning
  • aml-compute - Create Compute action that creates a new compute target on Azure Machine Learning
  • aml-run - Train action for training machine learning models using Azure Machine Learning
  • aml-registermodel - Saves a model to Azure Machine Learning
  • aml-deploy - Deploy action that deploys your model on Azure Machine Learning and creates a real-time endpoint for use in other systems

NOTE: Though these are all Azure Machine Learning actions, GitHub Actions for MLOps support any cloud.


These actions are based on DevOps principles and practices that increase the efficiency of workflows, such as continuous integration, delivery, and deployment. We have applied these principles to the machine learning process with the goals of:

  • Faster experimentation and development of models
  • Faster deployment of models into production
  • Quality assurance

 

Moreover, because the entire MLOps service is hosted and run on behalf of its users, ML engineers have more time to focus on business-critical issues. Workflows can also be updated and added on the back end without users needing to do anything, which makes these pipelines even easier to maintain.


To show you what this would look like end to end, we have a short video:

 

Walk-through

Let’s walk you through how you would implement something like this.

Using GitHub Actions and Azure Machine Learning

First, you’ll need a few prerequisites. These include:

  • Azure subscription
  • Contributor access to the Azure subscription
  • Access to GitHub Actions

If you don’t have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning today.

 

Second, create your own repository from the template. You can do this with the "Use this template" button in the repo: https://aka.ms/ml-template

 


 

Third, you’ll need a service principal with contributor rights to a resource group (either new or existing). To create a new one on Azure, use the Azure CLI on your computer and execute the following command to generate the required credentials:

 

# Replace {service-principal-name}, {subscription-id} and {resource-group} with a name of
# your choice for the service principal, your Azure subscription ID and your resource group name
az ad sp create-for-rbac --name {service-principal-name} \
--role contributor \
--scopes /subscriptions/{subscription-id}/resourceGroups/{resource-group} \
--sdk-auth


This will generate the following JSON output:

 

{
"clientId": "<GUID>",
"clientSecret": "<GUID>",
"subscriptionId": "<GUID>",
"tenantId": "<GUID>",
(...)
}
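
Before storing the output as a secret, it can help to confirm that nothing was lost when copying it. The following is a minimal sketch, not part of the actions themselves: the function name `missing_credential_keys` is hypothetical, and it checks only the four fields shown in the sample output above, not the elided ones.

```python
import json
from typing import List

# The four fields shown in the sample output above. The real output
# contains further fields that this sketch deliberately does not check.
REQUIRED_KEYS = ("clientId", "clientSecret", "subscriptionId", "tenantId")


def missing_credential_keys(raw_json: str) -> List[str]:
    """Return the required keys that are absent or empty in the JSON text."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return [key for key in REQUIRED_KEYS if not data.get(key)]


if __name__ == "__main__":
    # Hypothetical, truncated input with the tenantId field missing.
    sample = '{"clientId": "a", "clientSecret": "b", "subscriptionId": "c"}'
    print(missing_credential_keys(sample))  # ['tenantId']
```

An empty list means all four fields are present; a `json.JSONDecodeError` usually means the text was cut off while copying.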

Add this JSON output as a secret with the name AZURE_CREDENTIALS in your GitHub repository.


Please follow this link for more details.


Next, modify the parameters in the /.cloud/.azure/workspace.json file in your repository, so that the GitHub Actions create or connect to the desired Azure Machine Learning workspace.
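
If you prefer to script this edit rather than change the file by hand, a sketch along the following lines may work. The key names `name`, `resource_group` and `create_workspace` are assumptions made for illustration, and so is the helper `update_workspace_config`; check the template's workspace.json and the action's documentation for the parameters that are actually supported.

```python
import json
import tempfile
from pathlib import Path


def update_workspace_config(path, name, resource_group, create_workspace=True):
    """Set the workspace fields in a JSON parameters file and return the result."""
    config_path = Path(path)
    # Start from the existing file so that unrelated settings are preserved.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    # ASSUMPTION: these key names are illustrative, not confirmed by the article.
    config["name"] = name
    config["resource_group"] = resource_group
    config["create_workspace"] = create_workspace
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=4) + "\n")
    return config


if __name__ == "__main__":
    # Demonstrated on a temporary file; in a real repository the path would be
    # .cloud/.azure/workspace.json. The names below are placeholders.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "workspace.json"
        print(update_workspace_config(target, "my-workspace", "my-resource-group"))
```

The change still has to be committed and pushed like any other edit to the file.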


Once you save your changes to the file, the predefined GitHub workflow that trains and deploys a model on Azure Machine Learning is triggered. Check the Actions tab to see whether your actions have run successfully.

 


Now that you have a running pipeline, you can start modifying the code in the code folder so that the pipeline uses your custom code.

 

With just a few configuration settings, you can move from zero to a complete workflow driven by code and GitHub Actions. In addition to the above actions, we are also publishing two templates that include code and workflow definitions for an end-to-end ML/AI lifecycle.

 

  1. Simple template repository: ml-template-azure
    Go to this template and follow the getting started guide to set up an MLOps process within minutes and learn how to use the Azure Machine Learning GitHub Actions in combination. This template demonstrates a very simple process for training and deploying machine learning models.
  2. Advanced template repository: aml-template
    This template demonstrates how approval processes can be included in the process and how training and deployment workflows can be split. It also shows how workflows (e.g. deployment) can be triggered by pull requests. More enhancements will be added to this template in the future to make it more enterprise-ready.

 

Please dive into either repo and let us know if there’s anything we can do to help you achieve your goals with MLOps.


Further, though we’ve implemented the first version using Azure Machine Learning, the platform is flexible enough to support most deployment platforms, both on-prem and on any cloud. Just clone our template repo and customize it on your own. And make sure to publish your actions to the GitHub Marketplace so that others can use them!

 

Finally, we very much want to build a community around these actions - please join us at https://aka.ms/ml-template (for the standard template) or https://aka.ms/ml-template-advanced (for the advanced template) to file issues, pull requests and comments about what we can do better. Thank you so much!

 

-- Zander, Marvin, Pulkit & Dave

Last update: May 26 2020 06:18 PM