Model deployment is often the last step in a machine learning product lifecycle, and this single step drives model adoption in a multitude of ways. Needless to say, if done without precise and thoughtful planning, it can lead to significant problems across multiple aspects of a customer-facing service, including, but not limited to, security, availability and, eventually, consumption, which can ultimately waste the effort spent on creating the model in the first place.
After realizing the need for careful consideration when deploying ML models, I started exploring more sophisticated ways of deployment.
The exploration was centered around three aspects:
The exploration bore fruit: I found a holistic and comprehensive offering within Azure, squarely focused on the machine learning product lifecycle and replete with the goodness of Azure, which goes by the name of Azure Machine Learning.
The rest of this article focuses on deploying an ML model as a secured webservice and consuming it as a RESTful API with Azure Machine Learning, all using Python!
Prerequisites
Before moving ahead, we need to have 5 things:
Workflow
The workflow involves 10 simple steps.
1. Connect to your Azure account:
from azureml.core.authentication import InteractiveLoginAuthentication
interactive_authentication = InteractiveLoginAuthentication(tenant_id="TENANT ID", force=True)
This will complete the authentication and establish a remote connection with the Azure account.
2. Create Azure ML workspace:
The workspace is the top-level resource for Azure ML; it provides a centralized place to work with all the artifacts created with Azure ML.
from azureml.core import Workspace
ml_workspace = Workspace.create(name='HousePricePrediction_ws',
                                subscription_id='SUBSCRIPTION ID',
                                resource_group='HousePricePrediction_rg',
                                location='GEOLOCATION')
ml_workspace.write_config()
This will deploy the dependent resources (Application Insights, a storage account and a key vault), create the workspace and return a reference to it; write_config() saves the workspace details to a local configuration file for later reuse.
3. Register ML model:
Uploading a model separately from the webservice code is known as registering the model. Azure ML keeps the client-facing webservice code separate from the ML model so that each can be updated independently.
from azureml.core.model import Model
ml_model = Model.register(model_path="BostonHousePricePrediction.pickle",
                          model_name="HousePricePrediction_Model",
                          description="To predict the price of houses",
                          workspace=ml_workspace)
This will upload the model to the cloud, into the referenced workspace's storage account.
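The registered file is assumed to be a serialized trained estimator. Below is a minimal local sketch of producing such a file, with a plain dictionary standing in for a real fitted model (the entry script later loads it with joblib, which is pickle-compatible); the dictionary and path here are illustrative only:

```python
import os
import pickle
import tempfile

# Stand-in for a trained estimator; in practice this would be a
# fitted scikit-learn regressor serialized the same way.
model = {'coef': [1.5, -0.3], 'intercept': 2.0}

path = os.path.join(tempfile.gettempdir(), 'BostonHousePricePrediction.pickle')
with open(path, 'wb') as f:
    pickle.dump(model, f)       # serialize the model to disk

with open(path, 'rb') as f:
    restored = pickle.load(f)   # what the entry script will later do

print(restored == model)
```

Serializing once and registering the resulting file keeps the training environment and the serving environment decoupled.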
4. Define an entry script:
The entry script acts as an intermediary between the model and the deployed webservice. It receives the data submitted to the webservice and passes it to the model, then takes the model's response and returns it to the client.
%%writefile score.py
import json
import numpy as np
import os
import joblib
import pickle
def init():
    # Runs once when the service starts: load the registered model
    global model
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'BostonHousePricePrediction.pickle')
    model = joblib.load(model_path)

def run(request_data):
    # Runs per request: parse the JSON payload and return predictions
    request_data = json.loads(request_data)['data']
    data = np.array(request_data)
    predicted_price = model.predict(data)
    return predicted_price.tolist()
This will create a Python script (score.py) locally consisting of two functions: init(), which loads the model when the service starts, and run(), which scores each incoming request.
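Before deploying, the run() logic can be smoke-tested locally. A minimal sketch using a stub object in place of the pickled regressor (StubModel and its summing behavior are illustrative stand-ins, and numpy is omitted here to keep the check dependency-free):

```python
import json

class StubModel:
    # Mimics the predict() interface of the real regressor
    def predict(self, rows):
        return [sum(row) for row in rows]

model = StubModel()

def run(request_data):
    # Same parsing logic as score.py's run(): unwrap the 'data' key
    rows = json.loads(request_data)['data']
    return model.predict(rows)

payload = json.dumps({'data': [[1.0, 2.0], [3.0, 4.0]]})
print(run(payload))  # [3.0, 7.0]
```

Exercising the request parsing this way catches payload-shape mistakes before they surface as opaque service errors.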
5. Define dependency file:
from azureml.core.conda_dependencies import CondaDependencies
dependency = CondaDependencies()
dependency.add_conda_package("scikit-learn")
dependency.add_pip_package("azureml-model-management-sdk")
with open("HousePricePredictionEnv.yml", "w") as file:
    file.write(dependency.serialize_to_string())
This will create a dependency file (HousePricePredictionEnv.yml) locally, specifying the Python libraries the environment needs to run the model.
6. Create an inference configuration:
The inference configuration specifies the entry script, runtime and dependency file required to set up the environment needed by both the model and the webservice.
from azureml.core.model import InferenceConfig
inference_config = InferenceConfig(entry_script='score.py',
                                   runtime="python",
                                   conda_file="HousePricePredictionEnv.yml")
7. Select deployment/compute target:
The compute target is the computing platform used to host the webservice.
The diagram below can guide you in choosing a deployment target; the selection primarily depends on the cost, availability and processing requirements of the service.
Some examples:
In this article, we will use AKS as the deployment target.
Creating an Azure Kubernetes Service (AKS) webservice requires attaching an AKS cluster to the Azure ML workspace.
from azureml.core.compute import ComputeTarget,AksCompute
provision_config = AksCompute.provisioning_configuration()
deployment_target = ComputeTarget.create(workspace=ml_workspace,
                                         name='Predictionaks-1',
                                         provisioning_configuration=provision_config)

if deployment_target.get_status() != "Succeeded":
    deployment_target.wait_for_completion(show_output=True)
8. Create a deployment configuration:
The deployment configuration defines the characteristics of the compute/deployment target at a granular level, e.g., the number of CPU cores, the amount of memory, etc.
from azureml.core.webservice import AksWebservice
deployment_config = AksWebservice.deploy_configuration(cpu_cores=1,
                                                       memory_gb=1,
                                                       auth_enabled=True,
                                                       enable_app_insights=True,
                                                       description='Service to predict house prices')
9. Deploy ML model as Azure Kubernetes Service/API:
service = Model.deploy(workspace=ml_workspace,
                       name="housepriceprediction-service",
                       models=[ml_model],
                       inference_config=inference_config,
                       deployment_config=deployment_config,
                       deployment_target=deployment_target)
service.wait_for_deployment(show_output = True)
print(service.state)
This will deploy the model and create a consumable RESTful API/webservice.
10. Consume the API:
It involves three steps: retrieving the scoring URI, retrieving the authentication keys, and sending a POST request with the input data.
service_uri = service.scoring_uri
primary, secondary = service.get_keys()
import requests
import json
scoring_uri = service_uri
key = primary
headers = {'Content-Type': 'application/json'}
headers['Authorization'] = f'Bearer {key}'
input_data = [[0.06724, 0.0, 3.24, 0.0, 0.46, 6.333, 17.2, 5.2146, 4.0, 430.0, 16.9, 375.21, 7.34],
              [9.2323, 0.0, 18.1, 0.0, 0.631, 6.216, 100.0, 1.1691, 24.0, 666.0, 20.2, 366.15, 9.53]]
input_data = json.dumps({'data': input_data})
response = requests.post(scoring_uri, data=input_data, headers=headers)
print(response.text)
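The consumption snippet above can be wrapped in a small helper that assembles the headers and body in one place; build_request is a hypothetical convenience function (not part of the Azure ML SDK), and the URI and key below are placeholders:

```python
import json

def build_request(scoring_uri, key, rows):
    # Assemble the pieces of the POST call shown above:
    # JSON content type, bearer-token auth header, and a 'data' payload.
    headers = {'Content-Type': 'application/json',
               'Authorization': f'Bearer {key}'}
    body = json.dumps({'data': rows})
    return scoring_uri, headers, body

uri, headers, body = build_request('https://example.com/score', 'PRIMARY KEY',
                                   [[0.06724, 0.0, 3.24]])
print(headers['Authorization'])   # Bearer PRIMARY KEY
print(json.loads(body)['data'])   # [[0.06724, 0.0, 3.24]]
```

Centralizing the request construction makes it easy to swap in the secondary key or a new scoring URI without touching the calling code.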
Parting thoughts
Azure Machine Learning provides a plethora of options for model deployment, from Azure Kubernetes Service to Azure IoT Edge and many more. The good thing is that the workflow remains reasonably consistent across all of them, with only small, obvious changes in the deployment configuration.
The aim of this article is to get you started with the creation of a simple, secured webservice for ML models using the cardinal controls needed for each operation. Many more controls are available and can be utilized based on your requirements.
I hope this article eases your model deployment journey and helps you create resilient webservices.
Additional Resources
Azure Machine Learning - ML as a Service | Microsoft Azure
How to deploy machine learning models - Azure Machine Learning | Microsoft Docs
Azure Machine Learning SDK for Python - Azure Machine Learning Python | Microsoft Docs
Deploy ML models to Kubernetes Service - Azure Machine Learning | Microsoft Docs
Pricing - Machine Learning | Microsoft Azure
Azure Products by Region | Microsoft Azure