Custom containers in Azure Machine Learning managed online endpoints

Former Employee

Jun 17, 2021

Today, we are announcing the public preview of the ability to use custom Docker containers in Azure Machine Learning online endpoints. In combination with our new 2.0 CLI, this feature enables you to deploy a custom Docker container while getting Azure Machine Learning online endpoints’ built-in monitoring, scaling, and alerting capabilities.

Below, we walk you through how to use this feature to deploy TensorFlow Serving with Azure Machine Learning. The full code is available in our samples repository.

Sample deployment with TensorFlow Serving

To deploy a TensorFlow model with TensorFlow Serving, first create a YAML file:

name: tfserving-endpoint
type: online
auth_mode: aml_token
traffic:
  tfserving: 100

deployments:
  - name: tfserving
    model:
      name: tfserving-mounted
      version: 1
      local_path: ./half_plus_two
    environment_variables:
      MODEL_BASE_PATH: /var/azureml-app/azureml-models/tfserving-mounted/1
      MODEL_NAME: half_plus_two
    environment:
      name: tfserving
      version: 1
      docker:
        image: docker.io/tensorflow/serving:latest
      inference_config:
        liveness_route:
          port: 8501
          path: /v1/models/half_plus_two
        readiness_route:
          port: 8501
          path: /v1/models/half_plus_two
        scoring_route:
          port: 8501
          path: /v1/models/half_plus_two:predict
    instance_type: Standard_F2s_v2
    scale_settings:
      scale_type: manual
      instance_count: 1
      min_instances: 1
      max_instances: 2

Then create your endpoint: