This document walks through one approach to securing an AzureML environment. All the steps are referenced from docs.microsoft.com. Other approaches are possible, depending on the organization's requirements. The key objective of our implementation is to manage incoming and outgoing network communication.
Details on enterprise security and governance for Azure Machine Learning can be found in the Azure Machine Learning documentation.
We use a combination of Azure PowerShell and the Azure portal to provision and configure the required resources. The portal steps can be automated; however, our focus here is to set up the secure environment and understand the configuration steps.
$tenantID=""
$subscriptionID=""
$resourceGroupName="SecuringAMLSDemo"
$location="westus"
# Connect the Azure Subscription
Connect-AzAccount -Tenant $tenantID -Subscription $subscriptionID
# Create the Resource Group
New-AzResourceGroup -Name $resourceGroupName -Location $location
# Create the virtual Network and Subnet
$vnet = @{
Name = 'myHub'
ResourceGroupName = $resourceGroupName
Location = $location
AddressPrefix = '10.222.0.0/16'
}
$virtualNetwork = New-AzVirtualNetwork @vnet
# Training subnet
$trainingsubnet = @{
Name = 'training'
VirtualNetwork = $virtualNetwork
AddressPrefix = '10.222.0.0/24'
}
$subnetConfig = Add-AzVirtualNetworkSubnetConfig @trainingsubnet
# Scoring subnet
$scoringsubnet = @{
Name = 'scoring'
VirtualNetwork = $virtualNetwork
AddressPrefix = '10.222.1.0/24'
}
$subnetConfig = Add-AzVirtualNetworkSubnetConfig @scoringsubnet
# AzureBastionSubnet
$AzureBastionSubnet = @{
Name = 'AzureBastionSubnet'
VirtualNetwork = $virtualNetwork
AddressPrefix = '10.222.254.0/26'
}
$subnetConfig = Add-AzVirtualNetworkSubnetConfig @AzureBastionSubnet
# GatewaySubnet
$GatewaySubnet = @{
Name = 'GatewaySubnet'
VirtualNetwork = $virtualNetwork
AddressPrefix = '10.222.250.0/24'
}
$subnetConfig = Add-AzVirtualNetworkSubnetConfig @GatewaySubnet
# AzureFirewallSubnet
$AzureFirewallSubnet = @{
Name = 'AzureFirewallSubnet'
VirtualNetwork = $virtualNetwork
AddressPrefix = '10.222.252.0/26'
}
$subnetConfig = Add-AzVirtualNetworkSubnetConfig @AzureFirewallSubnet
# PrivateEndpointSubnet
$PrivateEndpointSubnet = @{
Name = 'PrivateEndpointSubnet'
VirtualNetwork = $virtualNetwork
AddressPrefix = '10.222.2.0/24'
}
$subnetConfig = Add-AzVirtualNetworkSubnetConfig @PrivateEndpointSubnet
$virtualNetwork | Set-AzVirtualNetwork
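Before applying an address plan like the one above, it can help to confirm that every subnet sits inside the VNet prefix and that no two subnets overlap. The following is a minimal sketch using only the Python standard library; the prefixes are copied from the script above, while the helper name check_address_plan is hypothetical and not part of any Azure SDK.

```python
import ipaddress

# Address plan copied from the PowerShell script above.
VNET_PREFIX = "10.222.0.0/16"
SUBNETS = {
    "training": "10.222.0.0/24",
    "scoring": "10.222.1.0/24",
    "PrivateEndpointSubnet": "10.222.2.0/24",
    "GatewaySubnet": "10.222.250.0/24",
    "AzureFirewallSubnet": "10.222.252.0/26",
    "AzureBastionSubnet": "10.222.254.0/26",
}


def check_address_plan(vnet_prefix, subnets):
    """Return a list of problems; an empty list means the plan is consistent.

    Hypothetical helper for illustration only.
    """
    vnet = ipaddress.ip_network(vnet_prefix)
    parsed = {name: ipaddress.ip_network(prefix) for name, prefix in subnets.items()}
    problems = []

    # Every subnet must be contained in the VNet address space.
    for name, net in parsed.items():
        if not net.subnet_of(vnet):
            problems.append(f"{name} ({net}) is outside {vnet}")

    # No two subnets may share addresses.
    names = list(parsed)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if parsed[first].overlaps(parsed[second]):
                problems.append(f"{first} overlaps {second}")
    return problems


if __name__ == "__main__":
    issues = check_address_plan(VNET_PREFIX, SUBNETS)
    print("Address plan OK" if not issues else "\n".join(issues))
```

The check is purely local arithmetic on the prefixes; it does not contact Azure, so it only catches planning mistakes, not deployment errors.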
The New-AzResourceGroupDeployment command below provisions the services that are part of the AzureML workspace deployment.
New-AzResourceGroupDeployment `
-Name "exampledeployment" `
-ResourceGroupName $resourceGroupName `
-TemplateUri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" `
-workspaceName "secureamlsdemo" `
-location $location `
-containerRegistryOption "new" `
-containerRegistrySku "Premium" `
-vnetOption "existing" `
-vnetName "myhub" `
-addressPrefixes "10.222.0.0/16" `
-subnetOption "existing" `
-subnetName "PrivateEndpointSubnet" `
-subnetPrefix "10.222.2.0/24" `
-privateEndpointType "AutoApproval"
We use private link endpoints to bring the PaaS (platform as a service) services inside the private VNet that we created in the earlier step.
IMPORTANT
When the ACR is behind a VNet, AzureML uses a compute cluster to build the Docker image. We therefore need to create a compute cluster, in the same way as described for the compute instance in section 5.6 later in this document. Create the cluster in the training subnet with no public IP. See the AzureML documentation for more details.
Use the script below to set the image_build_compute parameter.
Python SDK:
from azureml.core import Workspace
# Load workspace from an existing config file
ws = Workspace.from_config()
# Update the workspace to use an existing compute cluster
ws.update(image_build_compute = 'aml-cluster')
# To switch back to using ACR to build (if ACR is not in the VNet):
# ws.update(image_build_compute = '')
Azure CLI:
az ml workspace update --name secureamlsdemo --resource-group SecuringAMLSDemo --image-build-compute aml-cluster
Since all the resources are inside the VNet, we can't access them from our local machine. There are two ways to reach them. Azure Bastion is the quicker way to connect; however, it is less cost-efficient because we need to rely on a virtual machine inside the VNet for our development work. A point-to-site VPN is the other approach, and it lets us use our own computer as the development environment.
If we try to access the AzureML workspace from our local machine now, we get an error message, because the AzureML resources deny inbound traffic from the public internet.
In this step, we create a virtual machine in the training subnet of our VNet and connect to it through Azure Bastion. Azure Bastion provides a public IP that acts as the intermediate interface for connecting to the virtual machine.
Name = mybastion
Region = West US
Tier = Standard
Instance Count = 2
Virtual network = myhub
Subnet = AzureBastionSubnet (no other name is allowed here).
Public IP address = create new
Public IP address name = myhub-ip
Public IP address SKU = Standard
This virtual machine serves as a development machine that can be used to connect to the secure AzureML environment.
Here is the configuration for the virtual machine. Note that we do not allow inbound traffic from the public internet: we spin up the VM in the training subnet with no public IP.
In the virtual machine resource, select the Bastion option from the blade and provide the username and password. Once we are connected to the virtual machine, running ipconfig shows a private IP from the training subnet.
We can now access the workspace from the virtual machine.
The documentation has detailed steps that can be followed to set up the environment.
Here are some of the major steps that need to be followed.
10.222.2.4 d6e2c17a-4d2d-42ac-b449-3920810b2775.workspace.westus.api.azureml.ms
10.222.2.4 d6e2c17a-4d2d-42ac-b449-3920810b2775.workspace.westus.cert.api.azureml.ms
10.222.2.5 ml-secureamlsdemo-westus-d6e2c17a-4d2d-42ac-b449-3920810b2775.westus.privatelink.notebooks.azure.net
10.222.2.6 sa5qtd45ryus6lq.blob.core.windows.net
10.222.2.7 sa5qtd45ryus6lq.file.core.windows.net
10.222.2.10 kv5qtd45ryus6lq.vault.azure.net
10.222.2.9 cr5qtd45ryus6lq.azurecr.io
** Update the private IPs and resource names to match your environment.
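Because the IPs and resource names differ per environment, generating the entries from a single mapping can reduce copy-and-paste mistakes. The sketch below assumes the private endpoints live in the PrivateEndpointSubnet (10.222.2.0/24) created earlier; the sample names and IPs are taken from the entries above, and the helper render_hosts_entries is hypothetical.

```python
import ipaddress

# FQDN -> private endpoint IP. Sample values from the entries above;
# replace them with the values from your own environment.
PRIVATE_ENDPOINTS = {
    "sa5qtd45ryus6lq.blob.core.windows.net": "10.222.2.6",
    "sa5qtd45ryus6lq.file.core.windows.net": "10.222.2.7",
    "kv5qtd45ryus6lq.vault.azure.net": "10.222.2.10",
    "cr5qtd45ryus6lq.azurecr.io": "10.222.2.9",
}


def render_hosts_entries(endpoints, subnet_prefix):
    """Return 'IP FQDN' lines, rejecting IPs outside the expected subnet.

    Hypothetical helper for illustration only.
    """
    subnet = ipaddress.ip_network(subnet_prefix)
    lines = []
    for fqdn, ip in endpoints.items():
        if ipaddress.ip_address(ip) not in subnet:
            raise ValueError(f"{ip} for {fqdn} is not in {subnet}")
        lines.append(f"{ip} {fqdn}")
    return lines


if __name__ == "__main__":
    for line in render_hosts_entries(PRIVATE_ENDPOINTS, "10.222.2.0/24"):
        print(line)
```

The subnet check is a simple guard against pasting a public IP or an address from another subnet into the list.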
Before we run an AzureML experiment, we need to create the AzureML compute instance. From the AzureML workspace, select the Compute option. Attach the AzureML compute to the “training” subnet and check “No Public IP.” The No Public IP option is currently in preview. If this option is not available in your region, or if you don’t want to use a preview feature, you can skip this step.
In that case, see the later part of this document, where we discuss setting up the AzureML compute instance with a public IP.
If you receive the following error while creating the resource, disable the two network policies on the subnet. Details can be found in the documentation linked in the error message.
The specified subnet /subscriptions/2e/resourceGroups/SecuringAMLSDemo/providers/Microsoft.Network/virtualNetworks/myHub/subnets/training has PrivateLinkServiceNetworkPolicies or PrivateEndpointNetworkPolicies enabled. Please disable them to provision cluster/instance with no public IP. Please read this document for more details: https://aka.ms/AMLPLNetPolicies
# Subnet that hosts the AzureML compute resources
$virtualSubnetName = "training"
$virtualNetwork = Get-AzVirtualNetwork -Name "myhub" -ResourceGroupName "SecuringAMLSDemo"
# Disable both network policies on the subnet
$selectedsubnet = $virtualNetwork.Subnets | Where-Object { $_.Name -eq $virtualSubnetName }
$selectedsubnet.PrivateLinkServiceNetworkPolicies = "Disabled"
$selectedsubnet.PrivateEndpointNetworkPolicies = "Disabled"
# Persist the change to the virtual network
$virtualNetwork | Set-AzVirtualNetwork
# Display the subnet to confirm the new settings
$selectedsubnet
At this point, the compute resources still have outbound connectivity to the public internet. We want to restrict both the inbound and the outbound traffic of our virtual network.
We create a firewall, a firewall policy, and a public IP resource to set up the firewall.
We create a route table and add a route so that all outbound traffic from the virtual network goes through the firewall.
Next, we set the route. All traffic should go through the virtual appliance (i.e., Azure Firewall).
We then associate the route table with the training and scoring subnets, so that the route applies only to the resources in those subnets.
Once the route is enabled, we can't reach any site from the virtual machine or from the AzureML compute instance.
AzureML compute needs specific application and network rules to work. Following the documentation, we create the application rules and network rules in the firewall policy.
Here are the outbound network rules created for the training and scoring subnets as per the documentation.
Here are the application rules created for the training and scoring subnets as per the documentation.
Destination:
files.pythonhosted.org,mcr.microsoft.com,*.mcr.microsoft.com,graph.windows.net,anaconda.com,*.anaconda.com,*.anaconda.org,pypi.org,cloud.r-project.org,*pytorch.org,*.tensorflow.org,update.code.visualstudio.com,dc.applicationinsights.azure.com,dc.applicationinsights.microsoft.com,dc.services.visualstudio.com
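To reason about which outbound hosts an allowlist like this would cover, the wildcard entries can be approximated with shell-style pattern matching. This is only a sketch: it uses Python's fnmatch as a stand-in, and Azure Firewall's own FQDN matching may differ in edge cases. The helper is_allowed is hypothetical; the patterns are copied from the destination list above.

```python
from fnmatch import fnmatchcase

# Patterns copied from the destination list above.
ALLOWED_FQDNS = [
    "files.pythonhosted.org", "mcr.microsoft.com", "*.mcr.microsoft.com",
    "graph.windows.net", "anaconda.com", "*.anaconda.com", "*.anaconda.org",
    "pypi.org", "cloud.r-project.org", "*pytorch.org", "*.tensorflow.org",
    "update.code.visualstudio.com", "dc.applicationinsights.azure.com",
    "dc.applicationinsights.microsoft.com", "dc.services.visualstudio.com",
]


def is_allowed(hostname, patterns=ALLOWED_FQDNS):
    """Approximate allowlist check; '*' matches any run of characters.

    Hypothetical helper: an approximation, not Azure Firewall's matcher.
    """
    host = hostname.lower().rstrip(".")
    return any(fnmatchcase(host, pattern.lower()) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ["pypi.org", "conda.anaconda.org", "anaconda.org", "example.com"]:
        print(candidate, is_allowed(candidate))
```

Note that under this approximation anaconda.org itself is not covered, because the list contains only the *.anaconda.org wildcard, whereas anaconda.com appears both bare and as a wildcard.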
Using the No Public IP option is the best way to secure the AzureML environment; however, the feature is currently in preview. If it is not available in your region, go ahead with a public IP. We might get the error below while creating the compute resources. To mitigate it, we need to add two user-defined routes for the inbound connections, as explained in the documentation.
Error Message:
The specified Azure ML Compute Instance compute-instance-no-pip encountered an unusable node. Please try to restart the compute instance to recover. If it failed at creation time, please delete and try to recreate the compute instance. If the problem persists, please follow up with Azure Support.
Warning: The following IP ranges or service tags are routed to a NetworkVirtualAppliance or a VirtualNetworkGateway. If the NetworkVirtualAppliance or the VirtualNetworkGateway do not re-route these IP ranges to Internet, that might cause a failure. IP ranges: BatchNodeManagement=[13.86.218.192/27,13.91.55.167/32,13.91.88.93/32,13.91.107.154/32,13.93.206.144/32,40.82.255.64/27,40.112.254.235/32,40.118.208.127/32,104.40.69.159/32,168.62.4.114/32,191.239.18.3/32,191.239.21.73/32,191.239.40.217/32];AzureMachineLearning=[13.86.195.35/32,13.87.160.129/32,40.82.248.80/28,40.112.242.176/28,20.42.0.240/28,40.71.11.64/28,40.78.227.32/28,40.79.154.64/28,52.255.214.109/32,52.255.217.127/32]. For more information about inbound configuration, please refer to https://docs.microsoft.com/azure/machine-learning/how-to-access-azureml-behind-firewall?tabs=ipaddre...
Add the inbound connections to the user-defined routes for the service tags:
az network route-table route create -g securingamlsdemo --route-table-name routetablesecureamls -n AzureMLRoute --address-prefix AzureMachineLearning --next-hop-type Internet
az network route-table route create -g securingamlsdemo --route-table-name routetablesecureamls -n BatchRoute --address-prefix BatchNodeManagement.westus --next-hop-type Internet
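These two routes work because Azure picks the route with the longest matching prefix, so the more specific service-tag routes take precedence over the catch-all route to the firewall. The sketch below illustrates that selection in plain Python. It is a simplification under stated assumptions: each service tag is represented by a single prefix taken from the warning message above (a real tag expands to many prefixes), and the route name DefaultViaFirewall and the helper select_route are hypothetical.

```python
import ipaddress

# (route name, address prefix, next hop type)
# One sample prefix per service tag, taken from the warning message above.
ROUTES = [
    ("DefaultViaFirewall", "0.0.0.0/0", "VirtualAppliance"),  # hypothetical name
    ("AzureMLRoute", "13.86.195.35/32", "Internet"),
    ("BatchRoute", "13.86.218.192/27", "Internet"),
]


def select_route(destination, routes=ROUTES):
    """Return (name, next_hop) of the longest-prefix match, or None.

    Hypothetical helper for illustration only.
    """
    dest = ipaddress.ip_address(destination)
    matches = []
    for name, prefix, next_hop in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, name, next_hop))
    if not matches:
        return None
    # The most specific (longest) prefix wins.
    _, name, next_hop = max(matches, key=lambda match: match[0])
    return name, next_hop


if __name__ == "__main__":
    for ip in ["13.86.195.35", "13.86.218.200", "8.8.8.8"]:
        print(ip, select_route(ip))
```

Traffic to the management addresses therefore bypasses the firewall, while everything else still follows the default route to the virtual appliance.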
To check that the AzureML environment is working, let’s run an AzureML AutoML job.
We can work with the diabetes dataset. Store the data in a container in the storage account that was created earlier while setting up the AzureML environment.
Create a dataset in AzureML studio from the diabetes data, then create an AutoML classification experiment. Set the target column to “Diabetic”.
Hope this helps!