Microsoft Foundry Blog

A Solution for ML Pipelines in a Multi-Tenancy Manner

Microsoft

Apr 26, 2024

Solution Providers often have enterprise scenarios for deploying ML pipelines that involve multiple tenants, each of which may have its own Azure subscription.

Several requirements come up when designing such an enterprise solution, for example:

Each tenant may want to keep its own data private.
Each tenant has its own computing and hosting environment for ML training, retraining, and inferencing.
Each tenant may have different retraining needs, with different schedules and different data sets.
Even for the same ML algorithm, each tenant may use different parameters.
The Solution Provider wants to maintain a centralized repository for all tenants’ data, models, environments, components, etc.
In addition, the Solution Provider wants the flexibility to manage all tenants and to decide when and how tenants share data, models, environments, and pipeline environments. Each tenant also has the flexibility to share these with other tenants.

This multi-tenancy solution lets Solution Providers deploy and manage ML pipelines across multiple workspaces, where each workspace may belong to a different Azure subscription.

The key element here is the Azure Machine Learning (AzureML) registry. It acts as an intermediary through which tenants share data, models, environments, and components, with version control. When creating an AzureML registry, it is essential to make it available in every region where tenants reside. Tenants outside the Primary and Additional regions covered by the registry cannot share data, models, environments, or components through it. In addition, the workspaces must be added to the registry as users, and in each workspace a role (such as Contributor) must be assigned to the ML registry owner at both the subscription level and the workspace level.
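The region rule can be checked before a tenant is onboarded. The following is a minimal sketch in plain Python, not an AzureML API call; the function name and the sample workspace and region values are hypothetical and are only there to illustrate the check.

```python
def workspaces_outside_registry(registry_regions, workspace_regions):
    """Return the workspaces whose region the registry does not cover.

    registry_regions: the registry's Primary and Additional regions.
    workspace_regions: mapping of workspace name -> Azure region.
    """
    # Normalize so that "East US" and "eastus" compare equal.
    def normalize(region):
        return region.replace(" ", "").lower()

    covered = {normalize(region) for region in registry_regions}
    return sorted(
        name
        for name, region in workspace_regions.items()
        if normalize(region) not in covered
    )


# Hypothetical registry with one primary and one additional region.
registry_regions = ["eastus", "westeurope"]
workspace_regions = {
    "workspace1": "eastus",
    "workspace2": "East US",
    "workspace3": "westeurope",
    "workspace4": "japaneast",
}
print(workspaces_outside_registry(registry_regions, workspace_regions))
```

With these sample values only workspace4 is reported, so the registry would need its region added as an Additional region before that tenant could share through it.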

The solution works as follows. Each tenant has its own workspace in its own subscription. Within the workspace, the tenant is self-sufficient in compute and storage resources and can build ML pipelines using its own data, models, environments, and components. If the tenant wants to share any of these with another tenant, it first shares them to the registry, which we call ‘share’ in the picture below; the registry then passes them on to other tenants, which we call ‘push’. Tenants can also get data, models, environments, and components from the registry, which we call ‘pull’.

Share – the tenant (workspace) sends an asset to the registry.
Push – the registry sends an asset to a tenant (workspace).
Pull – the tenant (workspace) gets an asset from the registry.
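The three operations can be pictured with a toy in-memory model. This is only a sketch of the semantics described above, under the assumption that the registry assigns increasing version numbers; the `Registry` and `Workspace` classes here are hypothetical stand-ins and are not the AzureML SDK.

```python
class Registry:
    """Toy stand-in for a registry; it keeps every version of an asset."""

    def __init__(self):
        self._assets = {}  # (kind, name) -> {version: payload}

    def add(self, kind, name, payload):
        versions = self._assets.setdefault((kind, name), {})
        version = len(versions) + 1  # versions are assigned 1, 2, 3, ...
        versions[version] = payload
        return version

    def get(self, kind, name, version=None):
        versions = self._assets[(kind, name)]
        if version is None:
            version = max(versions)  # default to the latest version
        return version, versions[version]


class Workspace:
    """Toy stand-in for a tenant workspace with its own local assets."""

    def __init__(self, name):
        self.name = name
        self.assets = {}  # (kind, name, version) -> payload

    def share(self, registry, kind, name, payload):
        """Share: the workspace sends an asset to the registry."""
        return registry.add(kind, name, payload)

    def pull(self, registry, kind, name, version=None):
        """Pull: the workspace gets an asset from the registry."""
        version, payload = registry.get(kind, name, version)
        self.assets[(kind, name, version)] = payload
        return version


def push(registry, workspace, kind, name, version=None):
    """Push: the registry sends an asset to a workspace."""
    version, payload = registry.get(kind, name, version)
    workspace.assets[(kind, name, version)] = payload
    return version


registry = Registry()
workspace1, workspace3 = Workspace("workspace1"), Workspace("workspace3")

# workspace1 shares a model; the registry then pushes it to workspace3.
shared_version = workspace1.share(registry, "model", "credit-card-default", b"weights")
push(registry, workspace3, "model", "credit-card-default", shared_version)
```

Push and pull move the same asset in the same direction; the difference in this sketch is only who initiates the transfer, the registry or the workspace.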

This kind of solution design can satisfy multiple scenarios of model sharing:

Each tenant can have their specific models without sharing with others.
All tenants can share their models if they want.
All tenants can pull the shared models and retrain or fine-tune, then share back the retrained or fine-tuned models.

Similar scenarios apply to data, environments, and components.

Notes:

If a tenant has multiple subscriptions, sharing is done at the subscription level first and then at the workspace level.
This solution doesn’t apply if the sharing has to be done above the subscription level. That is, if the ML registry can’t access the tenant’s subscription directly, then the ML registry can’t share/push entities to that tenant.

Below is an example of model sharing across four workspaces, where three of them belong to one subscription and the fourth belongs to a different subscription.

In this example, workspace1 shares its model credit-card-default to the registry, and the registry then pushes it to workspace3; workspace3 shares its model bert-case-uncased_fine-tuned to the registry, and the registry then pushes it to workspace1, workspace2, and workspace4. The registry holds both models.

The way share, push, and pull work between a tenant and the registry differs slightly for data, models, environments, and components. ‘Share’ and ‘push’ can be done in both the AzureML UI and the SDK; ‘pull’ can be done in the SDK.

Below is an example of ‘sharing’ a model from workspace1 to the registry using the AzureML UI.

Below are some SDK examples that get (‘pull’) data, models, environments, and components from the registry:
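As a minimal sketch of such a pull, assuming the `azure-ai-ml` (SDK v2) and `azure-identity` packages are installed and the caller has read access to the registry: the registry name `my-registry` and both function names are hypothetical, and the URI helper reflects the `azureml://registries/...` form as I understand it, so treat it as an assumption rather than a specification.

```python
def registry_asset_uri(registry_name, asset_type, name, version):
    """Build the azureml:// URI that identifies a versioned registry asset.

    asset_type is one of "models", "environments", "components", or "data".
    """
    return (
        f"azureml://registries/{registry_name}/{asset_type}/{name}"
        f"/versions/{version}"
    )


def pull_from_registry(registry_name, asset_type, name, version):
    """Fetch one asset from the registry with the azure-ai-ml SDK (v2)."""
    # Imported here so the helper above can be used without the SDK installed.
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    # A client scoped to the registry rather than to a workspace.
    registry_client = MLClient(
        credential=DefaultAzureCredential(), registry_name=registry_name
    )
    operations = {
        "models": registry_client.models,
        "environments": registry_client.environments,
        "components": registry_client.components,
        "data": registry_client.data,
    }
    return operations[asset_type].get(name=name, version=version)


# "my-registry" is a placeholder; the model name comes from the example above.
print(registry_asset_uri("my-registry", "models", "credit-card-default", 1))

# Requires Azure credentials, so it is left commented out here:
# model = pull_from_registry("my-registry", "models", "credit-card-default", "1")
```

The same `get(name=..., version=...)` call shape is used for all four asset types, which is why a single lookup table is enough in this sketch.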

The pulled asset can then be registered in the workspace. Below is an example for an environment:
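One way this registration could look is sketched here. It is an assumption, not necessarily the code the article originally showed: it re-creates an image-based environment in the workspace from the fields of the registry copy, and it does not handle environments built from a Docker build context. The helper names and the sample values are hypothetical.

```python
from types import SimpleNamespace


def environment_fields(registry_env):
    """Collect the fields needed to re-create an environment in a workspace."""
    return {
        "name": registry_env.name,
        "version": str(registry_env.version),
        "description": registry_env.description,
        "image": registry_env.image,
        "conda_file": registry_env.conda_file,
    }


def register_environment(workspace_client, registry_env):
    """Register a copy of a registry environment in the workspace.

    workspace_client is an MLClient created with subscription_id,
    resource_group_name, and workspace_name.
    """
    # Imported here so environment_fields works without the SDK installed.
    from azure.ai.ml.entities import Environment

    workspace_env = Environment(**environment_fields(registry_env))
    return workspace_client.environments.create_or_update(workspace_env)


# Stand-in object with illustrative values, used only to show the mapping.
sample_env = SimpleNamespace(
    name="sklearn-env",
    version=1,
    description=None,
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
    conda_file=None,
)
fields = environment_fields(sample_env)
```

Copying the fields into a fresh `Environment` keeps the workspace copy independent of the registry entry, at the cost of losing the link back to the registry version it came from.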

The cool thing about this solution is that once a tenant gets data, a model, an environment, or a component from the registry that was originally shared by another tenant, it can build its own ML pipeline and run it using its own resources. Besides the SDK approach described above, the tenant can retrieve these assets in AzureML Designer. By filtering on the registry creator, the workspace can see the data, models, and components, and then create a pipeline on the canvas.

Below is an example from workspace4. It gets the model from the registry that originated in workspace3, then uses its own data set and an AzureML pre-built component to create a fine-tuning pipeline. After it runs the pipeline and fine-tunes the model, it can share a new version of the fine-tuned model back to the registry. Remember, workspace4 belongs to a different subscription from workspace3, which is really cool!

The registry ‘shares/pushes’ a model to a workspace by deploying a real-time or batch endpoint to that workspace. The workspace can then run inference using the endpoint. In the example below, workspace4 gets the model from the registry, where it originated from workspace3, and then performs a test using input data.

If the workspace has deployed an endpoint from the shared model and has also pulled the model from the registry, it can add a deployment to the endpoint, as shown below.

Acknowledgement:

Thanks to Daniel Scott-Raynsford and Facundo Santiago for encouraging me to write this article. We are glad to share this solution implementation broadly to help our customers.

Reviewers:

Daniel Scott-Raynsford, Takuto Higuchi, Alex Zeltov

Updated Apr 27, 2024

azure machine learning

classical machine learning

machine learning

Helen_Zeng

Microsoft

5 Comments

pat-h-bsl
Copper Contributor
May 06, 2024
Sure Helen_Zeng a chat sounds good
Helen_Zeng
Microsoft
May 02, 2024
Hi pat-h-bsl, thanks for your comments. I understand your inquiry. There are notes after the second figure: "This solution doesn’t apply if the sharing has to be done above subscription level. That means, if the ML registry can’t access the tenant’s subscription directly, then the ML registry can’t share/push entities to that tenant." Can we set up a meeting to talk about your scenario?
pat-h-bsl
Copper Contributor
May 01, 2024
Thanks, but can you please clarify what you mean by the term "tenant"/"tenancy"? In the language of Azure, I understand "tenant" to have a very specific meaning - the uppermost level of the Azure Management Group hierarchy, under which a tenant can have multiple Azure Subscriptions, which can each have multiple Resource Groups etc. I.e. the kind of "tenant" referenced by this doc: https://learn.microsoft.com/en-us/azure/azure-portal/get-subscription-tenant-id
It's not clear to me whether this is the meaning of your term "tenant" or whether you have a more informal meaning along the lines of "different groups of ML practitioners working loosely together within an organisation".
I landed on this page because the specific problem I am trying to solve is:
Our company has a global federated structure, where we work with different business units that act as their own companies, where each have their own Azure tenancy (in the sense of Microsoft Entra Tenancy I referenced above)
Our ML project has environments (Azure Resource Groups) split across two of these Azure tenancies (Microsoft Entra tenancies). I.e.
Dev environment: Az Tenant 1 -> Az Subscription 1 -> Az Resource Group 1 -> Az ML workspace 1
Prod environment: Az Tenant 2 -> Az Subscription 2 -> Az Resource Group 2 -> Az ML workspace 2
I want to use an AzureML Registry to share models and environments between Dev & Prod
Is it possible in this scenario using standard Azure AD authentication? Or is the AzML registry not set up for cross-sharing between different Microsoft Entra Tenancies?
manojkadam1000
Copper Contributor
Apr 28, 2024
The multi-tenancy solution for deploying ML pipelines across multiple workspaces, each potentially belonging to different Azure subscriptions, involves leveraging Azure Machine Learning (AzureML) Registries as a central hub for sharing data, models, environments, and components among tenants while maintaining version control.
In this setup, each tenant operates within their own workspace using their subscription, ensuring autonomy over computing and storage resources. If a tenant wishes to share resources with another tenant, they first share them with the AzureML registry ("share"). The registry then facilitates the distribution of these resources to other tenants ("push"). Tenants can also retrieve shared resources from the registry as needed ("pull").
This approach accommodates various scenarios:
1. Tenants can keep their models exclusive without sharing.
2. All tenants have the option to share their models.
3. Tenants can pull shared models for retraining or fine-tuning, then share back the updated models.
Sharing, pushing, and pulling of data, models, environments, and components between tenants and the registry can be managed through the AzureML UI and SDK. For example, sharing a model from a workspace to the registry can be done through the AzureML UI, while pulling resources from the registry can be achieved via SDK.
The flexibility of this solution enables tenants to build their ML pipelines using shared resources while utilizing their own resources for execution. Additionally, tenants can access shared resources through AzureML Designer, allowing them to filter and select resources based on the registry creator and incorporate them into their pipelines seamlessly.
Furthermore, tenants can deploy shared models as real-time or batch endpoints in their workspaces for inference tasks. This deployment process involves retrieving the model from the registry and then initiating inference using the deployed endpoint.
Overall, this multi-tenancy solution empowers Solution Providers to efficiently manage ML pipelines across diverse tenant environments, ensuring flexibility, autonomy, and secure resource sharing.
azeltov
Former Employee
Apr 26, 2024
Good job Helen_Zeng !