As enterprises adopt generative AI at scale, designing a platform that balances shared services, tenant isolation, and cost transparency becomes critical. In many real-world scenarios, organizations need a standalone AI platform that delivers immediate value without relying on a fully established enterprise landing zone.

This blog draws from experience designing an Azure AI Hub and Spoke architecture for a multi-tenant enterprise customer. The solution uses a shared AI Hub to host common capabilities (Application Gateway, Azure Firewall, API Management, shared AKS, Azure AI Search, and LLM models via Azure AI Foundry) while serving dedicated customers through isolated AI Spokes. Customer-specific application logic is deployed on AKS and consumes shared AI services securely, enabling reuse, isolation, and usage-based chargeback without duplicating infrastructure. The AI Hub itself contains the required network and AI components, allowing the platform to operate independently of a traditional landing zone.

The sections below outline the key architectural choices, isolation approach, and cost considerations involved in building a scalable, enterprise-grade Azure AI platform.
A large enterprise customer adopting AI at scale typically needs three non‑negotiables in its AI foundation:
- End‑to‑end tenant isolation across network, identity, compute, and data
- Secure, governed traffic flow from users to AI services
- Transparent chargeback/showback for shared AI and platform services
At the same time, the platform must enable rapid onboarding of new tenants or applications and scale cleanly from proof‑of‑concept to production.
This article proposes an Azure Landing Zone–aligned architecture using a Hub‑and‑Spoke model, where:
- The AI Hub centralizes shared services and governance
- AI Spokes host tenant‑dedicated AI resources
- Application logic and AI agents run on AKS
The result is a secure, scalable, and operationally efficient enterprise AI foundation.
1. Architecture goals & design principles
Goals
- Host application logic and AI agents on Azure Kubernetes Service (AKS) as custom deployments instead of using the managed agents in Azure AI Foundry
- Enforce strong tenant isolation across all layers
- Support chargeback/showback and per‑tenant cost attribution
- Adopt a Hub‑and‑Spoke model with clear separation of shared vs. tenant‑specific services
Design principles (Azure Landing Zone aligned)
Azure Landing Zone (ALZ) guidance emphasizes:
- Separation of platform and workload subscriptions
- Management groups and policy inheritance
- Centralized connectivity using hub‑and‑spoke networking
- Policy‑driven governance and automation
For infrastructure as code, ALZ‑aligned deployments typically use Bicep or Terraform, increasingly leveraging Azure Verified Modules (AVM) for consistency and long‑term maintainability.
2. Subscription & management group model
A practical enterprise layout looks like this:
- Tenant Root Management Group
  - Platform Management Group
    - Connectivity subscription (Hub VNet, Firewall, DNS, ExpressRoute/VPN)
    - Management subscription (Log Analytics, Monitor)
    - Security subscription (Defender for Cloud, Sentinel if required)
  - AI Hub Management Group
    - AI Hub subscription (shared AI and governance services)
  - AI Spokes Management Group
    - One subscription per tenant, business unit, or regulated boundary
This structure supports enterprise‑scale governance while allowing teams to operate independently within well‑defined guardrails.
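If you are codifying this hierarchy, a tenant-scoped Bicep deployment can create the child management groups in one pass. The sketch below is illustrative only; the group IDs (mg-root, mg-platform, and so on) are placeholder names, not a prescribed convention.

```bicep
// Illustrative child management groups under an assumed root group 'mg-root'.
targetScope = 'tenant'

var childGroups = [
  { name: 'mg-platform', displayName: 'Platform' }
  { name: 'mg-ai-hub', displayName: 'AI Hub' }
  { name: 'mg-ai-spokes', displayName: 'AI Spokes' }
]

resource mgs 'Microsoft.Management/managementGroups@2021-04-01' = [for g in childGroups: {
  name: g.name
  properties: {
    displayName: g.displayName
    details: {
      parent: {
        // Parent all three groups under the assumed enterprise root group.
        id: tenantResourceId('Microsoft.Management/managementGroups', 'mg-root')
      }
    }
  }
}]
```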
3. Logical architecture — AI Hub vs. AI Spoke
AI Hub (central/shared services)
The AI Hub acts as the governed control plane for AI consumption:
- Ingress & edge security: Azure Application Gateway with WAF (or Front Door for global scenarios)
- Central egress control: Azure Firewall with forced tunneling
- API governance: Azure API Management (private/internal mode)
- Shared AI services: Azure OpenAI (shared deployments where appropriate), safety controls
- Monitoring & observability: Azure Monitor, Log Analytics, centralized dashboards
- Governance: Azure Policy, RBAC, naming and tagging standards
All tenant traffic enters through the hub, ensuring consistent enforcement of security, identity, and usage policies.
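As a rough illustration of the API governance layer, the following Bicep sketch deploys API Management in internal VNet mode so it is reachable only from inside the hub. The service name, publisher details, and subnet parameter are placeholders; the SKU choice (Developer vs. Premium) depends on scale and SLA requirements.

```bicep
// Assumed parameter: resource ID of the hub subnet dedicated to APIM.
param hubApimSubnetId string

resource apim 'Microsoft.ApiManagement/service@2022-08-01' = {
  name: 'apim-ai-hub'
  location: resourceGroup().location
  sku: {
    name: 'Developer'   // use Premium for production-grade internal VNet deployments
    capacity: 1
  }
  properties: {
    publisherEmail: 'platform-team@contoso.com'
    publisherName: 'AI Platform Team'
    virtualNetworkType: 'Internal'        // no public endpoint; reachable only inside the hub VNet
    virtualNetworkConfiguration: {
      subnetResourceId: hubApimSubnetId
    }
  }
}
```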
AI Spoke (tenant‑dedicated services)
Each AI Spoke provides a tenant‑isolated data and execution plane:
- Tenant‑dedicated storage accounts and databases
- Vector stores and retrieval systems (Azure AI Search with isolated indexes or services)
- AKS runtime for tenant‑specific AI agents and backend services
- Tenant‑scoped keys, secrets, and identities
4. Logical architecture diagram (Hub vs. Spoke)
5. Network architecture — Hub and Spoke
6. Tenant onboarding & isolation strategy
Tenant onboarding flow
Tenant onboarding is automated using a landing zone vending model:
- Request new tenant or application
- Provision a spoke subscription and baseline policies
- Deploy spoke VNet and peer to hub
- Configure private DNS and firewall routes
- Deploy AKS tenancy and data services
- Register identities and API subscriptions
- Enable monitoring and cost attribution
This approach enables consistent, repeatable onboarding with minimal manual effort.
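Most of these steps are scriptable. As one example, the spoke-to-hub VNet peering created during onboarding can be expressed in Bicep roughly as follows; the hub VNet ID and spoke VNet name are assumed parameters, and the matching hub-to-spoke peering is deployed from the hub subscription.

```bicep
// Assumed parameters for a spoke subscription deployment.
param hubVnetId string          // resource ID of the hub VNet
param spokeVnetName string      // existing spoke VNet in this subscription

resource spokeVnet 'Microsoft.Network/virtualNetworks@2023-09-01' existing = {
  name: spokeVnetName
}

resource spokeToHub 'Microsoft.Network/virtualNetworks/virtualNetworkPeerings@2023-09-01' = {
  parent: spokeVnet
  name: 'peer-to-hub'
  properties: {
    remoteVirtualNetwork: { id: hubVnetId }
    allowVirtualNetworkAccess: true
    allowForwardedTraffic: true   // allow traffic that has been inspected by the hub firewall
    allowGatewayTransit: false
    useRemoteGateways: true       // use the hub's ExpressRoute/VPN gateway, if present
  }
}
```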
Isolation by design
- Network: Dedicated VNets, private endpoints, no public AI endpoints
- Identity: Microsoft Entra ID with tenant‑aware claims and conditional access
- Compute: AKS isolation using namespaces, node pools, or dedicated clusters
- Data: Per‑tenant storage, databases, and vector indexes
7. Identity & access management (Microsoft Entra ID)
Key IAM practices include:
- Central Microsoft Entra ID tenant for authentication and authorization
- Application and workload identities using managed identities
- Tenant context enforced at API Management and propagated downstream
- Conditional Access and least‑privilege RBAC
This ensures zero‑trust access while supporting both internal and partner scenarios.
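For AKS workloads, this typically means workload identity federation: each tenant workload gets a user-assigned managed identity whose federated credential trusts the cluster's OIDC issuer. A minimal sketch, assuming an illustrative namespace tenant-contoso and service account ai-agent-backend:

```bicep
// Assumed parameter: OIDC issuer URL of the AKS cluster.
param aksOidcIssuerUrl string

resource tenantIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
  name: 'id-tenant-contoso-app'
  location: resourceGroup().location
}

resource federatedCred 'Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials@2023-01-31' = {
  parent: tenantIdentity
  name: 'aks-tenant-contoso'
  properties: {
    issuer: aksOidcIssuerUrl
    // Trust only this namespace/service account: system:serviceaccount:<namespace>:<serviceaccount>
    subject: 'system:serviceaccount:tenant-contoso:ai-agent-backend'
    audiences: [ 'api://AzureADTokenExchange' ]
  }
}
```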
8. Secure traffic flow (end‑to‑end)
- User accesses application via Application Gateway + WAF
- Traffic inspected and routed through Azure Firewall
- API Management validates identity, quotas, and tenant context
- AKS workloads invoke AI services over Private Link
- Responses return through the same governed path
This pattern provides full auditability, threat protection, and policy enforcement.
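Step 2 relies on user-defined routes in each spoke so outbound traffic cannot bypass the firewall. A minimal sketch, assuming the firewall's private IP is passed in as a parameter:

```bicep
// Assumed parameter: private IP of the hub Azure Firewall.
param firewallPrivateIp string

resource egressRoutes 'Microsoft.Network/routeTables@2023-09-01' = {
  name: 'rt-spoke-egress'
  location: resourceGroup().location
  properties: {
    routes: [
      {
        name: 'default-via-firewall'
        properties: {
          addressPrefix: '0.0.0.0/0'          // send all egress to the hub firewall
          nextHopType: 'VirtualAppliance'
          nextHopIpAddress: firewallPrivateIp
        }
      }
    ]
  }
}
// Associate this route table with the spoke's AKS and workload subnets.
```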
9. AKS multitenancy options
| Model | When to use | Characteristics |
| --- | --- | --- |
| Namespace per tenant | Default | Cost‑efficient, logical isolation |
| Dedicated node pools | Medium isolation | Reduced noisy‑neighbor risk |
| Dedicated AKS cluster | High compliance | Maximum isolation, higher cost |
Enterprises typically adopt a tiered approach, choosing the isolation level per tenant based on regulatory and risk requirements.
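On a shared cluster, the middle tier is usually implemented with a dedicated user node pool plus taints and labels, so only that tenant's pods are scheduled onto it. A rough Bicep sketch, with the cluster name, tenant label, and VM size as illustrative values:

```bicep
// Assumed existing shared cluster in the spoke.
resource aks 'Microsoft.ContainerService/managedClusters@2024-02-01' existing = {
  name: 'aks-ai-spoke'
}

resource tenantPool 'Microsoft.ContainerService/managedClusters/agentPools@2024-02-01' = {
  parent: aks
  name: 'contosopool'
  properties: {
    mode: 'User'
    vmSize: 'Standard_D8s_v5'
    count: 2
    enableAutoScaling: true
    minCount: 2
    maxCount: 6
    nodeLabels: { tenant: 'contoso' }
    nodeTaints: [ 'tenant=contoso:NoSchedule' ]  // only pods tolerating this taint land on the pool
  }
}
```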
10. Cost management & chargeback model
Tagging strategy (mandatory)
- tenantId
- costCenter
- application
- environment
- owner
Enforced via Azure Policy across all subscriptions.
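A minimal custom policy definition for one of these tags might look like the Bicep sketch below, deployed at management group scope. In practice the built-in "Require a tag on resources" and "Inherit a tag from the resource group" policies cover the same need.

```bicep
// Sketch: deny resource creation when the tenantId tag is missing.
targetScope = 'managementGroup'

resource requireTenantTag 'Microsoft.Authorization/policyDefinitions@2021-06-01' = {
  name: 'deny-missing-tenantid-tag'
  properties: {
    displayName: 'Deny resources without a tenantId tag'
    policyType: 'Custom'
    mode: 'Indexed'
    policyRule: {
      if: {
        field: 'tags[\'tenantId\']'
        exists: 'false'
      }
      then: {
        effect: 'deny'
      }
    }
  }
}

resource assignTenantTagPolicy 'Microsoft.Authorization/policyAssignments@2022-06-01' = {
  name: 'require-tenantid-tag'
  properties: {
    displayName: 'Require tenantId tag'
    policyDefinitionId: requireTenantTag.id
  }
}
```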
Chargeback approach
- Dedicated spoke resources: Direct attribution via subscription and tags
- Shared hub resources: Allocated using usage telemetry
  - API calls and token usage from API Management
  - CPU/memory usage from AKS namespaces
Cost data is exported to Azure Cost Management and visualized using Power BI to support showback and chargeback.
11. Security controls checklist
- Private endpoints for AI services, storage, and search
- No public network access for sensitive services
- Azure Firewall for centralized egress and inspection
- WAF for OWASP protection
- Azure Policy for governance and compliance
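To ground the first two items, the sketch below shows a private endpoint for a shared Azure OpenAI account in the hub. The subnet and account resource IDs are assumed parameters, and the matching private DNS zone is noted in the trailing comment.

```bicep
// Assumed parameters.
param peSubnetId string         // subnet reserved for private endpoints
param openAiAccountId string    // resource ID of the Azure OpenAI account

resource openAiPe 'Microsoft.Network/privateEndpoints@2023-09-01' = {
  name: 'pe-openai-hub'
  location: resourceGroup().location
  properties: {
    subnet: { id: peSubnetId }
    privateLinkServiceConnections: [
      {
        name: 'openai-connection'
        properties: {
          privateLinkServiceId: openAiAccountId
          groupIds: [ 'account' ]   // sub-resource for Cognitive Services / Azure OpenAI
        }
      }
    ]
  }
}
// Pair with a privatelink.openai.azure.com private DNS zone and disable
// public network access on the account itself.
```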
12. Deployment & automation
- Foundation: Azure Landing Zone accelerators (Bicep or Terraform)
- Workloads: Modular IaC for hub and spokes
- AKS apps: GitOps (Flux or Argo CD)
- Observability: Policy‑driven diagnostics and centralized logging
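Policy-driven diagnostics is usually delivered through a deployIfNotExists policy, but the per-resource diagnostic setting it creates looks roughly like this sketch (shown for the hub API Management instance, with the workspace ID as an assumed parameter):

```bicep
// Assumed parameter: central Log Analytics workspace resource ID.
param logAnalyticsWorkspaceId string

resource apimService 'Microsoft.ApiManagement/service@2022-08-01' existing = {
  name: 'apim-ai-hub'
}

resource apimDiagnostics 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = {
  name: 'send-to-central-logs'
  scope: apimService
  properties: {
    workspaceId: logAnalyticsWorkspaceId
    logs: [
      { categoryGroup: 'allLogs', enabled: true }
    ]
    metrics: [
      { category: 'AllMetrics', enabled: true }
    ]
  }
}
```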
13. Final thoughts
This Azure AI Landing Zone design provides a repeatable, secure, and enterprise‑ready foundation for any large customer adopting AI at scale.
By combining:
- Hub‑and‑Spoke networking
- AKS‑based AI agents
- Strong tenant isolation
- FinOps‑ready chargeback
- Azure Landing Zone best practices
organizations can confidently move AI workloads from experimentation to production—without sacrificing security, governance, or cost transparency.
Disclaimer:
While the above article discusses hosting custom agents on AKS alongside customer-developed application logic, the following sections focus on a baseline deployment model with no customizations. This approach uses Azure AI Foundry, where models and agents are fully managed by Azure: centrally governed LLMs are hosted in Azure AI Foundry in the AI Hub, and agents are deployed in a spoke environment.
🚀 Get Started: Building a Secure & Scalable Azure AI Platform
To help you accelerate your Azure AI journey, Microsoft and the community provide several reference architectures, solution accelerators, and best-practice guides. Together, these form a strong foundation for designing secure, governed, and cost-efficient GenAI and AI workloads at scale.
Below is a recommended starting path.
1️⃣ AI Landing Zone (Foundation)
Purpose: Establish a secure, enterprise-ready foundation for AI workloads.
The AI Landing Zone extends the standard Azure Landing Zone with AI-specific considerations such as:
- Network isolation and hub-spoke design
- Identity and access control for AI services
- Secure connectivity to data sources
- Alignment with enterprise governance and compliance
🔗 AI Landing Zone (GitHub):
https://github.com/Azure/AI-Landing-Zones?tab=readme-ov-file
👉 Start here if you want a standardized baseline before onboarding any AI workloads.
2️⃣ AI Hub Gateway – Solution Accelerator
Purpose: Centralize and control access to AI services across multiple teams or customers.
The AI Hub Gateway Solution Accelerator helps you:
- Expose AI capabilities (models, agents, APIs) via a centralized gateway
- Apply consistent security, routing, and traffic controls
- Support both Chat UI and API-based consumption
- Enable multi-team or multi-tenant AI usage patterns
🔗 AI Hub Gateway Solution Accelerator:
https://github.com/mohamedsaif/ai-hub-gateway-landing-zone?tab=readme-ov-file
👉 Ideal when you want a shared AI platform with controlled access and visibility.
3️⃣ Citadel Governance Hub (Advanced Governance)
Purpose: Enforce strong governance, compliance, and guardrails for AI usage.
The Citadel Governance Hub builds on top of the AI Hub Gateway and focuses on:
- Policy enforcement for AI usage
- Centralized governance controls
- Secure onboarding of teams and workloads
- Alignment with enterprise risk and compliance requirements
🔗 Citadel Governance Hub (README):
https://github.com/Azure-Samples/ai-hub-gateway-solution-accelerator/blob/citadel-v1/README.md
👉 Recommended for regulated environments or large enterprises with strict governance needs.
4️⃣ AKS Cost Analysis (Operational Excellence)
Purpose: Understand and optimize the cost of running AI workloads on AKS.
AI platforms often rely on AKS for agents, inference services, and gateways. This guide explains:
- How AKS costs are calculated
- How to analyze node, pod, and workload costs
- Techniques to optimize cluster spend
🔗 AKS Cost Analysis:
https://learn.microsoft.com/en-us/azure/aks/cost-analysis
👉 Use this early to avoid unexpected cost overruns as AI usage scales.
5️⃣ AKS Multi-Tenancy & Cluster Isolation
Purpose: Safely run workloads for multiple teams or customers on AKS.
This guidance covers:
- Namespace vs cluster isolation strategies
- Security and blast-radius considerations
- When to use shared clusters vs dedicated clusters
- Best practices for multi-tenant AKS platforms
🔗 AKS Multi-Tenancy & Cluster Isolation:
https://learn.microsoft.com/en-us/azure/aks/operator-best-practices-cluster-isolation
👉 Critical reading if your AI platform supports multiple teams, business units, or customers.
🧭 Suggested Learning Path
If you’re new, follow this order:
- AI Landing Zone → build the foundation
- AI Hub Gateway → centralize AI access
- Citadel Governance Hub → enforce guardrails
- AKS Cost Analysis → control spend
- AKS Multi-Tenancy → scale securely