Best practices for Infrastructure as Code CI/CD on Azure
Hello Folks!

If your IaC repo has a dev folder, a test folder, and a prod folder that all started out identical and have since drifted in three different directions, this session is for you. At the Microsoft Azure Infrastructure Summit 2026, Jack Tracey and Jared Holgate (the team behind Azure Landing Zones and Azure Verified Modules) laid out, in plain language, how to ship Infrastructure as Code on Azure without leaking secrets, blowing up production, or duplicating thousands of lines of module code across folders. Here are the bits that matter most for IT Pros and platform engineers.

Why IT Pros Should Care

You are the one paged at 2am when a pipeline rolls out a broken NSG rule. You are the one carrying the cert that the deploy service principal still uses. You are the one explaining to audit why the prod plan and the prod apply ran with the same Owner-scoped identity. So this session is squarely in your lane. It covers:

- Why hand-rolled modules are slowly becoming an anti-pattern on Azure.
- A repo layout that scales to dozens of environments without copy-paste.
- How to get rid of static client secrets and certificate-based auth, for good.
- Where approvals actually need to live in GitHub vs. Azure DevOps so they cannot be bypassed.
- The three-layer Terraform state model that Microsoft uses inside Azure Landing Zones.

In short, this is the practitioner version of “do IaC properly,” from the people who write the platform code Microsoft ships.

The IaC CI/CD problem

Jack opened with a slide that gets a knowing laugh from anyone who has been doing this for more than a year. You start with one repo, one Bicep file, one happy team. Eighteen months later, you have a landingzone-prod-v2-final-USE-THIS-ONE folder, a service principal whose secret expired two days ago, and a pipeline nobody dares touch. The drivers of that pain are consistent:

- Modules written from scratch, never tested the same way twice.
- Per-environment folders that diverge silently over time.
- Long-lived secrets and certificates sitting in pipeline variables.
- One identity doing both plan and apply, with Owner on the management group.
- No approvals, or approvals in the wrong place.
- No tests until the deploy fails in prod.

The good news is that none of these problems is new, and the patterns to fix them are well understood. The session walks through them in the order you would actually adopt them.

Patterns that work in production

1. Don’t write modules. Consume Azure Verified Modules.

This is best practice number one, and Jack and Jared spent a full chapter on it for a reason. Azure Verified Modules (AVM) is the official Microsoft initiative that consolidates IaC modules for Azure into a single, supported, Well-Architected-aligned library, available in both Bicep and Terraform. The Bicep versions live in the Public Bicep Registry under the avm/ namespace. The Terraform versions live on the HashiCorp Terraform Registry under Azure/avm-*.

What you get for free when you consume an AVM module:

- Defaults that line up with the Well-Architected Framework (RBAC over access policies, TLS 1.2, private endpoint support out of the box).
- Semantic versioning so you can pin and review the diff before upgrading.
- Deployment tests on every module, run by the AVM team.
- A real Microsoft support path, not a random GitHub issue.

A great backchannel question came up about brownfield. Jared’s answer: AVM is just standard IaC, no special tooling. In Bicep, brownfield adoption is straightforward because there is no state. In Terraform, the new import blocks make it less painful than it used to be.

2. One folder, one source of truth

Repo layout is where most teams go wrong, and the fix is simple. You should have one set of module code, and per-environment differences should be expressed as data, not as duplicated code. In Bicep, that means a single main.bicep and one .bicepparam file per environment. In Terraform, it means the same main.tf with one .tfvars file per environment.

If you find yourself copying a module folder to dev, test, and prod, stop. Within six months those three folders will not look the same, and at that point you no longer have IaC, you have three handwritten environments that happen to be checked into Git.

3. Kill static secrets. Use Workload Identity Federation.

This was the chat highlight. The question came in: “So in short, replace all service principals with credential secrets with user-assigned managed identity?” Jack and Jared both replied within seconds: yes, 10 points to you.

Workload Identity Federation (OIDC) lets your GitHub Actions or Azure DevOps pipeline exchange a short-lived token from its own OIDC provider for a Microsoft Entra ID token. No client secrets, no certs to rotate, no Key Vault dance to retrieve them. A couple of things to know:

- Subject claim format differs by platform. GitHub uses repo:org/repo:environment:prod style claims; Azure DevOps uses sc://org/project/connection. Pick the wrong one and the token exchange fails.
- Use a user-assigned managed identity as the target. It survives the pipeline being deleted and gives you one place to manage role assignments.
- The Azure Bicep Deploy GitHub Action and the official AzureRM / AzAPI Terraform providers all support OIDC natively.

4. Split plan from apply

Even with OIDC, a single Owner-scoped identity that does both terraform plan and terraform apply is a problem. Plan needs Reader (and a few read-data permissions). Apply needs Contributor or Owner depending on what you deploy. Split them into two identities, federated to two different stages of your pipeline, and you have a real least-privilege story to take to your security team.

Securing the pipeline

Auth is half the story. The other half is making sure only the right pipelines, with the right approvals, can use those identities at all.

Governed templates. Keep reusable pipeline templates in a separate, locked-down repo. Pin federated credentials or service connections to those templates via the job_workflow_ref claim on GitHub or required template checks on Azure DevOps. If someone forks the workflow, the OIDC exchange refuses to issue a token.

Approvals in the right place. On GitHub, use Environments and require reviewers on prod. On Azure DevOps, put the approval on the Service Connection, not the Environment. The Environment approval can be bypassed by a clever YAML author. The Service Connection approval cannot.

Shift left, hard. Pre-commit hooks for bicep format and terraform fmt, lint on every PR, GitHub Advanced Security for secret and code scanning, automated tests on PRs, and ephemeral test environments spun up per PR and torn down at the end. One attendee mentioned using Pester for end-to-end infra tests against a sandbox sub. That is exactly the pattern.

Three-layer state. For Terraform on Azure Landing Zones, the recommended split is: platform landing zone (one state), application landing zone / subscription vending (one state per landing zone), application workload (one state per workload). Never collapse all subs into one state file. You will regret it the first time someone runs apply at the wrong time.

Getting Started

You do not have to do all of this at once. Pick the highest-pain item first.

- Still using client secrets in pipelines? Fix that this sprint. Wire up OIDC and a user-assigned managed identity.
- Drifting per-environment folders? Consolidate to one module plus per-env param files.
- Writing your own storage account module for the fifth time? Try the matching AVM module from the registry.
- Put approvals on the Service Connection (ADO) or Environment (GitHub) for prod.
- Add linting and pre-commit hooks.
- Split plan and apply identities.
- Layer your Terraform state.

It is a roadmap, not a weekend project. Every step pays back the moment you take it.

Resources

- Azure Verified Modules portal: the official AVM home, with module indexes for Bicep and Terraform, specs, and FAQ.
- Azure Verified Modules on GitHub: the tracking repo and source of truth for module proposals.
- Bicep on Microsoft Learn: official language docs, deployment guidance, and references for the public registry.
- Azure Bicep Deploy GitHub Action: the OIDC-friendly action for deploying Bicep from GitHub Actions.
- GitHub Actions for Azure on Microsoft Learn: Workload Identity Federation setup for GitHub Actions targeting Azure.
- Configuring OpenID Connect in Azure (GitHub Docs): the canonical OIDC subject claims and federated credential walkthrough for GitHub.
- Azure Pipelines documentation: service connections, approvals and checks, required templates, and YAML reference.

Watch the rest of the Summit

This session was one of many at the Microsoft Azure Infrastructure Summit 2026. If you want the keynotes, the Bicep deep dives, the AKS sessions, and the storage track, the full playlist is here: Microsoft Azure Infra Summit 2026 playlist

Cheers!
Pierre Roman

Building Secure, Well-Architected Azure Workloads with Azure Verified Modules and GitHub Copilot
Hello Folks!

If you have been writing Bicep or Terraform for Azure over the last few years, you have probably lived this story. You pick a community module, it works great for six months, then the maintainer moves on, issues stop getting answered, and you are stuck owning code you never wrote. At the Microsoft Azure Infra Summit 2026, Jack Tracey and Jared Holgate (tech leads on the Azure Verified Modules project) walked us through how AVM solves that, and how pairing it with GitHub Copilot and Spec Kit changes the way IT pros build Azure workloads.

Why IT Pros Should Care

This is not a developer-only topic. If you are the person responsible for landing zones, platform engineering, or the IaC pipelines that other teams ship through, this hits you directly.

- You stop owning home-grown storage account and VNet modules that no two teams write the same way.
- You get secure-by-default resources without having to draft a 40-page internal coding standard.
- You can let application teams move fast without sacrificing the Well-Architected Framework guardrails you care about.
- You get a supported, Microsoft-backed module library with a clear lifecycle, instead of betting on an abandoned repo.
- You finally have a deterministic way to put AI to work on infrastructure code without it inventing things you do not want in production.

If any of that sounds like a Tuesday for you, this session is worth 40 minutes.

What are Azure Verified Modules

Azure Verified Modules (AVM) is the official Microsoft infrastructure-as-code module library for both Bicep and Terraform. Jack put it plainly in the session: AVM is the one-time solution that is not going to go away, with ownership, a defined lifecycle, structure, and well-defined specifications.

Here is what makes AVM different from the previous landscape of community repos:

- It is supported in multiple IaC languages today (Bicep and Terraform), with consistent specifications across both.
- Modules are aligned to the Azure Well-Architected Framework by default. Zone redundancy on, public IPs off, sensible TLS minimums, right out of the box. Everything is still flexible; you can override any of it via a parameter or variable.
- It is open source. People inside and outside Microsoft can contribute and maintain modules.
- It consolidates the older CARML and Terraform Verified Modules efforts under one roof, owned by Microsoft FTEs and backed by the AVM core team.

AVM has three module classifications, and understanding them is half the battle:

- Resource modules. A one-to-one mapping to a single resource type, like a storage account or a virtual network. Need ten of them? Loop the module ten times.
- Pattern modules. A collection of resources, usually built on top of resource modules, that delivers a bigger slice of an architecture. The Azure Landing Zone is roughly five pattern modules behind the scenes.
- Utility modules. Helpers you probably never call directly, but that the library uses for things like region lookups, SKU availability, and naming standards.

One thing that gets undersold: AVM is not just for you. The Azure Developer CLI templates use it. Azure Landing Zone and Sovereign Landing Zone are built on it. Internal Microsoft service teams use it. When you adopt AVM, you are using the same building blocks Microsoft uses.

Pairing AVM with GitHub Copilot

This is where the session gets interesting. AVM gives you the trusted Lego bricks. GitHub Copilot gives you a coding assistant. The problem, as Jack called out, is that AI is non-deterministic by default. It is great at solving ambiguous problems, but you cannot just point it at a blank repo and trust it to stamp out production infrastructure. That is the gap spec-driven development is designed to fill.

Spec-driven development is a documentation-first approach. Instead of telling Copilot “write me a Terraform module for a hub-spoke network,” you write a structured specification up front that captures intent, quality bar, security requirements, and coding standards. The AI then uses that spec as the contract, generates code, validates against it, and loops until the output matches what you asked for.

Jared walked through Spec Kit, the open source toolkit maintained by GitHub and Microsoft, which formalizes this into eight steps:

1. Constitution. The non-negotiables. “We must use AVM. We must comply with PCI. Optimize for cost.” This is your project DNA.
2. Specify. What you actually want to build, focused on user goals and outcomes, not implementation details.
3. Clarify. Copilot scans the spec, finds ambiguities, and asks you targeted questions (IP ranges, bastion SKUs, anything that is fuzzy).
4. Plan. A technical plan that maps the spec to your standards and constraints.
5. Checklist. A quality checklist the agent uses later to validate its own work.
6. Tasks. The plan broken down into small, reviewable steps.
7. Analyze. A consolidated report across the spec, plan, and tasks so you can sanity check the whole package.
8. Implement. Copilot finally writes the code, validating against everything above as it goes.

The critical detail: at every one of those gates, you review. You are still the human in the loop. The AI is not flying solo, and you are not signing off on a thousand-line code dump.

When you wire AVM into the constitution (“use AVM modules wherever possible”), Copilot stops trying to hand-roll raw resource declarations. It composes solutions out of trusted, tested, WAF-aligned modules. That is what makes the combination so powerful.

Spec Kit is not the only option. Jack mentioned two others worth knowing about:

- OpenSpec. Leaner than Spec Kit, brownfield-first, aimed at smaller experienced teams.
- Squad. A completely different model built by a Microsoft team. No specs. Instead, a virtual team of agent personas (IaC specialist, UX, deployment, an orchestrator called Ralph) that collaborate to deliver work. Worth a look if your style is more agent-team than document-first.

Real-world value

So what does this actually buy you when Monday morning hits?

- Speed without sacrificing the bar. Application teams stop writing storage account boilerplate. They focus on what the workload needs to do, and the AVM modules handle the resilient, compliant defaults.
- Compliance becomes additive, not a rewrite. If you need to add HIPAA or NIST compliance later, you add another spec on top of your existing constitution and iterate. You do not throw out your modules.
- Less ambiguity loop, fewer tokens burned. A good spec up front means fewer Copilot iterations. You get to a working answer faster, with less back and forth.
- Trust in the AI output. Because AVM modules are tested, supported, and WAF-aligned, what Copilot stitches together is built on solid foundations. You can review the spec instead of every line of Terraform.
- Your developers shift up the stack. They stop writing IaC primitives and start designing architectures and requirements. That is where the business value lives anyway.

A note on tradeoffs. AVM modules are intentionally generic and flexible, so you sometimes get parameters you do not need, and the well-architected defaults can be too opinionated for your scenario. The fix is simple: override the parameter. You are trading some control for a lot of consistency, and for most teams that trade is the right one.

Getting Started

If you want to try this for yourself, here is the path I would take:

1. Go to aka.ms/AVM and bookmark it. Everything starts there.
2. Browse the Bicep and Terraform module indexes. Find the resource you would normally hand-write and try the AVM version in a dev subscription.
3. Read the AVM specifications so you understand the contract every module follows. It makes the parameter sets a lot less surprising.
4. Install Spec Kit via the Specify CLI (the GitHub repo has the instructions) and try the AVM example under the experimental “AI-Assisted Solution Development” section on the AVM site.
5. Run the eight-step Spec Kit flow against a small workload. Do not start with your production landing zone. Pick something contained, like a single app with a web tier, a database, and a Key Vault.
6. Keep the human in the loop. Review every spec gate. That is where the quality comes from.

Resources

- Azure Verified Modules portal (aka.ms/AVM)
- Azure Verified Modules on GitHub
- Azure Verified Modules on Microsoft Learn
- GitHub Spec Kit
- Spec-driven development with AI (GitHub Blog)
- Implement spec-driven development with Spec Kit (Microsoft Learn)
- GitHub Copilot
- Azure Well-Architected Framework

Watch the rest of the Summit

If you found this useful, there is a lot more where it came from. The Microsoft Azure Infra Summit 2026 playlist covers landing zones, deployment stacks, AKS networking, storage, and the AI side of platform operations. Block out an afternoon and binge it.

Microsoft Azure Infra Summit 2026 on YouTube

Cheers!
Pierre Roman

Azure Retirement Livestream - Please register! Session 2 - Tracking ID: XTKT-BW8
Join our upcoming live webcast for a transparent discussion about this upcoming Azure retirement, led by our engineering teams.

General Purpose v1 (GPv1) Storage Accounts
Tracking ID: XTKT-BW8 | Retirement Date: 13 October 2026

Same content presented in both sessions: pick the one that works best for your timezone!

What to expect

- 📚 Understand: What will happen, the timelines for the change, and how you can manage it
- 💬 Ask: Live Q&A with our engineering experts throughout the session
- 🛠 Learn: How to manage the change smoothly

Choose your session

Session 1: 14:30 UTC, Thursday, 25 June 2026
8:30 AM US Pacific (PDT), 11:30 AM US Eastern (EDT), 4:30 PM London (BST), 12:30 AM +1 Beijing (CST), 3:30 AM +1 Sydney (AEDT), 5:30 AM +1 Auckland (NZDT)

Session 2: 04:30 UTC, Friday, 26 June 2026
8:30 PM -1 US Pacific (PDT), 11:30 PM US Eastern (EDT), 4:30 AM London (BST), 12:30 PM Beijing (CST), 3:30 PM Sydney (AEDT), 5:30 PM Auckland (NZDT)

Our engineering leaders

George Trossell, Senior Product Manager, Azure Networking

⚠️ Prepare before the livestream

Read the Post Incident Review (PIR) ahead of time so you can ask any follow up questions during the live Q&A

Helpful resources

- 🔔 Azure Service Health Alerts: Get alerts for relevant incidents by setting up notifications via email, SMS, or webhook
- 🎥 Past Retrospective Recordings: Watch recordings of previous retrospective livestreams
- 📄 Azure Post Incident Reviews: Learn more about PIRs and the retrospective program

MCP Server Authorization with Azure API Management: From Simple to Advanced
Why put API Management in front of your MCP servers

The Model Context Protocol (MCP) has quickly become the standard way for AI agents, such as GitHub Copilot in VS Code, to reach external tools and data. As soon as an MCP server does anything meaningful, the same questions that govern any API resurface: who is allowed to call it, what are they allowed to do, and how do you enforce that consistently across many servers without rewriting each one.

Azure API Management (APIM) answers those questions for MCP. It sits between the MCP client and the tool backend and applies the controls you already trust for REST APIs: identity validation, OAuth, rate limiting, IP filtering, and observability. Crucially, APIM speaks the MCP authorization specification, which is built on OAuth 2.1 and Protected Resource Metadata (PRM, RFC 9728). That means APIM can do more than block bad requests. It can actively drive an interactive sign-in from the IDE, so the user logs in with their own identity and the agent acts on their behalf.

This article walks through a progression of authorization scenarios, each one building on the last:

1. The simple case: validate a token and block everything else.
2. Triggering an interactive sign-in from VS Code for an MCP server that APIM hosts from your own APIs.
3. Going beyond "is this a tenant user" to "does this user have the right attribute" with Entra app roles.
4. Fronting an existing external MCP server and letting it drive its own OAuth flow (GitHub as the example).
5. Governing which tools of an existing MCP server an agent is actually allowed to invoke.

APIM MCP capabilities and the basic authorization options

API Management exposes MCP servers in two distinct ways, and the authorization story differs slightly for each.

- Expose a REST API as an MCP server. APIM takes an API it already manages and projects selected operations as MCP tools. You own the operations, so you choose exactly which ones become tools at configuration time. This is the right mode when the capability you want to expose is an API you control.
- Expose an existing MCP server (passthrough). APIM fronts a remote MCP-compatible server (LangChain, an Azure Function, GitHub's remote MCP server, your own container) and relays the MCP protocol to it. APIM governs access, but the upstream server still owns its tool catalog.

On top of either mode, you have a spectrum of authorization options:

- Subscription keys for simple, machine-to-machine access where a shared secret in a header is acceptable.
- Token validation with Microsoft Entra ID, where APIM acts as the protected resource and verifies a bearer token on every call.
- Interactive OAuth 2.1 sign-in, where APIM advertises Protected Resource Metadata so an MCP client can discover the authorization server, log the user in, and retry with a user token.
- Authorization passthrough, where an external MCP server presents its own authorization challenge and APIM relays it faithfully so the client authenticates directly against the upstream's identity provider.

The rest of the article works through these options in increasing order of capability.

The example setup

The walkthroughs in the first three scenarios all use the same backend so you can reproduce them without standing up anything of your own: the publicly available Star Wars API. It is a simple, read-friendly REST API (characters, films, planets, starships, and so on) imported into API Management as a normal API and then projected as an MCP server.

The reason this single API is enough to illustrate the whole progression is that, in API Management, one underlying API can back several independent MCP servers, each exposing a different slice of its operations. For example, you can create:

- A read-only MCP server that exposes only the GET operations, for agents that should be able to query data but never change it.
- A write-capable MCP server that exposes the POST, PUT, or DELETE operations, for trusted automation that is allowed to mutate state.

Same backend API, two MCP servers, two different tool surfaces. Each of these servers is an independent resource in APIM, so each one can carry its own authorization. Both can require an authenticated user (Scenarios 1 and 2), and you can go further by protecting only the sensitive one: gate the write-capable server behind an Entra app role so that, even among authenticated users, only those who carry a specific claim can reach the mutating tools. That app-role mechanism is the subject of Scenario 3, and it composes naturally with the multi-server split described here.

Registering the MCP API in Microsoft Entra ID

Before any of the policies below can validate a token, you need an application registration in Microsoft Entra ID that represents the MCP API. This registration is what defines the audience and scope that tokens are issued for, and it is the source of the mcp-audience, mcp-scope, and (indirectly) mcp-client-id values that the policies reference. Create it once and reuse it across all the MCP servers in this article.

1. In the Azure portal, open Microsoft Entra ID, then App registrations, then New registration. Name it (for example, star-wars-mcp-api), choose single-tenant, and register. Record the Application (client) ID and the Directory (tenant) ID.
2. Open Expose an API and add an Application ID URI. Accept the default api://<app-id>. This URI is your token audience.
3. Still under Expose an API, add a delegated scope named MCP.Access, set its consent display name and description, set the state to Enabled, and save.
4. Authorize the client that will request the scope. Under Expose an API, select Add a client application and enter the client ID of the MCP client. For VS Code, this is the built-in Microsoft authentication client aebc6443-996d-45c2-90f0-388ff96faa56. Check the MCP.Access scope and save.
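When checking that the registration is wired up as intended, it can help to look at the claims inside a token issued for it. The following is a minimal Python sketch for inspection only: it decodes the payload of a JWT without verifying the signature, so it is a debugging aid and not a substitute for the APIM validation policy. The helper names and the locally assembled sample token are illustrative assumptions, and the claim values are placeholders rather than output from a real tenant.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # base64url segments drop their padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (used to build the sample)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical token assembled locally from placeholder values; a real token
# would come from Microsoft Entra ID and carry a real signature.
sample_claims = {
    "aud": "api://22222222-2222-2222-2222-222222222222",  # Application ID URI
    "tid": "11111111-1111-1111-1111-111111111111",        # Directory (tenant) ID
    "appid": "aebc6443-996d-45c2-90f0-388ff96faa56",      # calling client application
    "scp": "MCP.Access",                                  # delegated scope
}
sample_token = ".".join(
    [_b64url({"alg": "none"}), _b64url(sample_claims), "signature"]
)

claims = decode_jwt_payload(sample_token)
print(claims["aud"], claims["scp"])
```

If the aud or scp values in a real token differ from what the registration defines, the mismatch usually points back to the Application ID URI or the scope configured under Expose an API.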
The registration steps produce the four constants the validation policy needs:

| Named value | Comes from | Example |
| --- | --- | --- |
| entra-tenant-id | The Directory (tenant) ID from step 1 | 11111111-1111-1111-1111-111111111111 |
| mcp-audience | The Application ID URI from step 2 | api://22222222-2222-2222-2222-222222222222 |
| mcp-scope | The scope name from step 3 | MCP.Access |
| mcp-client-id | The client ID of the calling app from step 4 | aebc6443-996d-45c2-90f0-388ff96faa56 |

> [!NOTE]
> mcp-client-id is the identity of the application calling the MCP server, not the MCP API itself. For VS Code it is the built-in Microsoft authentication client, and its value lands in the token's appid claim, which is why the validation policy lists it under client-application-ids. If your tenant blocks the first-party VS Code client, register your own public client application and use its client ID instead.

> [!TIP]
> For the privileged-access feature in Scenario 3, you will also declare an app role on this same registration. You do not need it yet, but it is convenient to know that all identity configuration for these servers lives on this one app registration.

With that backend and structure in mind, the scenarios below build up the authorization model one capability at a time.

Scenario 1: The simple case, validate the token and block unauthorized access

The most basic protection is to require a valid Entra ID token on every MCP request and reject anything that fails validation. No interactive flow, no roles, just a gate.

APIM does this with the validate-azure-ad-token policy. The policy checks the issuing tenant, the audience (your MCP API), the calling client application, and the required scope. Anything that does not satisfy all four is rejected with a 401.

    <policies>
        <inbound>
            <base />
            <validate-azure-ad-token tenant-id="{{entra-tenant-id}}" header-name="Authorization" failed-validation-httpcode="401" failed-validation-error-message="Unauthorized. Access token is missing or invalid.">
                <client-application-ids>
                    <application-id>{{mcp-client-id}}</application-id>
                </client-application-ids>
                <audiences>
                    <audience>{{mcp-audience}}</audience>
                </audiences>
                <required-claims>
                    <claim name="scp" match="any">
                        <value>{{mcp-scope}}</value>
                    </claim>
                </required-claims>
            </validate-azure-ad-token>
        </inbound>
        <backend>
            <base />
        </backend>
        <outbound>
            <base />
        </outbound>
        <on-error>
            <base />
        </on-error>
    </policies>

The values in double braces are APIM named values: centralized constants, defined once and shared by every MCP server. They map directly to the four values produced by the Entra app registration in the example setup (entra-tenant-id, mcp-audience, mcp-scope, and mcp-client-id). Storing them as named values keeps the policy free of hardcoded identifiers and lets every server reuse the same configuration.

This gets you a server that nobody can call without a properly minted token. What it does not do is help a fresh client obtain that token in the first place. That is the next scenario.

Scenario 2: Driving an interactive sign-in from VS Code for an APIM-hosted MCP server

When you expose one of your own APIs as an MCP server, you usually want a developer to open VS Code, connect to the server, and be prompted to sign in with their Microsoft account. No pre-shared key, no manual token handling. APIM achieves this by behaving as a well-mannered OAuth 2.1 protected resource.

Using the Star Wars MCP server from the example setup, each selected operation becomes a tool the agent can call, so an agent can answer "which films featured the character named Leia" by calling the underlying API through APIM.

How the sign-in flow works

The protocol choreography is what turns a plain 401 into an interactive login. Two ingredients make this work: a 401 challenge that points to a metadata document, and the metadata document itself.
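To make the client side of that choreography concrete, here is a minimal Python sketch of the first thing an MCP client has to do with the 401: read the WWW-Authenticate header and pull out the resource_metadata URL defined by RFC 9728. The function name, the regular expression, and the contoso.azure-api.net host are illustrative assumptions; real MCP clients such as VS Code implement this internally and handle more header variants than this sketch does.

```python
import re

# Matches key="value" pairs in the challenge parameters.
_PARAM = re.compile(r'(\w+)="([^"]*)"')


def parse_bearer_challenge(header_value: str) -> dict:
    """Split a 'Bearer key="value", ...' challenge into a dict of its parameters."""
    scheme, _, rest = header_value.strip().partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    return dict(_PARAM.findall(rest))


# Hypothetical challenge, shaped like the one a PRM-aware protected resource returns.
challenge = (
    'Bearer resource_metadata="https://contoso.azure-api.net'
    '/.well-known/oauth-protected-resource/star-wars-mcp/mcp"'
)

params = parse_bearer_challenge(challenge)
metadata_url = params["resource_metadata"]
print(metadata_url)
```

From there the client would fetch the metadata URL, read the authorization server and scope it advertises, sign the user in, and retry the original request with the resulting bearer token.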
The challenge: a 401 that points the client to its metadata Instead of a bare 401, APIM returns a WWW-Authenticate header carrying the URL of the server's Protected Resource Metadata. This is what tells the client "you need a token, and here is where to learn how to get one." Keeping this logic in a shared policy fragment means every MCP server reuses it. Notice the mcpResourceMetadataUrl reference in the fragment below. It is not hardcoded; it is a context variable that each MCP server sets in its own server-level policy before including this fragment (you will see that wiring in the per-server policy later in this scenario). The fragment simply reads whatever value the calling server provided. This indirection is what keeps the fragment pluggable: the same shared challenge-and-validate logic serves every MCP server, while each server supplies its own PRM URL. In most deployments the PRM endpoint is a single, dynamic one (built in the next section) that derives the resource from the request path, so the variable just carries that server's path. But because the URL is configurable per server rather than baked into the fragment, you retain flexibility for the cases that need it. <fragment> <!-- No token: challenge with the per-server PRM URL set by the caller --> <choose> <when condition="@(!context.Request.Headers.ContainsKey("Authorization"))"> <return-response> <set-status code="401" reason="Unauthorized" /> <set-header name="WWW-Authenticate" exists-action="override"> <value>@("Bearer resource_metadata=\"" + (string)context.Variables.GetValueOrDefault("mcpResourceMetadataUrl", "") + "\"")</value> </set-header> </return-response> </when> </choose> <!-- Token present: validate against shared named values --> <validate-azure-ad-token tenant-id="{{entra-tenant-id}}" header-name="Authorization" failed-validation-httpcode="401" failed-validation-error-message="Unauthorized. 
Access token is missing or invalid."> <client-application-ids> <application-id>{{mcp-client-id}}</application-id> </client-application-ids> <audiences> <audience>{{mcp-audience}}</audience> </audiences> <required-claims> <claim name="scp" match="any"> <value>{{mcp-scope}}</value> </claim> </required-claims> </validate-azure-ad-token> </fragment> Creating the /.well-known PRM endpoint in APIM with a policy This is the part that often surprises people: APIM itself serves the metadata document. There is no separate identity service to stand up. You publish one small anonymous API at the service root that answers GET /.well-known/oauth-protected-resource/*, derives the resource value from the requested path, and returns a JSON document pointing at Microsoft Entra ID as the authorization server. Create a blank HTTP API named well-known with an empty API URL suffix so it resolves at the service root, add a GET operation with the template /.well-known/oauth-protected-resource/*, clear the subscription requirement so it is reachable anonymously, and apply this policy: <policies> <inbound> <base /> <!-- Build the resource URL from the requested PRM sub-path --> <set-variable name="resourceUrl" value="@{ var prefix = "/.well-known/oauth-protected-resource"; var path = context.Request.OriginalUrl.Path; var resourcePath = path.Length > prefix.Length ? 
path.Substring(prefix.Length) : ""; return "https://" + context.Request.OriginalUrl.Host + resourcePath; }" /> <return-response> <set-status code="200" reason="OK" /> <set-header name="Content-Type" exists-action="override"> <value>application/json</value> </set-header> <set-body>@{ return new JObject( new JProperty("resource", (string)context.Variables["resourceUrl"]), new JProperty("authorization_servers", new JArray( "https://login.microsoftonline.com/{{entra-tenant-id}}/v2.0")), new JProperty("scopes_supported", new JArray("{{mcp-prm-scope}}")), new JProperty("bearer_methods_supported", new JArray("header")) ).ToString(); }</set-body> </return-response> </inbound> <backend> <base /> </backend> <outbound> <base /> </outbound> <on-error> <base /> </on-error> </policies> The {{mcp-prm-scope}} named value populates the scopes_supported array of the metadata document. It tells the client which delegated scope to request when it goes to the authorization server, so it must be the fully qualified scope value: the token audience (the Application ID URI from the app registration) followed by the scope name. With the example values that is api://22222222-2222-2222-2222-222222222222/MCP.Access. In other words, it is the combination of the mcp-audience and mcp-scope values defined in the example setup. Named value Value to set Example mcp-prm-scope <mcp-audience>/<mcp-scope> api://22222222-2222-2222-2222-222222222222/MCP.Access [!NOTE] Keep mcp-prm-scope in sync with the scope the validation fragment requires. The PRM document advertises this scope so the client requests it, and validate-azure-ad-token then checks for it in the scp claim. A mismatch means the client obtains a token without the scope APIM expects, and validation fails. Because the policy builds the resource value from the request path, this single endpoint serves metadata for every MCP server you ever add. The Star Wars server, a future inventory server, and anything else all share it. 
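To make the path-to-resource derivation concrete, here is a rough Python rendering of what the policy computes. It is a sketch for reasoning about the output, not something APIM runs: `build_prm` is a hypothetical function name, and the all-zero tenant ID is a placeholder rather than a value from the example setup.

```python
import json

PREFIX = "/.well-known/oauth-protected-resource"


def build_prm(host: str, request_path: str, tenant_id: str, prm_scope: str) -> dict:
    """Mirror the policy: strip the well-known prefix, keep the rest as the resource path."""
    resource_path = request_path[len(PREFIX):] if len(request_path) > len(PREFIX) else ""
    return {
        "resource": f"https://{host}{resource_path}",
        "authorization_servers": [
            f"https://login.microsoftonline.com/{tenant_id}/v2.0"
        ],
        "scopes_supported": [prm_scope],
        "bearer_methods_supported": ["header"],
    }


doc = build_prm(
    host="apim-contoso-mcp.azure-api.net",
    request_path=PREFIX + "/star-wars-mcp/mcp",
    tenant_id="00000000-0000-0000-0000-000000000000",  # placeholder tenant ID
    prm_scope="api://22222222-2222-2222-2222-222222222222/MCP.Access",
)
print(json.dumps(doc, indent=2))
```

The `resource` value that comes out is the same URL the client connects to, which is the property the single shared endpoint relies on for every server.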
Wiring it onto the MCP server Each MCP server only needs to declare its own metadata URL and include the shared fragment: <policies> <inbound> <base /> <set-variable name="mcpResourceMetadataUrl" value="https://apim-contoso-mcp.azure-api.net/.well-known/oauth-protected-resource/star-wars-mcp/mcp" /> <include-fragment fragment-id="mcp-entra-auth" /> </inbound> <backend> <base /> </backend> <outbound> <base /> </outbound> <on-error> <base /> <include-fragment fragment-id="mcp-auth-challenge-onerror" /> </on-error> </policies> On the VS Code side, the configuration is deliberately plain. With no subscription-key header present, the client falls straight into the OAuth flow: { "servers": { "star-wars-mcp": { "url": "https://apim-contoso-mcp.azure-api.net/star-wars-mcp/mcp", "type": "http" } } } Restart the server in VS Code, and it detects the 401, reads the metadata, opens a browser sign-in, requests consent on first use, and then loads the tools using the user's token. [!CAUTION] Do not read the response body with context.Response.Body inside MCP server policies. It forces response buffering and breaks the MCP streaming transport. If global diagnostic logging is enabled, set the Frontend Response payload bytes to log to 0 at the All APIs scope. Scenario 3: Beyond tenant membership, authorize on a user attribute with app roles Validating a token confirms the caller is a signed-in user in your tenant with the right scope. That is often not enough. Some MCP servers expose sensitive tools that only a subset of users should reach. You want to express "this user is not only part of the tenant, but has a specific attribute that permits this server." Microsoft Entra app roles are the optimal mechanism for this. You declare a role on the MCP API app registration, assign it to specific users or to a security group, and Entra ID emits a roles claim in the access token whenever your API is the audience. APIM then authorizes on that claim. 
App roles beat the groups claim here because they avoid the group overage problem, they are scoped to the application, and they travel with the app.

Declaring and assigning the role

On the MCP API app registration, under App roles, create a role:

| Setting | Value |
| --- | --- |
| Display name | Privileged Access |
| Allowed member types | Users/Groups |
| Value | Privileged.Access |
| Description | Access to privileged MCP servers |

Then, on the matching enterprise application, under Users and groups, assign the users (or, better, a security group) to the Privileged Access role. The Value field is the exact string that lands in the token roles claim, so it cannot contain spaces.

[!TIP] Keep User assignment required set to No on the enterprise application. Unassigned users still obtain a valid token with the MCP.Access scope and keep access to the non-privileged servers. They simply do not carry the roles claim, so the privileged servers reject them.

Enforcing the claim in the per-server policy

The shared mcp-entra-auth fragment is used by every server, so the role requirement must not live there. Place the check in the privileged server's own policy, right after the fragment include. The token is already validated at that point, so this step is pure authorization. Because the caller is authenticated but not authorized, return 403, not 401, and do not emit a challenge: re-authenticating will not grant a role the user does not have.
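The 401-versus-403 split described above can be sketched in a few lines of Python. This is only a model of the decision, under the assumption that token validation has already produced a boolean and a claims dictionary; `gateway_status` is a hypothetical name, and 200 here simply stands for "forward to the backend".

```python
from typing import Any, Dict


def gateway_status(
    token_is_valid: bool,
    claims: Dict[str, Any],
    required_role: str = "Privileged.Access",
) -> int:
    """Model the gateway decision: 401 for authentication, 403 for authorization."""
    if not token_is_valid:
        # Not authenticated: a challenge makes sense, the client can sign in.
        return 401
    roles = claims.get("roles", [])
    if required_role not in roles:
        # Authenticated but lacking the role: signing in again changes nothing.
        return 403
    # Stand-in for "let the request through to the MCP backend".
    return 200


print(gateway_status(True, {"scp": "MCP.Access"}))
print(gateway_status(True, {"scp": "MCP.Access", "roles": ["Privileged.Access"]}))
```

A user without the role assignment carries no `roles` claim at all, so the lookup defaults to an empty list and the result is 403 rather than an error.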
<policies> <inbound> <base /> <set-variable name="mcpResourceMetadataUrl" value="https://apim-contoso-mcp.azure-api.net/.well-known/oauth-protected-resource/star-wars-mcp/mcp" /> <include-fragment fragment-id="mcp-entra-auth" /> <!-- Privileged guardrail: require the Privileged.Access app role --> <choose> <when condition="@(!context.Request.Headers.GetValueOrDefault("Authorization","").Replace("Bearer ","").AsJwt().Claims.GetValueOrDefault("roles", new string[0]).Contains("Privileged.Access"))"> <return-response> <set-status code="403" reason="Forbidden" /> <set-header name="Content-Type" exists-action="override"> <value>application/json</value> </set-header> <set-body>{"error":"forbidden","message":"You lack the Privileged.Access role required for this MCP server."}</set-body> </return-response> </when> </choose> </inbound> <backend> <base /> </backend> <outbound> <base /> </outbound> <on-error> <base /> <include-fragment fragment-id="mcp-auth-challenge-onerror" /> </on-error> </policies> One operational detail worth calling out: app-role assignments only appear in newly issued tokens. A user who is granted the role after they signed in must obtain a fresh token. In VS Code, run MCP: Reset Cached Tokens (or sign out of the Microsoft account from the Accounts menu), then restart the server and sign in again. You can confirm the result by pasting the access token into https://jwt.ms and checking for "roles": ["Privileged.Access"]. Scenario 4: Fronting an existing external MCP server that drives its own sign-in So far APIM has been the authorization resource. But many valuable MCP servers already exist and run their own identity. GitHub publishes a remote MCP server with dozens of tools, and it authenticates users against GitHub's own OAuth authorization server. You do not want to re-implement that. You want APIM to govern access (rate limits, IP rules, logging, a single managed endpoint) while letting the upstream own the login. 
This is the "expose an existing MCP server" passthrough mode. When you register GitHub's remote MCP server behind APIM, the gateway relays the upstream's own authorization challenge. The client never authenticates against Entra here. It authenticates directly against GitHub. The flow, confirmed by probing the gateway: A call to the APIM endpoint with no token returns GitHub's own 401 with a WWW-Authenticate header, relayed through APIM. The Protected Resource Metadata that GitHub serves advertises authorization_servers: ["https://github.com/login/oauth"], so the client knows to log in at GitHub. The PRM resource reflects the APIM host, because GitHub builds it from the forwarded Host header. The client trusts the APIM endpoint while still logging in at GitHub. VS Code completes the GitHub sign-in and the full tool catalog loads. In the proof of concept this surfaced all 47 GitHub tools through the single APIM endpoint. The client configuration is again just a URL pointing at APIM: { "servers": { "github-via-apim": { "url": "https://apim-contoso-mcp.azure-api.net/github-mcp/mcp", "type": "http" } } } The key insight is that APIM transparently relays the backend's authentication challenge. GitHub remains the authorization server, GitHub tolerates being fronted by APIM, and you get a governed, centrally managed entry point without owning the identity flow. [!NOTE] Passthrough only relays what the upstream advertises. If the backend's PRM resource value and the actual MCP transport endpoint differ by a path segment, some clients fall back to deriving the metadata location from the server URL and can miss it. When you onboard a custom self-authenticating server, verify that the resource it advertises matches the exact URL the client connects to. Scenario 5: Restricting which tools of an existing MCP server an agent may call Passthrough raises a governance question that token validation alone cannot answer. 
A developer may legitimately have permission to merge a pull request through GitHub, but you may not want their AI agent to perform that action autonomously. You want to allow the read and discovery tools while blocking the destructive write tools, at the gateway, regardless of what the client tries. What is and is not possible for an external server It is important to be precise here, because the capability differs from the REST-as-MCP mode: For a REST-API-exposed-as-MCP server, you pick which operations become tools at creation time. That is native tool selection and the cleanest possible filter. For an existing/external MCP server, APIM does not enumerate the upstream's tools. The portal Tools blade explicitly states that tools are not visible for external MCP servers, and there is no allow-list property for them. APIM also cannot safely rewrite the tools/list response, because reading the response body breaks the streaming transport and the list may arrive as text/event-stream. What APIM can do reliably, and server-agnostically, is block the invocation. Every tool call arrives as a JSON-RPC tools/call request in the request body, which APIM can inspect safely. The deny-listed tools remain visible in the catalog, but any attempt to invoke one is intercepted at the gateway and returned a JSON-RPC error before it ever reaches the upstream. The reusable deny-list fragment The block is driven by a per-server named value (a comma-separated list of tool names), so the same fragment governs every external server. Only the named value changes. <!-- Fragment: mcp-tool-filter (include after the auth fragment) --> <fragment> <choose> <when condition="@(context.Request.Body != null)"> <set-variable name="mcpMethod" value="@{ try { var body = context.Request.Body.As<JObject>(preserveContent: true); return (string)body?["method"] ?? 
string.Empty; } catch { return string.Empty; } }" /> <choose> <when condition="@(((string)context.Variables["mcpMethod"]).Equals("tools/call", StringComparison.OrdinalIgnoreCase))"> <set-variable name="mcpToolName" value="@{ var body = context.Request.Body.As<JObject>(preserveContent: true); return (string)body?["params"]?["name"] ?? string.Empty; }" /> <!-- mcpBlockedTools is a comma-separated deny-list set by the per-server policy before this include --> <set-variable name="mcpBlocked" value="@{ var tool = ((string)context.Variables["mcpToolName"]).Trim().ToLowerInvariant(); var deny = ((string)context.Variables.GetValueOrDefault("mcpBlockedTools", "")).ToLowerInvariant().Split(',').Select(t => t.Trim()); return deny.Contains(tool); }" /> <choose> <when condition="@((bool)context.Variables["mcpBlocked"])"> <return-response> <set-status code="200" reason="OK" /> <set-header name="Content-Type" exists-action="override"> <value>application/json</value> </set-header> <set-body>@{ var id = "null"; try { var body = context.Request.Body.As<JObject>(preserveContent: true); id = body?["id"]?.ToString(Newtonsoft.Json.Formatting.None) ?? "null"; } catch {} return "{\"jsonrpc\":\"2.0\",\"id\":" + id + ",\"error\":{\"code\":-32602,\"message\":\"Unknown tool: " + ((string)context.Variables["mcpToolName"]) + "\"}}"; }</set-body> </return-response> </when> </choose> </when> </choose> </when> </choose> </fragment> The deny-list itself lives in a named value, one per server: APIM named value. Comma-separated, case-insensitive. 
mcp-blocked-tools-github = merge_pull_request,create_repository,delete_repository,push_files,create_or_update_file,issue_write,label_write
# Generic per-server pattern:
mcp-blocked-tools-<server> = <comma,separated,tool,names>

Wiring it onto the GitHub passthrough server

<policies>
    <inbound>
        <base />
        <set-variable name="mcpResourceMetadataUrl" value="https://apim-contoso-mcp.azure-api.net/.well-known/oauth-protected-resource/github-mcp/mcp" />
        <include-fragment fragment-id="mcp-entra-auth" />
        <set-variable name="mcpBlockedTools" value="{{mcp-blocked-tools-github}}" />
        <include-fragment fragment-id="mcp-tool-filter" />
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
        <include-fragment fragment-id="mcp-auth-challenge-onerror" />
    </on-error>
</policies>

Now when the agent tries to merge a pull request, the gateway returns a clean -32602 Unknown tool error and the upstream is never touched. Read and discovery tools continue to work. The tool still appears in the client's catalog. Adding governance for another external server is just one more named value plus the same fragment include. No new policy logic.

Key takeaways

- API Management turns MCP servers into governed resources, applying the same identity, traffic, and observability controls you already use for APIs.
- Start simple with validate-azure-ad-token to gate access, then graduate to a full interactive sign-in by serving Protected Resource Metadata from a single APIM policy.
- You can publish multiple MCP servers from one underlying API, for example a read-only server and a read-write server, by selecting different operations.
- App roles let you authorize on a user attribute, not just tenant membership, and the check belongs in the per-server policy so shared logic stays clean.
- For existing external servers, APIM relays the upstream's own OAuth flow, so a server like GitHub keeps owning its identity while you keep central governance.
- When an external server's full tool surface is too broad, APIM can block specific tool invocations at the gateway with a reusable, named-value-driven policy, so a user's agent cannot perform actions the user could perform manually.

References

- About MCP servers in Azure API Management
- Secure access to MCP servers in API Management
- Expose REST API in API Management as an MCP server
- Expose and govern an existing MCP server
- validate-azure-ad-token policy reference
- Policy fragments in API Management
- RFC 9728: OAuth 2.0 Protected Resource Metadata
- MCP authorization specification
- Star Wars API (example backend)
- MCP for Beginners

Deploy an Azure Landing Zone in About Twelve Minutes with the ALZ IaC Accelerator
Hello Folks! Welcome back to my coverage of the Microsoft Azure Infra Summit 2026. This session is one I have been looking forward to, because if you have ever stood up an Azure Landing Zone (ALZ) by hand, you know it can eat weeks. Management groups, policy assignments, Hub-and-Spoke networking, log analytics, Defender for Cloud, identities, pipelines, governed branches. There is a lot of plumbing. In this session Jack Tracey (he leads the Azure Landing Zones team) and Jared Holgate (tech lead on Azure Landing Zones and Azure Verified Modules) walk through the ALZ Infrastructure as Code Accelerator. Then they actually run it, and a bootstrap that used to be a multi-week journey wraps up in about twelve minutes of typing and ticking boxes.

Why IT Pros Should Care

If you are the person who has to deliver a secure, governed Azure platform before your dev teams can land their first workload, this matters to you. Here is the short version of why:

- It bakes in the Cloud Adoption Framework “start right, stay right” pattern so you do not have to invent it.
- It supports both Bicep and Terraform, and it bootstraps GitHub or Azure DevOps for you (with a local file system option for GitLab, Bitbucket, or whatever else you run).
- It covers roughly 80% of common customer scenarios out of the box. You do not have to write modules from scratch.
- It is open source, every module is published, and you can fork or compose as you see fit.
- It is now built entirely on Azure Verified Modules (AVM), so what you deploy is aligned with the Well-Architected Framework by default.

In short, if you have been hand-crafting management group hierarchies and policy assignments in the portal, stop. There is a better way, and the team that designs ALZ ships it as code you can actually read.

What is the ALZ IaC Accelerator

A quick recap, because it is worth getting the vocabulary right. The Azure Landing Zone lives inside the CAF Ready methodology.
It is the shared platform (networking, identity, logging, policy, management groups) that supports the many application landing zones your workload teams consume. Jack uses a great analogy in the session: think of a metropolis. Before residents and businesses can move in, you need water, gas, electricity, and roads. The platform landing zone is the utilities layer. The application landing zones are the buildings. The ALZ IaC Accelerator is the tooling that deploys and manages that platform layer using declarative infrastructure as code. It is composed of:

- A set of IaC modules in Bicep and Terraform (all of them built on AVM).
- A bootstrap layer for GitHub or Azure DevOps (or local file system).
- The ALZ PowerShell module, published to the PowerShell Gallery, which orchestrates everything.
- Comprehensive docs covering prereqs, scenarios, and options.

The accelerator is a Microsoft-supported, open source path to a production-grade landing zone. You should look at it before you decide to roll your own.

How it works

The accelerator runs in four phases. Jared walks through each of them in the demo.

Phase 0: Plan. You make decisions: Bicep or Terraform, GitHub or Azure DevOps, single or multi-region, Hub-and-Spoke or Virtual WAN, Azure Firewall or NVA, DDoS on or off, and so on.

Phase 1: Prereqs. Before the accelerator runs, you need two things in place: an identity to run the bootstrap, and the platform subscriptions. Traditionally that meant four subscriptions (connectivity, identity, management, security). There is now a new lighter option that needs only two subscriptions for smaller environments.

Phase 2: Bootstrap. This is where the magic happens. You feed it a bootstrap configuration file plus a platform landing zone configuration file, then run the Deploy-Accelerator command.
The PowerShell module deploys identities, optional Terraform state storage with private networking, optional self-hosted container-instance runners, and then sets up your repositories, pipelines, environments, governed pipeline templates, and OIDC-based service connections using Workload Identity Federation. No manual steps after Phase 2.

Phase 3: Deploy. Run the CD pipeline. The platform landing zone deploys. Done.

A few things worth highlighting about the bootstrap:

- The accelerator deploys two identities: one with read-only access for plan / what-if, one with write access for apply / deploy. Least privilege, out of the box.
- Pipelines are governed. The actual deployment pipeline lives in a separate template repository, so changes to it require an approval.
- A CI pipeline runs on pull requests automatically. You get the engineering hygiene without configuring it.

Real-world scenarios and when to use it

Jared calls these “scenarios” and “options”. They are the difference between picking a starting pattern (scenario) and tuning it (options).

Scenarios. There are 11 of them out of the box. Pick the one that matches your starting state:

- Single region, Hub-and-Spoke, Azure Firewall.
- Multi-region, Hub-and-Spoke, Azure Firewall.
- Single or multi-region with Virtual WAN.
- Single or multi-region with a third-party NVA.
- No-connectivity (governance only, no Hub networking) for organizations who are not ready for centralized networking yet.
- New scenarios 10 and 11, which are cost-optimized for small and medium businesses with around 10 workloads. Same modules, same orchestration, just a smaller, cheaper starting shape.
- Sovereign landing zone for customers with data sovereignty and confidential compute requirements.

Options. Once you pick a scenario, you can tune it.
The 16 documented options are the ones the team sees customers ask about most often: customizing resource names, customizing management group names, turning the DDoS protection plan on or off, choosing the sovereign baseline, and more. Behind those, Terraform alone exposes hundreds of variables.

Honest tradeoffs (because Pierre always tells you the rough edges):

- OpenTofu is not supported today. Just Bicep and Terraform.
- Personal Access Tokens are still required for Azure DevOps and self-hosted agents at the time of the session. The team has confirmed CLI / managed identity support is on the roadmap.
- Brownfield is “it depends”. The accelerator is greenfield-friendly. Retrofitting an existing tenant is possible but is going to depend on your current state and your risk appetite.
- You still own decisions. The Lady Justice slide in the session is a great reminder: balancing dev team freedom with central governance is your job. The accelerator gives you the controls; it does not pick your policy posture for you.

Getting Started

If you want to try this without waiting, here is the path Jared actually demoed:

- Install the ALZ PowerShell module from the PowerShell Gallery.
- Create your platform subscriptions (two minimum, four for the classic layout) and an identity for the bootstrap.
- Run Deploy-Accelerator with no parameters. It will prompt you interactively for everything: region, parent management group, subscriptions, naming convention, self-hosted agents yes or no, private networking yes or no, PAT, project name, and approvers.
- Review the two generated configuration files: the bootstrap config and the platform landing zone tfvars (or Bicep params).
- Confirm. The bootstrap runs Terraform behind the scenes and wires up Azure plus your repos.
- Run the CD pipeline. Approve at the apply stage. Your platform deploys.
If you are not ready to drive Terraform directly, the Azure Migrate AI agent (in preview) wraps the exact same accelerator codebase behind a guided chat experience. You answer questions, it produces a zip with the same two config files plus a design document explaining the decisions it made. Then you hand that off to the same pipeline. The Azure MCP server has matching tooling for VS Code, so day-two changes like “turn off the DDoS protection plan” know to also uncomment the dependent policy assignments in the archetype files. That is the kind of context-aware editing that saves you from breaking your own deployment.

Resources

- Azure Landing Zone in the Cloud Adoption Framework
- ALZ Accelerator hub (entry point for docs, scenarios, options)
- ALZ Terraform Accelerator on GitHub
- ALZ-Bicep on GitHub
- Azure Landing Zones Library (policies and archetypes)
- Azure Verified Modules
- Raise issues or feedback for the ALZ team

Watch the rest of the Summit

If you found this useful, the full Microsoft Azure Infra Summit 2026 playlist has a lot more: deployment stacks, Bicep beyond the basics, IaC CI/CD best practices, AVM with GitHub Copilot, and plenty of AKS and storage sessions. Grab the playlist here: Microsoft Azure Infra Summit 2026 on YouTube. Hit up the ALZ team in the comments on the session, or open an issue on the repo. The team is genuinely active there.

Cheers!
Pierre Roman

Building an Azure architecture that’s ready for every signature
At Exclaimer, we help organizations manage email signatures at scale, so every message can carry a consistent, compliant, on-brand signature without IT teams manually updating thousands of mailboxes. This is more difficult than it may seem, especially when you're doing it for more than 80,000 customers, around 9.6 million seats, and more than 21 billion emails a year. Every signature must show up in the right place, with the right details, for the right sender, recipient, device, and business rule. Behind that are constantly changing employee records, customer-specific policies, email chains, recipient lists, regional disclaimers, and brand requirements. Because our platform sits directly in the email flow, availability is critical. And because many of our customers operate in regulated industries, they also need confidence that data stays in-region and configured signatures are applied consistently. To support that level of scale and reliability, we’ve spent the last several years evolving our architecture on Microsoft Azure. Today, Azure Kubernetes Service (AKS), Azure SQL Database, Azure Database for PostgreSQL, Azure Cosmos DB, Azure Data Explorer, and Azure Databricks help us run a global platform that’s more responsive, more resilient, and more cost-efficient.

Reading the signs that our architecture needed to change

In the beginning, our cloud product ran more like a multi-server, on-premises product hosted on Azure Virtual Machines (VMs). The platform was split into a smaller number of core services, and the team relied heavily on VM-based infrastructure to keep those services running. As Exclaimer grew, our architecture had to keep pace with higher volumes, more regions, and more complex customer requirements. Regional demand shifted throughout the day, but scaling infrastructure up and down still relied on scripts, pre-baked VMs, and operational coordination. That created more risk during maintenance and failover.
We run parallel data centers in regional pairs so we can move traffic away from one site when needed. But when traffic moves, the receiving environment has to be ready to handle the full load. In the VM world, that meant someone or something had to remember to scale up standby resources at the right moment. At the same time, our product was becoming more service-oriented. We were moving away from a smaller set of larger services toward well over 100 microservices. Every new service created more conversations about VM sizing, images, patching, and operational overhead. It was time for a model that could scale faster, run more efficiently, and reduce the amount of infrastructure work required to ship and operate the product.

Signing on to AKS for faster, more efficient scaling

By moving many workloads to Linux containers on AKS, we gained a smaller footprint, faster startup times, and a more consistent way to package and deploy services. AKS also gave us a managed Kubernetes foundation for running those containers at global scale, with autoscaling capabilities that better matched our traffic patterns. With Horizontal Pod Autoscaler, services can react to load in seconds rather than minutes. With Cluster Autoscaler, we can add or remove node capacity based on what the platform actually needs. That means we can pack workloads onto nodes more efficiently, scale down during quiet periods, and scale up quickly when demand returns. The operational difference is just as important. During an incident, maintenance event, or regional failover, our teams have fewer manual steps to think about. If traffic shifts, the platform can scale with it. That takes away one more thing for engineers to worry about when they should be focused on keeping the customer experience steady. The move to containers and a more streamlined CI/CD workflow also improved our deployment cadence by making it easier to build, test, and deploy changes across the platform.
In 2021, we deployed 285 changes, features, and fixes to production over the course of the entire year. Today, we deploy that many every few days. Cost has improved, too. Since 2024, when the bulk of our migration to containerized services took place, we’ve reduced our average cost per user by about 39 percent, even as the product has grown more complex and we’ve added more capabilities for customers. We achieved that through a combination of containerized architecture, AKS autoscaling, and expanded reservations across compute and storage technologies.

Choosing the right database for the right kind of data

We started with a strong Microsoft SQL Server foundation, and Azure SQL Database remains core to our platform today. It stores critical customer configuration data and continues to give us the reliability, replication, resizing flexibility, and regional scale we need. But not every workload belongs in the same database. Customer configuration, relational service data, key-value storage, usage events, and business intelligence (BI) all have different access patterns. That principle led us to Azure Database for PostgreSQL flexible server for one of our most important migrations. We had used Azure Table storage for a core service that needed to retrieve customer data quickly. It was cost-effective and stable for a long time, but as the product evolved, the data became more relational, and we found ourselves adding complexity in application code that a relational database could handle more naturally. Azure Database for PostgreSQL gave us that relational model with low management overhead, fast read replicas, reserved instances for predictable workloads, and a path to future scale. After the migration, average request time for a critical service dropped from 18.6 milliseconds to 1.79 milliseconds. That’s a 90 percent improvement across a service that handles around 9 billion requests each month.
Azure Cosmos DB plays a different role, supporting key-value and document storage where we need scale, availability, low latency, encryption at rest, and straightforward dev/test support. Optimized for unstructured data and high-performance reads and writes, it gives us a highly scalable foundation for workloads that don't fit a traditional relational model. We use it to store customer assets for signatures and video branding, high-volume metadata for internal message-processing operations, audit events that help customers track account changes, and tokens used to collect data from third-party systems on behalf of customers. It also gives us a clean way to keep data and services aligned. Azure Data Explorer solved another scaling challenge: usage and billing data. We need to be able to audit the number of messages we process for our customers so we can bill accurately, and at more than 20 billion emails a year, our previous SQL-based usage pipeline became difficult to manage. With Azure Data Explorer, we can ingest massive volumes of event data at low storage cost, connect to Azure Event Hubs, and avoid maintaining custom plumbing. That move reduced the cost of the system by around 70 percent. Azure Databricks rounds out the picture as our BI and data platform, giving our teams a shared foundation for transformations, analysis, and reporting across product and business data.

Keeping every region ready for business

Our customers are everywhere, so our platform has to be, too. Exclaimer runs in seven distinct geographic locations: Australia, Canada, Europe, Germany, the United Arab Emirates, the United Kingdom, and the United States. That global footprint helps us meet customer expectations around availability and data residency. Many organizations want their data to stay in-region, and Azure gives us the coverage we need to support that. Availability is especially important because our platform is part of a live communication flow.
When someone sends an email, they expect it to keep moving. Our Azure architecture helps us support that expectation across the stack. AKS lets compute scale with regional demand. Azure SQL and Azure Database for PostgreSQL support critical relational workloads. Azure Cosmos DB gives us scalable, low-latency storage for document and key-value patterns. Azure Data Explorer handles very high-volume usage ingestion without the complexity of our former custom pipeline. Across the board, these managed Azure services reduce the amount of operational work our engineers have to carry. We can spend less time maintaining the basics and more time tuning performance, improving stability, and building the capabilities our customers need next. Building for the future on a stronger foundation The biggest sign that our architecture is working may be how little we have to reinvent when we build something new. As we develop upcoming product capabilities, we already have many of the foundational pieces in place: AKS for compute, Azure Cosmos DB for state, and Azure Service Bus for messaging. We also have Azure SQL for core data, Azure Database for PostgreSQL where relational service data needs room to scale, Azure Data Explorer for high-volume event analysis, and Azure Databricks for BI tooling. Together, these services make our platform faster, more efficient, and more resilient. Email signatures may look simple on the surface. Behind every one, there’s a set of decisions about performance, scale, data, availability, and trust. With Azure, we’ve built an architecture that helps us keep every signature moving, wherever our customers do business. About the authors Phil Vetter started in engineering at Exclaimer as a developer at the start of 2013, and now sits at the helm as VP of Engineering. 
Lee Jones started at Exclaimer in 2013 in the IT department, and now serves as Director of Platform Engineering, managing the infrastructure and resilience of Exclaimer Cloud.

From Prompt to Provisioned: A Closer Look at the Azure Deployment Agent
Hello Folks! If you sat through this session during the Microsoft Azure Infra Summit 2026, you already know that Anand Guruswami and Arun Rabindar from the Cloud Native Experiences team showed us something I have been waiting to see for a while. An AI agent that does not just spit out a Terraform file from a vague prompt, but actually thinks about your workload, talks to you about it, and then hands you something you can put in front of a pull request reviewer without holding your nose. This is the Azure Deployment Agent, and at the time of broadcast it was still in preview inside Azure Copilot, with the same brains shipping as an open source skill you can plug into GitHub Copilot, Claude Code, Cursor, or whatever your team uses. In this post I want to break down what they showed, why it matters for IT pros, and how you can get hands on with it. 📺 Watch the session: Why IT Pros Should Care Let us be honest about the day to day. Most of the time we are not building a brand new workload from a blank canvas. We are stitching resources together one at a time, copying patterns from a previous project, hunting down the right SKU, checking quotas, then arguing with policy on the way out the door. Different admins do it different ways, and that inconsistency is where risk lives. Here is what the Deployment Agent changes for us: It moves the conversation up a level, from “which resource do I click” to “what am I actually trying to build.” It grounds the architecture in the Azure Well-Architected Framework, so the output is not a generic LLM guess, it has reasoning behind it. It separates the plan from the code, so you and your team get to review architecture before any Terraform or Bicep gets written. It plugs into the tools we already use. Azure portal for the guided path, GitHub Copilot and Claude Code for the power user path. In short, it's about taking the boring repetitive parts off our plate so we can focus on the parts that need human judgment. 
What is the Azure Deployment Agent
The Deployment Agent is a capability inside the Agents (preview) experience in Azure Copilot. Think of it as a virtual cloud solution architect that lives in your Copilot chat. You describe the workload in natural language, and it walks you through a multi-step process to land on a production-ready deployment.
A few things that stood out from Anand’s portion of the session:
It supports multi-turn conversation. You can clarify scale, security posture, resilience, SKU preferences, and region constraints, and the agent will fold those into the plan.
It produces a human-readable infrastructure plan first, complete with trade-offs and the reasoning for each resource choice, before it ever writes infrastructure as code.
Today it generates Terraform inside the portal, with Bicep support landing in the portal experience shortly. In the GitHub Copilot flow you can already pick Bicep or Terraform.
Once the plan is approved, you get a real artifact. You can open it in VS Code for the Web, or have Copilot open a pull request straight into your GitHub repo.
The deployment itself still goes through Azure Resource Manager. That is important. Your tenant policies, RBAC, naming conventions, and existing guardrails all still apply. The agent is not bypassing your governance; it is generating code that flows through it.
How it Works
Arun did a great job pulling back the curtain on the internals. The agent follows a five-step pattern that gives you control at every checkpoint.
Intent capture. The agent takes your prompt and clarifies the scope, the constraints, and what success looks like. No guessing, no jumping straight to YAML.
Plan generation. It produces a structured infrastructure plan with inputs, sub-goals, a full resource list, configurations, SKUs, and a per-resource reasoning section.
Validation in a loop.
The plan runs through evaluators backed by the Well-Architected Framework pillars (reliability, security, cost, operational excellence, performance efficiency). If something fails, the agent regenerates and tries again until the plan is solid. Human review. The plan is presented to you in plain language. You can iterate. You can say “prioritize West US 2,” or “swap that SKU,” and the agent will update the plan in place. Code generation. Only after you approve the plan does the agent emit Terraform or Bicep. The generated code goes through syntactic validation as well, again in a loop, so it actually parses and is ready to apply. Under the hood in the GitHub Copilot and Claude Code path, the team has decomposed all of this into an open source skill (the Azure Enterprise Infrastructure Planner) plus the Azure Well-Architected Framework as an MCP tool. The base agent in your editor picks up the skill, runs the phases, calls the MCP tool to ground the output, and then writes the IaC. Same workflow, different host. When to Use it / Real-World Scenarios This is not just a toy for greenfield demos. A few places where I see this paying real dividends: New workload bootstrapping. A team needs a web app, SQL backend, secrets in Key Vault, monitoring, and a sane region strategy. Instead of three days of clicking and copy pasting, you describe it and review the plan. CSV ingestion to SQL automation. The Claude Code demo Arun ran was exactly this. CSV lands, gets processed, rows update in SQL. The agent picked sensible resources, justified each one, and produced Bicep ready to commit. Standardizing across teams. Different admins ending up with different shapes for the same workload is the silent killer of operational consistency. A shared agent with a shared planner skill drags everyone toward the same Well-Architected baseline. Skill leverage for smaller teams. Not every team has a deep Azure architect on staff. 
The agent encodes a lot of that experience and surfaces it as conversation. Open source customization. Because the skill and MCP tooling are open, platform teams in regulated environments can fork it, add their policy context, their tagging rules, their naming conventions, and ship a tuned version internally. One honest tradeoff. Right now the agent is greenfield first. The team is actively working on brownfield scenarios, pulling insights from existing workloads and referencing existing resources. If you live entirely in a complex existing estate, expect the experience to keep getting better over the next couple of releases. Getting Started If you want to try it this week, here is the short list: Ask your Azure tenant administrator to enable Agents (preview) in Azure Copilot. The toggle lives in the Azure Copilot admin center, and without it you will not see agent mode in chat. In the Azure portal, open Copilot, expand to full screen, and switch on Agent mode at the bottom of the chat panel. Describe a workload in plain language. Be specific about region, scale expectations, and any compliance constraints you care about. Review the generated plan before approving. Look at the trade offs section, that is where the agent shows its work. For the editor path, install the open source Azure Skills plugin from the microsoft/azure-skills repo, point your IDE at the Azure MCP Server, and run the same workflow inside GitHub Copilot or Claude Code. Send feedback. The team is shipping fast and the roadmap (brownfield support, reference workloads, scoped agent permissions, richer architecture diagrams) is shaped by what you tell them. 
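To make the plan, validate, review, and generate flow from How it Works a bit more concrete, here is a minimal Python sketch of the control flow. It is not the agent's actual implementation: every name in it (Plan, security_check, reliability_check, regenerate, validate_in_loop, generate_code) is hypothetical, the two checks are toy stand-ins for the Well-Architected evaluators, and the cap on regeneration rounds is an assumption added so the loop always terminates.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Plan:
    """Hypothetical infrastructure plan: resource name -> settings."""
    resources: dict[str, dict]
    revision: int = 0


# An evaluator returns a list of findings; an empty list means the plan passes.
Evaluator = Callable[[Plan], list[str]]


def security_check(plan: Plan) -> list[str]:
    """Toy stand-in for a security-pillar evaluator."""
    return [f"{name}: public access is enabled"
            for name, cfg in plan.resources.items()
            if cfg.get("public_access", False)]


def reliability_check(plan: Plan) -> list[str]:
    """Toy stand-in for a reliability-pillar evaluator."""
    return [f"{name}: fewer than two zones"
            for name, cfg in plan.resources.items()
            if cfg.get("zones", 1) < 2]


def regenerate(plan: Plan, findings: list[str]) -> Plan:
    """Stand-in for regeneration. A real agent would use the findings;
    this sketch simply applies two fixed remediations to every resource."""
    fixed = {name: {**cfg, "public_access": False,
                    "zones": max(cfg.get("zones", 1), 2)}
             for name, cfg in plan.resources.items()}
    return Plan(fixed, plan.revision + 1)


def validate_in_loop(plan: Plan, evaluators: list[Evaluator],
                     max_rounds: int = 3) -> Plan:
    """Evaluate the plan, regenerating on failure, up to max_rounds times."""
    for attempt in range(max_rounds + 1):
        findings = [f for evaluate in evaluators for f in evaluate(plan)]
        if not findings:
            return plan
        if attempt < max_rounds:
            plan = regenerate(plan, findings)
    raise RuntimeError(f"plan still failing after {max_rounds} regenerations")


def generate_code(plan: Plan, approved: bool) -> str:
    """Emit placeholder text (not real Terraform or Bicep), only after approval."""
    if not approved:
        raise PermissionError("plan must be approved before code generation")
    return "\n".join(f"# resource {name}: {cfg}"
                     for name, cfg in sorted(plan.resources.items()))


draft = Plan({"web": {"public_access": True, "zones": 1}, "sql": {"zones": 2}})
validated = validate_in_loop(draft, [security_check, reliability_check])
print(f"Plan passed validation at revision {validated.revision}")
print(generate_code(validated, approved=True))
```

The part worth noticing is the gate in generate_code: nothing is emitted until a human has approved the validated plan, which mirrors the separation of plan and code described in the session.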
Resources
Deployment agent capabilities in Agents (preview) in Azure Copilot: https://learn.microsoft.com/en-us/azure/copilot/deployment-agent
microsoft/azure-skills, the open source skill plugin shown in the session: https://github.com/microsoft/azure-skills
Azure MCP Server on the GitHub MCP Registry: https://github.com/mcp/com.microsoft/azure
Azure MCP Server tools for the Well-Architected Framework: https://learn.microsoft.com/en-us/azure/developer/azure-mcp-server/tools/azure-well-architected-framework
Azure Well-Architected Framework documentation: https://learn.microsoft.com/en-us/azure/well-architected/
Agents (preview) in Azure Copilot overview: https://learn.microsoft.com/en-us/azure/copilot/agents-preview
Watch the rest of the Summit
If you enjoyed this session, the full Microsoft Azure Infra Summit 2026 playlist is up on YouTube. Sessions on Deployment Stacks, the SRE Agent, Azure Local, AKS networking, and a lot more are all in there. Bookmark this one and share it with your team: https://aka.ms/MAIS/2026-Playlist
Drop your questions, your war stories, and your wish list for the Deployment Agent in the comments. I read them, the product team reads them, and your scenarios are exactly what shapes the next preview drop. What would you build with it first?
Cheers! Pierre Roman

Build a Sovereign Private Cloud with Azure Local
Hello Folks! Picture this. A regulator hands you a one-pager that says, in essence, “this data does not leave the building.” Or your link to Azure decides to take a nap during a critical batch run. Or you are standing up infrastructure in a remote site where connectivity is a coin flip on a good day. For a long time, our answer to that conversation was a stack of Azure Stack boxes plus a lot of wishful thinking. That story has changed, and it has changed quite a bit. At Microsoft Azure Infra Summit 2026, Thomas Maurer (Global Black Belt for Sovereign Cloud) walked us through what is now called the Microsoft Sovereign Private Cloud, with Azure Local as its foundation. In this post, I want to unpack the session for the ITPros in the room, the folks who have to actually run this stuff on Monday morning. Let us dig in. 📺 Watch the session: Why IT Pros Should Care Sovereignty is no longer a niche conversation. Thomas was very clear that there is no one-size-fits-all answer, and that is exactly why this matters to us as operators. The drivers landing on our desks now include: Regulatory requirements that demand data residency or full operator isolation. Sovereign AI workloads where the model and the data both need to stay in-country. Disconnected and air-gapped sites by design (think defense, manufacturing floors, retail backrooms, ships, mines). Business continuity, meaning a workable Plan B if the public cloud is unreachable for hours or days. Latency-sensitive workloads where the round trip to a region is just too slow. If you build or operate infrastructure that touches any of those bullets, Azure Local is now a first-class option, not a sidecar. And it gets you a cloud-consistent control plane on top of hardware you can put your hands on. What is Azure Local and the Sovereign Private Cloud Let us level-set on the stack, from the metal up. Hardware. Validated and certified through the Azure Local solution catalog, delivered by the OEMs you already buy from. 
Form factors range from single-node edge boxes up to multi-rack deployments. There is a Premier tier with extra testing, packaged firmware and driver updates, and AI-ready GPU configurations done with NVIDIA. Software-defined data center. Compute, storage, networking, and high availability. As of April 2026, supported SAN storage is GA alongside the existing hyperconverged storage spaces direct model. That gets you up to 64 nodes in disaggregated mode and 16 nodes in hyperconverged mode per instance. Workload plane. Linux and Windows VMs, custom images, your own Kubernetes distribution, or AKS enabled by Arc with the same management experience you have in Azure today. Arc-enabled control plane. This is where Azure Local stops being “another on-prem stack” and starts feeling like Azure. Defender, Azure Monitor, Azure Update Manager, Policy, RBAC, Resource Manager, all of it surfaces against your on-prem instance. Disconnected operations. Microsoft packaged a subset of the control plane (portal, Resource Manager, key management services) into an appliance you deploy on-premises. Connect your Azure Local infrastructure to the local appliance instead of public Azure, and you have a fully air-gapped deployment with a familiar API surface. On top of that base, the Sovereign Private Cloud bundles workloads you can run locally: Foundry Local for AI inferencing, Microsoft 365 Local (Exchange Server, SharePoint Server, Skype for Business Server) for productivity fallback, Azure Virtual Desktop on Azure Local for VDI, and GitHub Enterprise Local (in private preview at the time of the session) for source and CI/CD. How it works in production In the demo, Thomas drove the whole show from the Azure Arc Center in the Azure portal. A few things stood out for me as someone who has spent too many late nights patching clusters. One pane, many sites. The overview page rolls up every Azure Local instance you own. 
Thomas mentioned customers running thousands of these things, and the Azure Local Lens workbook in Azure Monitor is built to manage at that scale. Resources feel like Azure resources. An instance, a node, a VM, an AKS cluster, they all live inside Azure Resource Manager. RBAC, activity logs, tags, ARM templates, everything you expect. Update is a single button. The Solution Builder Extension packages OS, management software, drivers, and firmware into one validated update. You hit “update,” it orchestrates live migrations node by node, and it blocks the operation if something is not ready. No more cherry-picking driver bundles at 2 AM. Security defaults are real. BitLocker on OS and data volumes, SMB signing, App Control on the hypervisor hosts, drift detection that flags configuration changes back to the portal. Resiliency is layered. Storage spaces direct two-way or three-way mirroring, rack-aware clustering, live migration for maintenance, and Azure Site Recovery for site-to-cloud replication (currently preview). Site-to-site ASR between two Azure Local instances is in development. Veeam, Rubrik, and Commvault all integrate for backup. In short, the boring operational moments are the ones that benefit the most. Patching, monitoring, identity, alerting, they collapse into the tools you already use in Azure. When to use it and real-world scenarios This is not a “rip everything out of Azure” pitch. Thomas was very honest. Azure is still the right home for the vast majority of workloads. Azure Local earns its keep in a few specific places. Regulated or sovereign workloads. Government, defense, financial services, healthcare where the law or the contract says the data stays put. Disconnected or air-gapped sites. Field operations, classified networks, ships, mines, remote infrastructure where reliable connectivity is not in scope. Business continuity for productivity. Microsoft 365 Local as a fallback for Exchange and SharePoint if the cloud service is unreachable. 
From the session Q&A, M365 Local is GA, and it is the Exchange / SharePoint / Skype for Business trio. Entra ID and Intune are not in scope of the local bundle. Edge and latency-bound workloads. Manufacturing line control, retail in-store inference, healthcare imaging, anywhere a 30-millisecond round trip is a problem. Sovereign AI. Foundry Local on Azure Local lets you serve models on local GPUs without round-tripping to the cloud. Models stay local, data stays local, inference stays fast. Bi-directional workload mobility. With Sovereign Private Landing Zones, you design once and keep workloads portable between Azure and Azure Local based on a service-compatible subset. Getting Started If you are picking this up cold, here is a sensible on-ramp: Start with the official docs on Sovereign Private Cloud and Azure Local. Read them with your architect hat on, not just your operator hat. Design matters here. Browse the Azure Local solution catalog and filter by Premier solutions and by your target scenario (disconnected operations, M365 Local, AI workloads, GPU support). The hardware shape drives a lot of downstream decisions. Talk to your OEM about a validated node, and talk to your Microsoft account team or a sovereign partner. The partner ecosystem in this space is mature, and they will save you weeks. Stand up a small connected instance first to learn the Arc Center experience, the update flow, and Azure Monitor integration. Even a one-node or two-node lab is enough to internalize the model. For disconnected, size for the extra capacity the control plane appliance needs, plan your local identity (Active Directory with AD FS) and your local monitoring integration up front. If you live in Azure today and need workload portability, look at Sovereign Private Landing Zones so you do not paint yourself into a corner with services that have no on-prem equivalent. Resources What is Sovereign Private Cloud? 
on Microsoft Learn
Azure Local documentation
Disconnected operations for Azure Local
Azure Arc product page
Azure Site Recovery product page
Foundry Local documentation on Microsoft Learn
Foundry Local on GitHub
Sovereign Landing Zones on GitHub
Watch the rest of the Summit
This was just one of the sessions at the Microsoft Azure Infra Summit 2026. If you want more peer-to-peer technical content from the Azure infrastructure community, grab a coffee and queue up the full playlist here: https://aka.ms/MAIS/2026-Playlist There is plenty of good stuff covering Bicep, AKS networking, storage, IaC, and more.
If you spin up an Azure Local instance after watching the session, or if you are already running one in anger, drop a comment and let me know how it goes. What works, what hurts, what you wish was better. That is how we all level up.
Cheers! Pierre Roman

Boost performance with NFS nconnect on Azure NetApp Files datastores for Azure VMware Solution
Azure VMware Solution now supports nconnect=4 with Azure NetApp Files datastores: each ESXi host opens up to four parallel TCP connections to a single NFS datastore, raising throughput and lowering latency under load. Supported on both Gen 1 and Gen 2 private clouds, it makes Azure VMware Solution with Azure NetApp Files an even stronger home for databases and other storage-intensive workloads.