<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Apps on Azure Blog articles</title>
    <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/bg-p/AppsonAzureBlog</link>
    <description>Apps on Azure Blog articles</description>
    <pubDate>Tue, 21 Apr 2026 03:56:59 GMT</pubDate>
    <dc:creator>AppsonAzureBlog</dc:creator>
    <dc:date>2026-04-21T03:56:59Z</dc:date>
    <item>
      <title>AKS App Routing's Next Chapter: Gateway API with Istio</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/aks-app-routing-s-next-chapter-gateway-api-with-istio/ba-p/4512729</link>
      <description>&lt;P&gt;If you've been following my previous posts on the Ingress NGINX retirement, you'll know the story so far. The community Ingress NGINX project was retired in March 2026, and Microsoft's extended support for the NGINX-based App Routing add-on runs until November 2026. I've covered &lt;A href="https://techcommunity.microsoft.com/blog/appsonazureblog/seamless-migrations-from-self-hosted-nginx-ingress-to-the-aks-app-routing-add-on/4495630" aria-label="https://techcommunity.microsoft.com/blog/appsonazureblog/seamless-migrations-from-self-hosted-nginx-ingress-to-the-aks-app-routing-add-on/4495630" data-tooltip-position="top" target="_blank"&gt;migrating from standalone NGINX to the App Routing add-on&lt;/A&gt; to buy time, and &lt;A href="https://techcommunity.microsoft.com/blog/appsonazureblog/after-ingress-nginx-migrating-to-application-gateway-for-containers/4503110" aria-label="https://techcommunity.microsoft.com/blog/appsonazureblog/after-ingress-nginx-migrating-to-application-gateway-for-containers/4503110" data-tooltip-position="top" target="_blank"&gt;migrating to Application Gateway for Containers&lt;/A&gt; as a long-term option. In both of those posts I mentioned that Microsoft was working on a new version of the App Routing add-on based on Istio and the Gateway API. Well, it's here, in preview at least.&lt;/P&gt;
&lt;P&gt;The &lt;A href="https://blog.aks.azure.com/2026/03/18/app-routing-gateway-api" aria-label="https://blog.aks.azure.com/2026/03/18/app-routing-gateway-api" data-tooltip-position="top" target="_blank"&gt;App Routing Gateway API implementation&lt;/A&gt; is Microsoft's recommended migration path for anyone currently using the NGINX-based App Routing add-on. It moves you off NGINX entirely and onto the Kubernetes Gateway API, with a lightweight Istio control plane handling the gateway infrastructure under the hood. Let's look at what this actually is, how it differs from other options, and how to migrate from both standalone NGINX and the existing App Routing add-on.&lt;/P&gt;
&lt;H2 data-heading="What Is It?"&gt;What Is It?&lt;/H2&gt;
&lt;P&gt;The new App Routing mode uses the Kubernetes Gateway API instead of the Ingress API. When you enable the add-on, AKS deploys an Istio control plane (istiod) to manage Envoy-based gateway proxies. The important thing to understand here is that this is &lt;EM&gt;not&lt;/EM&gt; the full Istio service mesh. There's no sidecar injection, no Istio CRDs installed for your workloads. It's Istio doing one specific job: managing gateway proxies for ingress traffic.&lt;/P&gt;
&lt;P&gt;When you create a Gateway resource, AKS provisions an Envoy Deployment, a LoadBalancer Service, a HorizontalPodAutoscaler (defaulting to 2-5 replicas at 80% CPU), and a PodDisruptionBudget. All managed. You write Gateway and HTTPRoute resources, and AKS handles everything else.&lt;/P&gt;
&lt;P&gt;This is a fundamentally different API from what you're used to with Ingress. Instead of a single Ingress resource that combines the entry point and routing rules, Gateway API splits things into layers:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;GatewayClass&lt;/STRONG&gt; defines the type of gateway infrastructure (provided by AKS in this case)&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Gateway&lt;/STRONG&gt; creates the actual gateway with its listeners&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;HTTPRoute&lt;/STRONG&gt; defines the routing rules and attaches to a Gateway&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This separation is one of Gateway API's main selling points. Platform teams can own the Gateway resources while application teams manage their own HTTPRoutes independently, without needing to modify shared infrastructure. If you've ever had a team accidentally break routing for everyone by editing a shared Ingress, you'll appreciate why this matters.&lt;/P&gt;
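&lt;P&gt;To make the split concrete, here's a minimal sketch of the two halves. The resource names, namespaces, and hostname are illustrative; the GatewayClass name is the one this add-on provides. A platform team owns the Gateway, and an application team attaches its own HTTPRoute to it:&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;# Owned by the platform team
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway
  namespace: infra
spec:
  gatewayClassName: approuting-istio
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: All   # let routes in other namespaces attach
---
# Owned by the application team, in its own namespace
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: myapp-route
  namespace: myapp
spec:
  parentRefs:
    - name: shared-gateway
      namespace: infra
  hostnames:
    - myapp.example.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: myapp
          port: 80&lt;/LI-CODE&gt;
&lt;P&gt;The application team can change its HTTPRoute as often as it likes without ever touching the shared Gateway.&lt;/P&gt;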
&lt;H2 data-heading="How It Differs From the Istio Service Mesh Add-On"&gt;How It Differs From the Istio Service Mesh Add-On&lt;/H2&gt;
&lt;P&gt;If you're already running or considering the Istio service mesh add-on for AKS, this is a different thing.&lt;/P&gt;
&lt;P&gt;The App Routing Gateway API mode uses the approuting-istio GatewayClass, doesn't install Istio CRDs, doesn't enable sidecar injection, and handles upgrades in-place. The full Istio service mesh add-on uses the istio GatewayClass, installs Istio CRDs cluster-wide, enables sidecar injection, and uses canary upgrades for minor versions.&lt;/P&gt;
&lt;P&gt;The two cannot run at the same time. If you have the Istio service mesh add-on enabled, you need to disable it before enabling App Routing Gateway API (and vice versa). If you need full mesh capabilities like mTLS between services, traffic policies, and telemetry, stick with the Istio service mesh add-on. If you just need managed ingress via Gateway API without the mesh overhead, this is the right choice.&lt;/P&gt;
&lt;H2 data-heading="Current Limitations"&gt;Current Limitations&lt;/H2&gt;
&lt;P&gt;The new App Routing solution is in preview, so it shouldn't be run in production yet. There are also some gaps compared to the existing add-on that you need to be aware of before planning a production migration.&lt;/P&gt;
&lt;P&gt;The biggest one: DNS and TLS certificate management via the add-on isn't supported yet for Gateway API. If you're currently using az aks approuting update and az aks approuting zone add to automate Key Vault and Azure DNS integration with the NGINX-based add-on, that workflow doesn't carry over. TLS termination is still possible, but you'll need to set it up manually. The &lt;A href="https://learn.microsoft.com/azure/aks/app-routing-gateway-api-tls" aria-label="https://learn.microsoft.com/azure/aks/app-routing-gateway-api-tls" data-tooltip-position="top" target="_blank"&gt;AKS docs cover the steps&lt;/A&gt;, but it's more hands-on than what the NGINX add-on gives you today. This is expected to be addressed when the feature reaches GA.&lt;/P&gt;
&lt;P&gt;SNI passthrough (TLSRoute) and egress traffic management aren't supported either. And as mentioned, it's mutually exclusive with the Istio service mesh add-on.&lt;/P&gt;
&lt;P&gt;For production workloads that depend heavily on automated DNS and TLS management, you may want to wait until GA, or look at &lt;A href="https://techcommunity.microsoft.com/blog/appsonazureblog/after-ingress-nginx-migrating-to-application-gateway-for-containers/4503110" aria-label="https://techcommunity.microsoft.com/blog/appsonazureblog/after-ingress-nginx-migrating-to-application-gateway-for-containers/4503110" data-tooltip-position="top" target="_blank"&gt;Application Gateway for Containers&lt;/A&gt; as an alternative. But for teams that can handle TLS setup manually, or for non-production environments, there's no reason not to start testing this now.&lt;/P&gt;
&lt;H2 data-heading="Getting Started"&gt;Getting Started&lt;/H2&gt;
&lt;P&gt;Before you can enable the feature, you need the aks-preview CLI extension (version 19.0.0b24 or later), the Managed Gateway API CRDs enabled, and the App Routing Gateway API preview feature flag registered:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az extension add --name aks-preview
az extension update --name aks-preview

# Managed Gateway API CRDs (required dependency)
az feature register --namespace "Microsoft.ContainerService" --name "ManagedGatewayAPIPreview"

# App Routing Gateway API implementation
az feature register --namespace "Microsoft.ContainerService" --name "AppRoutingIstioGatewayAPIPreview"&lt;/LI-CODE&gt;
&lt;P&gt;Feature flag registration can take a few minutes. Once both flags show as registered, enable the add-on on a new or existing cluster. You need both --enable-gateway-api (for the managed Gateway API CRD installation) and --enable-app-routing-istio (for the Istio-based implementation):&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;# New cluster
az aks create \
  --resource-group ${RESOURCE_GROUP} \
  --name ${CLUSTER} \
  --location swedencentral \
  --enable-gateway-api \
  --enable-app-routing-istio

# Existing cluster
az aks update \
  --resource-group ${RESOURCE_GROUP} \
  --name ${CLUSTER} \
  --enable-gateway-api \
  --enable-app-routing-istio&lt;/LI-CODE&gt;
&lt;P&gt;Verify istiod is running:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;kubectl get pods -n aks-istio-system&lt;/LI-CODE&gt;
&lt;P&gt;You should see two istiod pods in a Running state.&lt;/P&gt;
&lt;P&gt;From here, you can create a Gateway and HTTPRoute to test traffic flow. The&amp;nbsp;&lt;A href="https://learn.microsoft.com/azure/aks/app-routing-gateway-api" aria-label="https://learn.microsoft.com/azure/aks/app-routing-gateway-api" data-tooltip-position="top" target="_blank"&gt;AKS quickstart&lt;/A&gt; walks through this with the httpbin sample app if you want a quick validation.&lt;/P&gt;
&lt;H2 data-heading="Migrating From NGINX Ingress"&gt;Migrating From NGINX Ingress&lt;/H2&gt;
&lt;P&gt;Whether you're running standalone NGINX (self-installed via Helm) or the NGINX-based App Routing add-on, the migration process is essentially the same. You're moving from Ingress API resources to Gateway API resources, and the new controller runs alongside your existing one during the transition. The only real differences are what you're cleaning up at the end and, if you're on the App Routing add-on, whether you were relying on its built-in DNS and TLS automation.&lt;/P&gt;
&lt;H3 data-heading="Inventory Your Ingress Resources"&gt;Inventory Your Ingress Resources&lt;/H3&gt;
&lt;P&gt;Before anything else, understand what you have:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;kubectl get ingress --all-namespaces \
  -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,CLASS:.spec.ingressClassName'&lt;/LI-CODE&gt;
&lt;P&gt;Look specifically for custom snippets, lua configurations, or anything that relies heavily on NGINX-specific behaviour. These won't have direct equivalents in Gateway API and will need manual attention.&lt;/P&gt;
&lt;H3 data-heading="Convert Ingress Resources to Gateway API"&gt;Convert Ingress Resources to Gateway API&lt;/H3&gt;
&lt;P&gt;The &lt;A href="https://github.com/kubernetes-sigs/ingress2gateway" aria-label="https://github.com/kubernetes-sigs/ingress2gateway" data-tooltip-position="top" target="_blank"&gt;ingress2gateway&lt;/A&gt; tool (v1.0.0) handles conversion of Ingress resources to Gateway API equivalents. It supports over 30 common NGINX annotations and generates Gateway and HTTPRoute YAML. It works regardless of whether your Ingress resources use the nginx or webapprouting.kubernetes.azure.com IngressClass:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;# Install
go install github.com/kubernetes-sigs/ingress2gateway@v1.0.0

# Convert from live cluster
ingress2gateway print --providers=ingress-nginx -A &amp;gt; gateway-resources.yaml

# Or convert from a local file
ingress2gateway print --providers=ingress-nginx --input-file=./manifests/ingress.yaml &amp;gt; gateway-resources.yaml&lt;/LI-CODE&gt;
&lt;P&gt;Review the output carefully. The tool flags annotations it can't convert as comments in the generated YAML, so you'll know exactly what needs manual work. Common gaps include custom configuration snippets and regex-based rewrites that don't map cleanly to Gateway API's routing model.&lt;/P&gt;
&lt;P&gt;Make sure you update the gatewayClassName in the generated Gateway resources to approuting-istio. The tool may generate a generic GatewayClass name that you'll need to change.&lt;/P&gt;
&lt;H3 data-heading="Handle DNS and TLS"&gt;Handle DNS and TLS&lt;/H3&gt;
&lt;P&gt;If you're coming from standalone NGINX, you're likely managing DNS and TLS yourself already, so nothing changes here: just make sure your certificate Secrets and DNS records are ready for the new Gateway IP.&lt;/P&gt;
&lt;P&gt;If you're coming from the App Routing add-on and relying on its built-in DNS and TLS management (via az aks approuting zone add and Key Vault integration), this is the part that needs extra thought. That automation doesn't carry over to the Gateway API implementation yet, so you'll need to handle it differently until GA.&lt;/P&gt;
&lt;P&gt;For TLS, you can either create Kubernetes Secrets with your certificates manually or set up a workflow to sync them from Key Vault. The &lt;A href="https://learn.microsoft.com/azure/aks/app-routing-gateway-api-tls" aria-label="https://learn.microsoft.com/azure/aks/app-routing-gateway-api-tls" data-tooltip-position="top" target="_blank"&gt;AKS docs on securing Gateway API traffic&lt;/A&gt; cover the manual approach. For DNS, you'll need to manage records yourself or use &lt;A href="https://github.com/kubernetes-sigs/external-dns" aria-label="https://github.com/kubernetes-sigs/external-dns" data-tooltip-position="top" target="_blank"&gt;ExternalDNS&lt;/A&gt; to automate it. ExternalDNS supports Gateway API resources, so this is a viable path if you want automation.&lt;/P&gt;
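&lt;P&gt;As a rough illustration of the manual approach, a Gateway listener can terminate TLS by referencing a standard kubernetes.io/tls Secret. Names and hostname here are placeholders; see the AKS docs linked above for the full workflow:&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: secure-gateway
spec:
  gatewayClassName: approuting-istio
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      hostname: myapp.example.com
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: myapp-tls   # created manually, or synced from Key Vault&lt;/LI-CODE&gt;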
&lt;H3 data-heading="Deploy and Validate"&gt;Deploy and Validate&lt;/H3&gt;
&lt;P&gt;With the add-on enabled, apply your converted resources:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;kubectl apply -f gateway-resources.yaml&lt;/LI-CODE&gt;
&lt;P&gt;Wait for the Gateway to be programmed and get the external IP:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;kubectl wait --for=condition=programmed gateways.gateway.networking.k8s.io &amp;lt;gateway-name&amp;gt;
export GATEWAY_IP=$(kubectl get gateways.gateway.networking.k8s.io &amp;lt;gateway-name&amp;gt; -ojsonpath='{.status.addresses[0].value}')&lt;/LI-CODE&gt;
&lt;P&gt;The key thing here is that your existing NGINX controller (whether standalone or add-on managed) is still running and serving production traffic. The Gateway API resources are handled separately by the Istio-based controller in aks-istio-system. This parallel running is what makes the migration safe.&lt;/P&gt;
&lt;P&gt;Test your routes against the new Gateway IP. You'll need to pass the appropriate hostname as a Host header, since your DNS will still be pointing at the existing NGINX controller at this point.&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;curl -H "Host: myapp.example.com" http://$GATEWAY_IP&lt;/LI-CODE&gt;
&lt;P&gt;Run your full validation suite. Check TLS, path routing, headers, authentication, anything your applications depend on. Take your time here; nothing changes for production until you update DNS.&lt;/P&gt;
&lt;H3 data-heading="Cut Over DNS and Clean Up"&gt;Cut Over DNS and Clean Up&lt;/H3&gt;
&lt;P&gt;Once you're confident, lower your DNS TTL to 60 seconds (do this well in advance), then update your DNS records to point to the new Gateway IP. Keep the old NGINX controller running for 24-48 hours as a rollback option.&lt;/P&gt;
&lt;P&gt;After traffic has been flowing cleanly through the Gateway API path, clean up the old setup. What this looks like depends on where you started:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;If you were on standalone NGINX:&lt;/STRONG&gt;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;helm uninstall ingress-nginx -n ingress-nginx
kubectl delete namespace ingress-nginx&lt;/LI-CODE&gt;
&lt;P&gt;&lt;STRONG&gt;If you were on the App Routing add-on with NGINX:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Verify nothing is still using the old IngressClass:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;kubectl get ingress --all-namespaces \
  -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,CLASS:.spec.ingressClassName' \
  | grep "webapprouting"&lt;/LI-CODE&gt;
&lt;P&gt;Delete any remaining Ingress resources that reference the old class, then disable the NGINX-based App Routing add-on:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;az aks approuting disable --resource-group ${RESOURCE_GROUP} --name ${CLUSTER}&lt;/LI-CODE&gt;
&lt;P&gt;Some resources (ConfigMaps, Secrets, and the controller deployment) will remain in the app-routing-system namespace after disabling. You can clean these up by deleting the namespace once you're satisfied everything is running through the Gateway API path:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;kubectl delete ns app-routing-system&lt;/LI-CODE&gt;
&lt;P&gt;In both cases, clean up any old Ingress resources that are no longer being used.&lt;/P&gt;
&lt;H2 data-heading="Upgrades and Lifecycle"&gt;Upgrades and Lifecycle&lt;/H2&gt;
&lt;P&gt;The Istio control plane version is tied to your AKS cluster's Kubernetes version. AKS automatically handles patch upgrades as part of its release cycle, and minor version upgrades happen in-place when you upgrade your cluster's Kubernetes version or when a new Istio minor version is released for your AKS version.&lt;/P&gt;
&lt;P&gt;One thing to be aware of: unlike the Istio service mesh add-on, upgrades here are in-place, not canary-based. The HPA and PDB on each Gateway help minimise disruption, but plan accordingly for production. If you have &lt;A href="https://learn.microsoft.com/azure/aks/planned-maintenance" aria-label="https://learn.microsoft.com/azure/aks/planned-maintenance" data-tooltip-position="top" target="_blank"&gt;maintenance windows&lt;/A&gt; configured, the istiod upgrades will respect them.&lt;/P&gt;
&lt;H2 data-heading="What Should You Do Now?"&gt;What Should You Do Now?&lt;/H2&gt;
&lt;P&gt;The timeline hasn't changed. The standalone NGINX Ingress project was retired in March 2026, so if you're still running that, you're already on unsupported software. The NGINX App Routing add-on is supported until November 2026, which gives you a window, but it's not a long one.&lt;/P&gt;
&lt;P&gt;If you're on standalone NGINX you could get onto the App Routing add-on now to buy time (I covered this in my &lt;A href="https://techcommunity.microsoft.com/blog/appsonazureblog/seamless-migrations-from-self-hosted-nginx-ingress-to-the-aks-app-routing-add-on/4495630" aria-label="https://techcommunity.microsoft.com/blog/appsonazureblog/seamless-migrations-from-self-hosted-nginx-ingress-to-the-aks-app-routing-add-on/4495630" data-tooltip-position="top" target="_blank"&gt;earlier post&lt;/A&gt;), then plan your migration to either the Gateway API mode or AGC.&lt;/P&gt;
&lt;P&gt;If you're on the NGINX App Routing add-on: start testing the Gateway API mode in non-production now. Get familiar with the Gateway API resource model, understand the TLS and DNS gaps in the preview, and be ready to migrate when the feature reaches GA or when November gets close, whichever comes first.&lt;/P&gt;
&lt;P&gt;If you need production-ready TLS and DNS automation today and can't wait for GA, Application Gateway for Containers is your best option right now.&lt;/P&gt;
&lt;P&gt;Whatever path you choose, make sure you have a plan in place before November. Running unsupported ingress software on production infrastructure isn't where you want to be.&lt;/P&gt;</description>
      <pubDate>Sun, 19 Apr 2026 18:25:14 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/aks-app-routing-s-next-chapter-gateway-api-with-istio/ba-p/4512729</guid>
      <dc:creator>samcogan</dc:creator>
      <dc:date>2026-04-19T18:25:14Z</dc:date>
    </item>
    <item>
      <title>Autonomous AKS Incident Response with Azure SRE Agent: From Alert to Verified Recovery in Minutes</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/autonomous-aks-incident-response-with-azure-sre-agent-from-alert/ba-p/4511343</link>
      <description>&lt;P&gt;When a Sev1 alert fires on an AKS cluster, detection is rarely the hard part. The hard part is what comes next: proving what broke, why it broke, and fixing it without widening the blast radius, all under time pressure, often at 2 a.m.&lt;/P&gt;
&lt;P&gt;Azure SRE Agent is designed to close that gap. It connects Azure-native observability, AKS diagnostics, and engineering workflows into a single incident-response loop that can investigate, remediate, verify, and follow up, without waiting for a human to page through dashboards and run ad-hoc&amp;nbsp;&lt;EM&gt;kubectl&lt;/EM&gt;&amp;nbsp;commands.&lt;/P&gt;
&lt;P&gt;This post walks through that loop in two real AKS failure scenarios. In both cases, the agent received an incident, investigated Azure Monitor and AKS signals, applied targeted remediation, verified recovery, and created follow-up in GitHub, all while keeping the team informed in Microsoft Teams.&lt;/P&gt;
&lt;H2&gt;Core concepts&lt;/H2&gt;
&lt;P&gt;Azure SRE Agent is a governed incident-response system, not a conversational assistant with infrastructure access. Five concepts matter most in an AKS incident workflow:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Incident platform.&lt;/STRONG&gt;&amp;nbsp;Where incidents originate. In this demo, that is Azure Monitor.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Built-in Azure capabilities.&lt;/STRONG&gt;&amp;nbsp;The agent uses Azure Monitor, Log Analytics, Azure Resource Graph, Azure CLI/ARM, and AKS diagnostics without requiring external connectors.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Connectors.&lt;/STRONG&gt;&amp;nbsp;Extend the workflow to systems such as GitHub, Teams, Kusto, and MCP servers.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Permission levels.&lt;/STRONG&gt; &lt;EM&gt;Reader&lt;/EM&gt; for investigation and read-oriented access, &lt;EM&gt;Privileged&lt;/EM&gt; for operational changes when allowed.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Run modes.&lt;/STRONG&gt;&amp;nbsp;&lt;EM&gt;Review&amp;nbsp;&lt;/EM&gt;for approval-gated execution and&amp;nbsp;&lt;EM&gt;Autonomous&amp;nbsp;&lt;/EM&gt;for direct execution.&lt;/LI&gt;
&lt;/OL&gt;
&lt;BLOCKQUOTE&gt;
&lt;P class=""&gt;&lt;STRONG&gt;The most important production controls are permission level and run mode, not prompt quality.&lt;/STRONG&gt;&amp;nbsp;Custom instructions can shape workflow behavior, but they do not replace RBAC, telemetry quality, or tool availability.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The safest production rollout path:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;Start: Reader + Review 
Then: Privileged + Review 
Finally: Privileged + Autonomous. Only for narrow, trusted incident paths.&lt;/LI-CODE&gt;
&lt;H2&gt;Demo environment&lt;/H2&gt;
&lt;P&gt;The full scripts and manifests are available if you want to reproduce this:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P class=""&gt;&lt;STRONG&gt;Demo repository:&lt;/STRONG&gt;&amp;nbsp;&lt;A href="https://github.com/hailugebru/azure-sre-agents-aks" target="_blank" rel="noopener"&gt;github.com/hailugebru/azure-sre-agents-aks&lt;/A&gt;. The README includes setup and configuration details.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The environment uses an AKS cluster with &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/aks/node-auto-provisioning" target="_blank" rel="noopener"&gt;node auto-provisioning&lt;/A&gt; (NAP), Azure CNI Overlay powered by Cilium, managed Prometheus metrics, the &lt;A class="lia-external-url" href="https://github.com/Azure-Samples/aks-store-demo" target="_blank" rel="noopener"&gt;AKS Store sample&lt;/A&gt; microservices application, and &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/sre-agent/overview?tabs=task" target="_blank" rel="noopener"&gt;Azure SRE Agent&lt;/A&gt; configured for incident-triggered investigation and remediation. This setup is intentionally realistic but minimal. It provides enough surface area to exercise real AKS failure modes without distracting from the incident workflow itself.&lt;/P&gt;
&lt;LI-CODE lang=""&gt;Azure Monitor  →  Action Group  →  Azure SRE Agent  →  AKS Cluster
(Alert)           (Webhook)      (Investigate / Fix)    (Recover)

                                      ↓
               Teams notification + GitHub issue → GitHub Agent → PR for review&lt;/LI-CODE&gt;
&lt;H3&gt;How the agent was configured&lt;/H3&gt;
&lt;P&gt;Configuration came down to four things: scope, permissions, incident intake, and response mode. I scoped the agent to the demo resource group and used its user-assigned managed identity (UAMI) for Azure access. That scope defined what the agent could investigate, while RBAC determined what actions it could take.&lt;/P&gt;
&lt;P&gt;I used broader AKS permissions than I would recommend as a default production baseline so the agent could complete remediation end to end in the lab. That is an important distinction:&amp;nbsp;&lt;STRONG&gt;permissions control what the agent can access, while run mode controls whether it asks for approval or acts directly.&lt;/STRONG&gt;&amp;nbsp;For this scenario, Azure Monitor served as the incident platform, and I set the response plan to Autonomous for a narrow, trusted path so the workflow could run without manual approval gates.&lt;/P&gt;
&lt;P&gt;I also added Teams and GitHub integrations so the workflow could extend beyond Azure. Teams provided milestone updates during the incident, and GitHub provided durable follow-up after remediation. For the complete setup, see the &lt;A class="lia-external-url" href="https://github.com/hailugebru/azure-sre-agents-aks/tree/main" target="_blank" rel="noopener"&gt;README&lt;/A&gt;.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;A note on context.&lt;/STRONG&gt; The more context you can provide the agent about your environment, resources, runbooks, and conventions, the better it performs. Scope boundaries, known workloads, common failure patterns, and links to relevant documentation all sharpen its investigations and reduce the time it spends exploring. Treat custom instructions and connector content as first-class inputs, not afterthoughts.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2&gt;Two incidents, two response modes&lt;/H2&gt;
&lt;P&gt;These incidents occurred on the same cluster in one session and illustrate two realistic operating modes:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Alert triggered automation.&lt;/STRONG&gt;&amp;nbsp;The agent acts when Azure Monitor fires.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Ad hoc chat investigation.&lt;/STRONG&gt;&amp;nbsp;An engineer sees a symptom first and asks the agent to investigate.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Both matter in real environments. The first is your scale path. The second is your operator assist path.&lt;/P&gt;
&lt;H3&gt;Incident 1. CPU starvation (alert driven, ~8 min MTTR)&lt;/H3&gt;
&lt;P&gt;The&amp;nbsp;&lt;EM&gt;makeline-service&lt;/EM&gt;&amp;nbsp;deployment manifest contained a CPU and memory configuration that was not viable for startup:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;resources:
  requests:
    cpu: 1m
    memory: 6Mi
  limits:
    cpu: 5m
    memory: 20Mi&lt;/LI-CODE&gt;
&lt;P&gt;Within five minutes, Azure Monitor fired the&amp;nbsp;&lt;EM&gt;pod-not-healthy&lt;/EM&gt;&amp;nbsp;Sev1 alert. The agent picked it up immediately.&lt;/P&gt;
&lt;P&gt;Here is the key diagnostic conclusion the agent reached from the pod state, probe behavior, and exit code:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;"Exit code 1 (not 137) rules out OOMKill. The pod failed at startup, not at runtime memory pressure. CPU limit of 5m is insufficient for the process to bind its port before the startup probe times out. This is a configuration error, not a resource exhaustion scenario."&lt;/EM&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;That is the kind of distinction that often takes an on-call engineer several minutes to prove under pressure: startup failure from CPU starvation vs. runtime termination from memory pressure.&lt;/P&gt;
&lt;P&gt;The agent then:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Identified three additional CPU-throttled pods at 112 to 200% of configured limit using&amp;nbsp;&lt;EM&gt;kubectl top&lt;/EM&gt;.&lt;/LI&gt;
&lt;LI&gt;Patched four workloads:&amp;nbsp;&lt;EM&gt;makeline-service&lt;/EM&gt;,&amp;nbsp;&lt;EM&gt;virtual-customer&lt;/EM&gt;,&amp;nbsp;&lt;EM&gt;virtual-worker&lt;/EM&gt;, and&amp;nbsp;&lt;EM&gt;mongodb&lt;/EM&gt;.&lt;/LI&gt;
&lt;LI&gt;Verified that all affected pods returned to a healthy Running state with 0 restarts cluster-wide.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="lia-indent-padding-left-30px"&gt;&lt;EM&gt;Azure SRE Agent's Incident History blade confirming full cluster recovery: 4 patches applied, 0 unhealthy pods — no human intervention required.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Outcome.&lt;/STRONG&gt;&amp;nbsp;Full cluster recovery in ~8 minutes, 0 human interventions.&lt;/P&gt;
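&lt;P&gt;For comparison with the broken manifest above, a viable configuration looks something like this. The agent's exact patched values aren't reproduced here, so treat these numbers as illustrative headroom for startup rather than the actual fix:&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;resources:
  requests:
    cpu: 100m      # was 1m: enough to bind the port before the startup probe times out
    memory: 64Mi   # was 6Mi
  limits:
    cpu: 500m      # was 5m
    memory: 128Mi  # was 20Mi&lt;/LI-CODE&gt;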
&lt;H3&gt;Incident 2. OOMKilled (chat driven, ~4 min MTTR)&lt;/H3&gt;
&lt;P&gt;For the second case, I deployed a deliberately undersized version of &lt;EM&gt;order-service&lt;/EM&gt;:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;kubectl apply -f .\manifests\aks-store\order-service-changed.yaml -n pets&lt;/LI-CODE&gt;
&lt;P&gt;I started this case from chat before the pod-phase alert fired to demonstrate the interactive troubleshooting flow. That was a demo choice, not an alerting gap. &lt;EM&gt;CrashLoopBackOff&lt;/EM&gt; is a container waiting reason, not a pod phase, so production coverage should come from Prometheus-based crash-loop signals rather than pod phase alone. Here is the PromQL query I use in Azure Monitor to catch this class of failure:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;sum by (namespace, pod) (
  (
    max_over_time(
      kube_pod_container_status_waiting_reason{
        namespace="pets",
        reason="CrashLoopBackOff"
      }[5m]
    ) == 1
  )
  and on (namespace, pod, container)
  (
    increase(
      kube_pod_container_status_restarts_total{
        namespace="pets"
      }[15m]
    ) &amp;gt; 0
  )
) &amp;gt; 0&lt;/LI-CODE&gt;
&lt;P&gt;This query fires when a container has been in&amp;nbsp;&lt;EM&gt;CrashLoopBackOff&lt;/EM&gt;&amp;nbsp;within the last 5 minutes&amp;nbsp;&lt;STRONG&gt;and&lt;/STRONG&gt; its restart count has increased in the last 15 minutes. In production, replace the hardcoded namespace with a regex matcher or remove it entirely to cover all namespaces.&lt;/P&gt;
&lt;P&gt;For this incident, I kicked off the investigation with a simple chat prompt:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;The order-service pod in the pets namespace is not healthy. Please investigate, identify the root cause, and fix it.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The agent's reasoning:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;"Container logs are empty. The process was killed before it could write its first log line. Exit code 137 confirms OOMKill. No NODE_OPTIONS in the ConfigMap rules out a V8 heap misconfiguration. The 20Mi limit is 12.8x below the pod's observed 50Mi runtime baseline. This limit was never viable for this workload."&lt;/EM&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The agent increased the memory limit (&lt;EM&gt;20Mi &lt;/EM&gt;to &lt;EM&gt;128Mi&lt;/EM&gt;) and request (&lt;EM&gt;10Mi &lt;/EM&gt;to &lt;EM&gt;50Mi&lt;/EM&gt;), then verified the new pod stabilized at 74Mi/128Mi (58% utilization) with 0 restarts.&lt;/P&gt;
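&lt;P&gt;Expressed as the deployment's resources block, that patch amounts to the following (memory values as reported above; CPU settings are omitted since they weren't changed):&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;resources:
  requests:
    memory: 50Mi    # was 10Mi: matches the observed runtime baseline
  limits:
    memory: 128Mi   # was 20Mi: pod stabilized at 74Mi (58% utilization)&lt;/LI-CODE&gt;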
&lt;P&gt;&lt;STRONG&gt;Outcome.&lt;/STRONG&gt;&amp;nbsp;Service recovered in ~4 minutes without any manual cluster interaction.&lt;/P&gt;
&lt;H3&gt;Side by side comparison&lt;/H3&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Dimension&lt;/th&gt;&lt;th&gt;Incident 1: CPU starvation&lt;/th&gt;&lt;th&gt;Incident 2: OOMKilled&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Trigger&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Azure Monitor alert (automated)&lt;/td&gt;&lt;td&gt;Engineer chat prompt (ad hoc)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Failure mode&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;CPU too low for startup probe to pass&lt;/td&gt;&lt;td&gt;Memory limit too low for process to start&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Key signal&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Exit code 1, probe timeout&lt;/td&gt;&lt;td&gt;Exit code 137, empty container logs&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Blast radius&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;4 workloads affected cluster wide&lt;/td&gt;&lt;td&gt;1 workload in target namespace&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Remediation&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;CPU request/limit patches across 4 deployments&lt;/td&gt;&lt;td&gt;Memory request/limit patch on 1 deployment&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;MTTR&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;~8 min&lt;/td&gt;&lt;td&gt;~4 min&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Human interventions&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H2&gt;Why this matters&lt;/H2&gt;
&lt;P&gt;Most AKS environments already emit rich telemetry through Azure Monitor and managed Prometheus. What is still manual is the response: engineers paging through dashboards, running ad-hoc&amp;nbsp;&lt;EM&gt;kubectl&lt;/EM&gt;&amp;nbsp;commands, and applying hotfixes under time pressure. Azure SRE Agent changes that by turning repeatable investigation and remediation paths into an automated workflow.&lt;/P&gt;
&lt;P&gt;The value isn't just that the agent patched a CPU limit. It's that the investigation, remediation, and verification loop is the same regardless of failure mode, and it runs while your team sleeps.&lt;/P&gt;
&lt;P&gt;In this lab, the impact was measurable:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="width: 57.8704%; height: 242px; border-width: 1px;"&gt;&lt;thead&gt;&lt;tr style="height: 34.5714px;"&gt;&lt;th style="height: 34.5714px;"&gt;Metric&lt;/th&gt;&lt;th style="height: 34.5714px;"&gt;This demo with Azure SRE Agent&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr style="height: 34.5714px;"&gt;&lt;td style="height: 34.5714px;"&gt;Alert to recovery&lt;/td&gt;&lt;td style="height: 34.5714px;"&gt;~4 to 8 min&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 34.5714px;"&gt;&lt;td style="height: 34.5714px;"&gt;Human interventions&lt;/td&gt;&lt;td style="height: 34.5714px;"&gt;0&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 34.5714px;"&gt;&lt;td style="height: 34.5714px;"&gt;Scope of investigation&lt;/td&gt;&lt;td style="height: 34.5714px;"&gt;Cluster wide, automated&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 34.5714px;"&gt;&lt;td style="height: 34.5714px;"&gt;Correlate evidence and diagnose&lt;/td&gt;&lt;td style="height: 34.5714px;"&gt;~2 min&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 34.5714px;"&gt;&lt;td style="height: 34.5714px;"&gt;Apply fix and verify&lt;/td&gt;&lt;td style="height: 34.5714px;"&gt;~4 min&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 34.5714px;"&gt;&lt;td style="height: 34.5714px;"&gt;Post incident follow-up&lt;/td&gt;&lt;td style="height: 34.5714px;"&gt;GitHub issue + draft PR&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 39.3602%" /&gt;&lt;col style="width: 60.5671%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;These results came from a controlled run on April 10, 2026. Real world outcomes depend on alert quality, cluster size, and how much automation you enable. For reference, industry reports from PagerDuty and Datadog typically place manual Sev1 MTTR in the 30 to 120 minute range for Kubernetes environments.&lt;/EM&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2&gt;Teams + GitHub follow-up&lt;/H2&gt;
&lt;P&gt;Runtime remediation is only half the story. If the workflow ends when the pod becomes healthy again, the same issue returns on the next deployment. That is why the post incident path matters.&lt;/P&gt;
&lt;P&gt;After Incident 1 resolved, Azure SRE Agent used the&amp;nbsp;&lt;STRONG&gt;GitHub connector&lt;/STRONG&gt;&amp;nbsp;to file an issue with the incident summary, root cause, and runtime changes. In the demo, I assigned that issue to the GitHub Copilot agent, which opened a draft pull request to align the source manifests with the hotfix. The agent can also be configured to submit the PR directly in the same workflow, not just open the issue, so the fix is in your review queue by the time anyone sees the notification. Human review remains the final control point before merge. Setup details for the GitHub connector are in the&amp;nbsp;demo repo &lt;A class="lia-external-url" href="https://github.com/hailugebru/azure-sre-agents-aks/tree/main" target="_blank" rel="noopener"&gt;README&lt;/A&gt;, and the official reference is in the&amp;nbsp;&lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/sre-agent/github-connector" target="_blank" rel="noopener"&gt;Azure SRE Agent docs&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Azure SRE Agent fixes the live issue, and the GitHub follow-up prepares the durable source change so future deployments do not reintroduce the same configuration problem.&lt;/STRONG&gt;&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-indent-padding-left-30px"&gt;&lt;EM&gt;The operations to engineering handoff: Azure SRE Agent fixed the live cluster; GitHub Copilot agent prepares the durable source change so the same misconfiguration can't ship again.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;In parallel, the Teams connector posted milestone updates during the incident:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Investigation started.&lt;/LI&gt;
&lt;LI&gt;Root cause and remediation identified.&lt;/LI&gt;
&lt;LI&gt;Incident resolved.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Teams handled real time situational awareness. GitHub handled durable engineering follow-up. Together, they closed the gap between operations and software delivery.&lt;/P&gt;
&lt;H2&gt;Key takeaways&lt;/H2&gt;
&lt;H3&gt;Three things to carry forward&lt;/H3&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Treat Azure SRE Agent as a governed incident response system, not a chatbot with infrastructure access.&lt;/STRONG&gt;&amp;nbsp;The most important controls are permission levels and run modes, not prompt quality.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Anchor detection in your existing incident platforms.&lt;/STRONG&gt; For this demo, we used Prometheus and Azure Monitor, but the pattern applies regardless of where your signals live. Use connectors to extend the workflow outward. Teams for real time coordination, GitHub for durable engineering follow-up.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Start where you're comfortable.&lt;/STRONG&gt;&amp;nbsp;If you are just getting your feet wet, begin with one resource group, one incident type, and&amp;nbsp;&lt;EM&gt;Review&amp;nbsp;&lt;/EM&gt;mode. Validate that telemetry flows, RBAC is scoped correctly, and your alert rules cover the failure modes you actually care about before enabling&amp;nbsp;&lt;EM&gt;Autonomous&lt;/EM&gt;. Expand only once each layer is trusted.&lt;/LI&gt;
&lt;/OL&gt;
&lt;H3&gt;Next steps&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Add Prometheus based alert coverage for &lt;EM&gt;ImagePullBackOff&lt;/EM&gt; and node resource pressure to complement the pod phase rule.&lt;/LI&gt;
&lt;LI&gt;Expand to multi cluster managed scopes once the single cluster path is trusted and validated.&lt;/LI&gt;
&lt;LI&gt;Explore how NAP (node auto provisioning) and Azure SRE Agent complement each other — NAP manages infrastructure capacity, while the agent investigates and remediates incidents.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;I'd like to thank&amp;nbsp;&lt;STRONG&gt;Cary Chai&lt;/STRONG&gt;, Senior Product Manager for Azure SRE Agent, for his early technical guidance and thorough review — his feedback sharpened both the accuracy and quality of this post.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Apr 2026 18:07:34 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/autonomous-aks-incident-response-with-azure-sre-agent-from-alert/ba-p/4511343</guid>
      <dc:creator>hailukassa</dc:creator>
      <dc:date>2026-04-20T18:07:34Z</dc:date>
    </item>
    <item>
      <title>New in Azure SRE Agent: Log Analytics and Application Insights Connectors</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/new-in-azure-sre-agent-log-analytics-and-application-insights/ba-p/4509649</link>
      <description>&lt;P data-line="2"&gt;Azure SRE Agent now supports&amp;nbsp;&lt;STRONG&gt;Log Analytics&lt;/STRONG&gt;&amp;nbsp;and&amp;nbsp;&lt;STRONG&gt;Application Insights&lt;/STRONG&gt;&amp;nbsp;as log providers, backed by the&amp;nbsp;&lt;A href="https://github.com/Azure/azure-mcp" target="_blank" rel="noopener" data-href="https://github.com/Azure/azure-mcp"&gt;Azure MCP Server&lt;/A&gt;. Connect your workspaces and App Insights resources, and the agent can query them directly during investigations.&lt;/P&gt;
&lt;H2 data-line="6"&gt;Why This Matters&lt;/H2&gt;
&lt;P data-line="8"&gt;Log Analytics and Application Insights are common destinations for Azure operational data - container logs, application traces, dependency failures, security events. The agent could already access this data through az monitor CLI commands if you granted RBAC roles to its managed identity, and that approach still works. But it required manual RBAC setup and the agent had to shell out to CLI for every query.&lt;/P&gt;
&lt;P data-line="10"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-line="10"&gt;With these connectors, setup is simpler and querying is faster. You pick a workspace, we handle the RBAC grants, and the agent gets native MCP-backed query tools instead of going through CLI.&lt;/P&gt;
&lt;H2 data-line="12"&gt;What You Get&lt;/H2&gt;
&lt;P data-line="14"&gt;&lt;STRONG&gt;Two new connector types&lt;/STRONG&gt;&amp;nbsp;in Builder &amp;gt; Connectors (or through the onboarding flow under Logs):&lt;/P&gt;
&lt;UL data-line="16"&gt;
&lt;LI data-line="16"&gt;&lt;STRONG&gt;Log Analytics&lt;/STRONG&gt;&amp;nbsp;- connect a workspace. The agent can query &lt;STRONG&gt;ContainerLog&lt;/STRONG&gt;, &lt;STRONG&gt;Syslog&lt;/STRONG&gt;, &lt;STRONG&gt;AzureDiagnostics&lt;/STRONG&gt;, &lt;STRONG&gt;KubeEvents&lt;/STRONG&gt;, &lt;STRONG&gt;SecurityEvent&lt;/STRONG&gt;, custom tables, anything in that workspace.&lt;/LI&gt;
&lt;LI data-line="17"&gt;&lt;STRONG&gt;Application Insights&lt;/STRONG&gt;&amp;nbsp;- connect an App Insights resource. The agent gets access to requests, dependencies, exceptions, traces, and custom telemetry.&lt;/LI&gt;
&lt;/UL&gt;
&lt;img&gt;New Connectors during onboarding.&lt;/img&gt;
&lt;P data-line="21"&gt;You can connect multiple workspaces and App Insights resources. The agent knows which ones are available and targets the right one based on the investigation.&lt;/P&gt;
&lt;H2 data-line="23"&gt;Setup&lt;/H2&gt;
&lt;P&gt;These connectors are currently in early access. To try them, enable &lt;STRONG&gt;Early access to features&lt;/STRONG&gt; under Settings &amp;gt; Basics.&lt;/P&gt;
&lt;img&gt;Early access to features&lt;/img&gt;
&lt;P data-line="27"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-line="27"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-line="27"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-line="27"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-line="27"&gt;From there you can add connectors in two ways:&lt;/P&gt;
&lt;P data-line="29"&gt;&lt;STRONG&gt;Through onboarding:&lt;/STRONG&gt;&amp;nbsp;Click&amp;nbsp;&lt;STRONG&gt;Logs&lt;/STRONG&gt;&amp;nbsp;in the onboarding flow, then select&amp;nbsp;&lt;STRONG&gt;Log Analytics Workspace&lt;/STRONG&gt;&amp;nbsp;or&amp;nbsp;&lt;STRONG&gt;Application Insights&lt;/STRONG&gt;&amp;nbsp;under Additional connectors.&lt;/P&gt;
&lt;P data-line="33"&gt;&lt;STRONG&gt;Through Builder:&lt;/STRONG&gt;&amp;nbsp;Go to&amp;nbsp;&lt;STRONG&gt;Builder &amp;gt; Connectors&lt;/STRONG&gt;&amp;nbsp;in the sidebar and add a&amp;nbsp;&lt;STRONG&gt;Log Analytics&lt;/STRONG&gt;&amp;nbsp;or&amp;nbsp;&lt;STRONG&gt;Application Insights&lt;/STRONG&gt;&amp;nbsp;connector.&lt;/P&gt;
&lt;P data-line="37"&gt;Pick your resource from the dropdown and save. If discovery doesn't find your resource, both connector types have a manual entry fallback.&lt;/P&gt;
&lt;P data-line="41"&gt;On save, we grant the agent's managed identity&amp;nbsp;&lt;STRONG&gt;Log Analytics Reader&lt;/STRONG&gt;&amp;nbsp;and&amp;nbsp;&lt;STRONG&gt;Monitoring Reader&lt;/STRONG&gt;&amp;nbsp;on the target resource group. If your account can't assign roles, you can grant them separately.&lt;/P&gt;
&lt;H2 data-line="43"&gt;Backed by Azure MCP&lt;/H2&gt;
&lt;P data-line="45"&gt;Under the hood, this uses the&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/azure/developer/azure-mcp-server/" target="_blank" rel="noopener" data-href="https://learn.microsoft.com/en-us/azure/developer/azure-mcp-server/"&gt;Azure MCP Server&lt;/A&gt;&amp;nbsp;with the&amp;nbsp;monitor&amp;nbsp;namespace. When you save your first connector, we spin up an MCP server instance automatically. The agent gets access to tools like:&lt;/P&gt;
&lt;UL data-line="47"&gt;
&lt;LI data-line="47"&gt;&lt;STRONG&gt;monitor_workspace_log_query&amp;nbsp;&lt;/STRONG&gt;- KQL against a workspace&lt;/LI&gt;
&lt;LI data-line="48"&gt;&lt;STRONG&gt;monitor_resource_log_query&lt;/STRONG&gt;&amp;nbsp;- KQL against a specific resource&lt;/LI&gt;
&lt;LI data-line="49"&gt;&lt;STRONG&gt;monitor_workspace_list&lt;/STRONG&gt;&amp;nbsp;- discover workspaces&lt;/LI&gt;
&lt;LI data-line="50"&gt;&lt;STRONG&gt;monitor_table_list&lt;/STRONG&gt;&amp;nbsp;- list tables in a workspace&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-line="52"&gt;Everything is read-only. The agent can query but never modify your monitoring configuration.&lt;/P&gt;
&lt;P data-line="54"&gt;If different connectors use different managed identities, the system handles per-call identity routing automatically.&lt;/P&gt;
&lt;H2 data-line="56"&gt;What It Looks Like&lt;/H2&gt;
&lt;P data-line="58"&gt;An alert fires on your AKS cluster. The agent starts investigating and queries your connected workspace:&lt;/P&gt;
&lt;LI-CODE lang="kusto"&gt;ContainerLog
| where TimeGenerated &amp;gt; ago(30m)
| where LogEntry contains "error" or LogEntry contains "exception"
| summarize count() by ContainerID, LogEntry | top 10 by count_

KubeEvents
| where TimeGenerated &amp;gt; ago(1h)
| where Reason in ("BackOff", "Failed", "Unhealthy") | summarize count() by Reason, Name, Namespace
| order by count_ desc&lt;/LI-CODE&gt;
&lt;P data-line="78"&gt;The agent also ships with built-in skills for common Log Analytics and App Insights query patterns, so it knows which tables to look at and how to structure queries for typical failure scenarios.&lt;/P&gt;
&lt;H2 data-line="80"&gt;Things to Know&lt;/H2&gt;
&lt;UL data-line="82"&gt;
&lt;LI data-line="83"&gt;&lt;STRONG&gt;Read-only&lt;/STRONG&gt;&amp;nbsp;- the agent can query data but cannot modify alerts, retention, or workspace config&lt;/LI&gt;
&lt;LI data-line="84"&gt;&lt;STRONG&gt;Resource discovery needs Reader&lt;/STRONG&gt;&amp;nbsp;- the dropdown uses Azure Resource Graph. If your resources don't show up, use the manual entry fallback&lt;/LI&gt;
&lt;LI data-line="85"&gt;&lt;STRONG&gt;One identity per connector&lt;/STRONG&gt;&amp;nbsp;- if workspaces need different managed identities, create separate connectors&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 data-line="87"&gt;Learn More&lt;/H2&gt;
&lt;UL data-line="89"&gt;
&lt;LI data-line="89"&gt;&lt;A href="https://sre.azure.com/docs" target="_blank" rel="noopener" data-href="https://sre.azure.com/docs"&gt;Azure SRE Agent documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI data-line="90"&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/developer/azure-mcp-server/" target="_blank" rel="noopener" data-href="https://learn.microsoft.com/en-us/azure/developer/azure-mcp-server/"&gt;Azure MCP Server&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-line="92"&gt;We'd love feedback. Try it out and let us know what works and what doesn't.&lt;/P&gt;
&lt;P data-line="96"&gt;&lt;EM&gt;Azure SRE Agent is generally available. Learn more at&amp;nbsp;&lt;A href="https://sre.azure.com/docs" target="_blank" rel="noopener" data-href="https://sre.azure.com/docs"&gt;sre.azure.com/docs&lt;/A&gt;.&lt;/EM&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 17 Apr 2026 02:14:28 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/new-in-azure-sre-agent-log-analytics-and-application-insights/ba-p/4509649</guid>
      <dc:creator>Dalibor_Kovacevic</dc:creator>
      <dc:date>2026-04-17T02:14:28Z</dc:date>
    </item>
    <item>
      <title>Event-Driven IaC Operations with Azure SRE Agent: Terraform Drift Detection via HTTP Triggers</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/event-driven-iac-operations-with-azure-sre-agent-terraform-drift/ba-p/4512233</link>
      <description>&lt;H2 data-line="6"&gt;What Happens After&amp;nbsp;terraform plan&amp;nbsp;Finds Drift?&lt;/H2&gt;
&lt;P data-line="8"&gt;If your team is like most, the answer looks something like this:&lt;/P&gt;
&lt;OL data-line="10"&gt;
&lt;LI data-line="10"&gt;A nightly&amp;nbsp;terraform plan&amp;nbsp;runs and finds 3 drifted resources&lt;/LI&gt;
&lt;LI data-line="11"&gt;A notification lands in Slack or Teams&lt;/LI&gt;
&lt;LI data-line="12"&gt;Someone files a ticket&lt;/LI&gt;
&lt;LI data-line="13"&gt;During the next sprint, an engineer opens 4 browser tabs — Terraform state, Azure Portal, Activity Log, Application Insights — and spends 30 minutes piecing together&amp;nbsp;&lt;EM&gt;what happened&lt;/EM&gt;&lt;/LI&gt;
&lt;LI data-line="14"&gt;They discover the drift was caused by an on-call engineer who scaled up the App Service during a latency incident at 2 AM&lt;/LI&gt;
&lt;LI data-line="15"&gt;They revert the drift with&amp;nbsp;terraform apply&lt;/LI&gt;
&lt;LI data-line="16"&gt;The app goes down because they just scaled it&amp;nbsp;&lt;EM&gt;back down&lt;/EM&gt;&amp;nbsp;while the bug that caused the incident is still deployed&lt;/LI&gt;
&lt;/OL&gt;
&lt;P data-line="18"&gt;Step 7 is the one nobody talks about. Drift detection tooling has gotten remarkably good — scheduled plans, speculative runs, drift alerts — but the output is always the same:&amp;nbsp;&lt;STRONG&gt;a list of differences&lt;/STRONG&gt;. What changed. Not&amp;nbsp;&lt;EM&gt;why&lt;/EM&gt;. Not&amp;nbsp;&lt;EM&gt;whether it's safe to fix&lt;/EM&gt;.&lt;/P&gt;
&lt;P data-line="20"&gt;The gap isn't detection. It's everything that happens&amp;nbsp;&lt;EM&gt;after&lt;/EM&gt;&amp;nbsp;detection.&lt;/P&gt;
&lt;P data-line="22"&gt;&lt;STRONG&gt;HTTP Triggers in Azure SRE Agent close that gap.&lt;/STRONG&gt;&amp;nbsp;They turn the structured output that drift detection already produces — webhook payloads, plan summaries, run notifications — into the starting point of an autonomous investigation. Detection feeds the agent. The agent does the rest: correlates with incidents, reads source code, classifies severity, recommends context-aware remediation, notifies the team, and even ships a fix.&lt;/P&gt;
&lt;P data-line="24"&gt;Here's what that looks like end to end.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P data-line="26"&gt;&lt;STRONG&gt;What you'll see in this blog:&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;An agent that classifies drift as&amp;nbsp;&lt;STRONG&gt;Benign&lt;/STRONG&gt;,&amp;nbsp;&lt;STRONG&gt;Risky&lt;/STRONG&gt;, or&amp;nbsp;&lt;STRONG&gt;Critical&lt;/STRONG&gt;&amp;nbsp;— not just "changed"&lt;/LI&gt;
&lt;LI&gt;Incident correlation that links a SKU change to a latency spike in Application Insights&lt;/LI&gt;
&lt;LI&gt;A remediation recommendation that says&amp;nbsp;&lt;STRONG&gt;"Do NOT revert"&lt;/STRONG&gt;&amp;nbsp;— and why reverting would cause an outage&lt;/LI&gt;
&lt;LI&gt;A Teams notification with the full investigation summary&lt;/LI&gt;
&lt;LI&gt;An agent that reviews its own performance, finds gaps, and&amp;nbsp;&lt;STRONG&gt;improves its own skill file&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;A pull request the agent created on its own to fix the root cause&lt;/LI&gt;
&lt;/UL&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2 data-line="35"&gt;The Pipeline: Detection to Resolution in One Webhook&lt;/H2&gt;
&lt;P data-line="37"&gt;The architecture is straightforward. Terraform Cloud (or any drift detection tool) sends a webhook when it finds drift. An Azure Logic App adds authentication. The SRE Agent's HTTP Trigger receives it and starts an autonomous investigation.&lt;/P&gt;
&lt;img&gt;&lt;EM&gt;The end-to-end pipeline: Terraform Cloud detects drift and sends a webhook. The Logic App adds Azure AD authentication via Managed Identity. The SRE Agent's HTTP Trigger fires and the agent autonomously investigates across 7 dimensions.&lt;/EM&gt;&lt;/img&gt;
&lt;H2 data-line="46"&gt;Setting Up the Pipeline&lt;/H2&gt;
&lt;H3 data-line="48"&gt;Step 1: Deploy the Infrastructure with Terraform&lt;/H3&gt;
&lt;P data-line="50"&gt;We start with a simple Azure App Service running a Node.js application, deployed via Terraform. The Terraform configuration defines the desired state:&lt;/P&gt;
&lt;UL data-line="52"&gt;
&lt;LI data-line="52"&gt;&lt;STRONG&gt;App Service Plan&lt;/STRONG&gt;: B1 (Basic) — single vCPU, ~$13/mo&lt;/LI&gt;
&lt;LI data-line="53"&gt;&lt;STRONG&gt;App Service&lt;/STRONG&gt;: Node 20-lts with TLS 1.2&lt;/LI&gt;
&lt;LI data-line="54"&gt;&lt;STRONG&gt;Tags&lt;/STRONG&gt;:&amp;nbsp;environment: demo,&amp;nbsp;managed_by: terraform,&amp;nbsp;project: sre-agent-iac-blog&lt;/LI&gt;
&lt;/UL&gt;
&lt;LI-CODE lang=""&gt;resource "azurerm_service_plan" "demo" {
  name                = "iacdemo-plan"
  resource_group_name = azurerm_resource_group.demo.name
  location            = azurerm_resource_group.demo.location
  os_type             = "Linux"
  sku_name            = "B1"
}&lt;/LI-CODE&gt;
&lt;P data-line="66"&gt;A Logic App is also deployed to act as the authentication bridge between Terraform Cloud webhooks and the SRE Agent's HTTP Trigger endpoint, using Managed Identity to acquire Azure AD tokens. Learn more about HTTP Triggers &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/http-triggers-in-azure-sre-agent-from-jira-ticket-to-automated-investigation/4504960?previewMessage=true" target="_blank" rel="noopener" data-lia-auto-title="here" data-lia-auto-title-active="0"&gt;here&lt;/A&gt;.&lt;/P&gt;
&lt;H3 data-line="68"&gt;Step 2: Create the Drift Analysis Skill&lt;/H3&gt;
&lt;P data-line="70"&gt;Skills are domain knowledge files that teach the agent&amp;nbsp;&lt;EM&gt;how&lt;/EM&gt;&amp;nbsp;to approach a problem. We create a&amp;nbsp;terraform-drift-analysis&amp;nbsp;skill with an 8-step workflow:&lt;/P&gt;
&lt;OL data-line="72"&gt;
&lt;LI data-line="72"&gt;&lt;STRONG&gt;Identify Scope&lt;/STRONG&gt;&amp;nbsp;— Which resource group and resources to check&lt;/LI&gt;
&lt;LI data-line="73"&gt;&lt;STRONG&gt;Detect Drift&lt;/STRONG&gt;&amp;nbsp;— Compare Terraform config against Azure reality&lt;/LI&gt;
&lt;LI data-line="74"&gt;&lt;STRONG&gt;Correlate with Incidents&lt;/STRONG&gt;&amp;nbsp;— Check Activity Log and App Insights&lt;/LI&gt;
&lt;LI data-line="75"&gt;&lt;STRONG&gt;Classify Severity&lt;/STRONG&gt;&amp;nbsp;— Benign, Risky, or Critical&lt;/LI&gt;
&lt;LI data-line="76"&gt;&lt;STRONG&gt;Investigate Root Cause&lt;/STRONG&gt;&amp;nbsp;— Read source code from the connected repository&lt;/LI&gt;
&lt;LI data-line="77"&gt;&lt;STRONG&gt;Generate Drift Report&lt;/STRONG&gt;&amp;nbsp;— Structured summary with severity-coded table&lt;/LI&gt;
&lt;LI data-line="78"&gt;&lt;STRONG&gt;Recommend Smart Remediation&lt;/STRONG&gt;&amp;nbsp;— Context-aware: don't blindly revert&lt;/LI&gt;
&lt;LI data-line="79"&gt;&lt;STRONG&gt;Notify Team&lt;/STRONG&gt;&amp;nbsp;— Post findings to Microsoft Teams&lt;/LI&gt;
&lt;/OL&gt;
&lt;P data-line="81"&gt;The key insight in the skill:&amp;nbsp;&lt;STRONG&gt;"NEVER revert critical drift that is actively mitigating an incident."&lt;/STRONG&gt;&amp;nbsp;This teaches the agent to think like an experienced SRE, not just a diff tool.&lt;/P&gt;
&lt;H3 data-line="83"&gt;Step 3: Create the HTTP Trigger&lt;/H3&gt;
&lt;P data-line="85"&gt;In the SRE Agent UI, we create an HTTP Trigger named tfc-drift-handler with a 7-step agent prompt:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;A Terraform Cloud run has completed and detected infrastructure drift.

Workspace: {payload.workspace_name}
Organization: {payload.organization_name}
Run ID: {payload.run_id}
Run Message: {payload.run_message}

STEP 1 — DETECT DRIFT: Compare Terraform configuration against actual Azure state...
STEP 2 — CORRELATE WITH INCIDENTS: Check Azure Activity Log and App Insights...
STEP 3 — CLASSIFY SEVERITY: Rate each drift item as Benign, Risky, or Critical...
STEP 4 — INVESTIGATE ROOT CAUSE: Read the application source code...
STEP 5 — GENERATE DRIFT REPORT: Produce a structured summary...
STEP 6 — RECOMMEND SMART REMEDIATION: Context-aware recommendations...
STEP 7 — NOTIFY TEAM: Post a summary to Microsoft Teams...&lt;/LI-CODE&gt;&lt;img&gt;&lt;EM&gt;The HTTP Trigger dashboard showing tfc-drift-handler active with 3 completed runs.&lt;/EM&gt;&lt;/img&gt;
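&lt;P&gt;The {payload.*} placeholders in the prompt above are filled from the incoming webhook body. A small sketch of that extraction (the field names are taken from the placeholders above; the payload shape here is illustrative, and a real Terraform Cloud notification payload carries additional fields):&lt;/P&gt;

```javascript
// Sketch: pull out the fields that the trigger prompt interpolates
// from the incoming webhook body. Field names mirror the {payload.*}
// placeholders above; the shape is illustrative, not the full
// Terraform Cloud notification schema.
function extractPromptFields(payload) {
  const required = ['workspace_name', 'organization_name', 'run_id', 'run_message'];
  const missing = required.filter(function (k) { return payload[k] === undefined; });
  if (missing.length !== 0) {
    throw new Error('webhook payload missing fields: ' + missing.join(', '));
  }
  return {
    workspace: payload.workspace_name,
    organization: payload.organization_name,
    runId: payload.run_id,
    runMessage: payload.run_message,
  };
}
```

Validating the body up front like this means a malformed or unexpected webhook fails loudly at the bridge instead of producing a half-formed investigation prompt.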
&lt;H3 data-line="107"&gt;Step 4: Connect GitHub and Teams&lt;/H3&gt;
&lt;P data-line="109"&gt;We connect two integrations in the SRE Agent Connectors settings:&lt;/P&gt;
&lt;UL data-line="111"&gt;
&lt;LI data-line="111"&gt;&lt;STRONG&gt;Code Repository&lt;/STRONG&gt;: GitHub — so the agent can read application source code during investigations&lt;/LI&gt;
&lt;LI data-line="112"&gt;&lt;STRONG&gt;Notification&lt;/STRONG&gt;: Microsoft Teams — so the agent can post drift reports to the team channel&lt;/LI&gt;
&lt;/UL&gt;
&lt;img&gt;&lt;EM&gt;Both connectors show "Connected" status — the agent can read source code and notify the team.&lt;/EM&gt;&lt;/img&gt;
&lt;H2 data-line="119"&gt;The Incident Story&lt;/H2&gt;
&lt;H3 data-line="121"&gt;Act 1: The Latency Bug&lt;/H3&gt;
&lt;P data-line="123"&gt;Our demo app has a subtle but devastating bug. The /api/data endpoint calls processLargeDatasetSync() — a function that sorts an array on every iteration, creating an O(n² log n) blocking operation.&lt;/P&gt;
&lt;P data-line="137"&gt;On a B1 App Service Plan (single vCPU), this blocks the Node.js event loop entirely. Under load, response times spike from milliseconds to&amp;nbsp;&lt;STRONG&gt;25-58 seconds&lt;/STRONG&gt;, with 502 Bad Gateway errors from the Azure load balancer.&lt;/P&gt;
&lt;H3 data-line="139"&gt;Act 2: The On-Call Response&lt;/H3&gt;
&lt;P data-line="141"&gt;An on-call engineer sees the latency alerts and responds — not through Terraform, but directly through the Azure Portal and CLI. They:&lt;/P&gt;
&lt;OL data-line="143"&gt;
&lt;LI data-line="143"&gt;&lt;STRONG&gt;Add diagnostic tags&lt;/STRONG&gt;&amp;nbsp;—&amp;nbsp;manual_update=True,&amp;nbsp;changed_by=portal_user&amp;nbsp;(benign)&lt;/LI&gt;
&lt;LI data-line="144"&gt;&lt;STRONG&gt;Downgrade TLS&lt;/STRONG&gt;&amp;nbsp;from 1.2 to 1.0 while troubleshooting (risky — security regression)&lt;/LI&gt;
&lt;LI data-line="145"&gt;&lt;STRONG&gt;Scale the App Service Plan&lt;/STRONG&gt; from B1 to S1 to throw more compute at the problem (critical — cost increase from ~$13/mo to ~$73/mo)&lt;/LI&gt;
&lt;/OL&gt;
&lt;img&gt;&lt;EM&gt;The Azure Portal tells the story: a TLS security warning banner across the top, unauthorized manual_update and changed_by tags, and the App Service Plan upgraded to S1. Three types of drift, all from a single incident response.&lt;/EM&gt;&lt;/img&gt;
&lt;P data-line="150"&gt;The incident is partially mitigated — S1 has more compute, so latency drops from catastrophic to merely bad. Everyone goes back to sleep. Nobody updates Terraform.&lt;/P&gt;
&lt;H3 data-line="152"&gt;Act 3: The Drift Check Fires&lt;/H3&gt;
&lt;P data-line="154"&gt;The next morning, a nightly speculative Terraform plan runs and detects 3 drifted attributes. The notification webhook fires, flowing through the Logic App auth bridge to the SRE Agent HTTP Trigger.&lt;/P&gt;
&lt;P data-line="156"&gt;The agent wakes up and begins its investigation.&lt;/P&gt;
&lt;H2 data-line="160"&gt;What the Agent Found&lt;/H2&gt;
&lt;H3 data-line="162"&gt;Layer 1: Drift Detection&lt;/H3&gt;
&lt;P data-line="164"&gt;The agent compares Terraform configuration against Azure reality and produces a severity-classified drift report:&lt;/P&gt;
&lt;img&gt;&lt;EM&gt;The agent's drift report — organized by severity. The "Incident Correlation" column (partially visible) is what makes this more than a terraform plan wrapper.&lt;/EM&gt;&lt;/img&gt;
&lt;P data-line="169"&gt;Three drift items detected:&lt;/P&gt;
&lt;UL data-line="170"&gt;
&lt;LI data-line="170"&gt;&lt;STRONG&gt;Critical&lt;/STRONG&gt;: App Service Plan SKU changed from B1 (~$13/mo) to S1 (~$73/mo) — a +462% cost increase&lt;/LI&gt;
&lt;LI data-line="171"&gt;&lt;STRONG&gt;Risky&lt;/STRONG&gt;: Minimum TLS version downgraded from 1.2 to 1.0 — a security regression vulnerable to BEAST and POODLE attacks&lt;/LI&gt;
&lt;LI data-line="172"&gt;&lt;STRONG&gt;Benign&lt;/STRONG&gt;: Additional tags (changed_by: portal_user,&amp;nbsp;manual_update: True) — cosmetic, no functional impact&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3 data-line="174"&gt;Layer 2: Incident Correlation&lt;/H3&gt;
&lt;P data-line="176"&gt;Here's where the agent goes beyond simple drift detection. It queries Application Insights and discovers a&amp;nbsp;&lt;STRONG&gt;performance incident&lt;/STRONG&gt; correlated with the SKU change:&lt;/P&gt;
&lt;img&gt;&lt;EM&gt;The agent found that GET /api/data is averaging 25,919ms with a P95 of 57,697ms — affecting 97.6% of all requests. It also discovered that the /api/data endpoint exists in production but not in the repository source code.&lt;/EM&gt;&lt;/img&gt;
&lt;P data-line="181"&gt;Key findings from the incident correlation:&lt;/P&gt;
&lt;UL data-line="182"&gt;
&lt;LI data-line="182"&gt;&lt;STRONG&gt;97.6% of requests&lt;/STRONG&gt;&amp;nbsp;(40 of 41) were impacted by high latency&lt;/LI&gt;
&lt;LI data-line="183"&gt;The&amp;nbsp;/api/data&amp;nbsp;endpoint&amp;nbsp;&lt;STRONG&gt;does not exist in the repository source code&lt;/STRONG&gt;&amp;nbsp;— the deployed application has diverged from the codebase&lt;/LI&gt;
&lt;LI data-line="184"&gt;The endpoint likely contains a&amp;nbsp;&lt;STRONG&gt;blocking synchronous pattern&lt;/STRONG&gt;&amp;nbsp;— Node.js runs on a single event loop, and any synchronous blocking call would explain 26-58s response times&lt;/LI&gt;
&lt;LI data-line="185"&gt;The SKU scale-up from B1→S1 was an&amp;nbsp;&lt;STRONG&gt;attempt to mitigate latency&lt;/STRONG&gt;&amp;nbsp;by adding more compute, but scaling cannot fix application-level blocking code on a single-threaded Node.js server&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3 data-line="187"&gt;Layer 3: Smart Remediation&lt;/H3&gt;
&lt;P data-line="189"&gt;This is the insight that separates an autonomous agent from a reporting tool. Instead of blindly recommending "revert all drift," the agent produces&amp;nbsp;&lt;STRONG&gt;context-aware remediation recommendations&lt;/STRONG&gt;:&lt;/P&gt;
&lt;img&gt;&lt;EM&gt;Three different recommendations based on context: safe to revert (tags), revert immediately for security (TLS), and critically — do NOT revert the SKU until the code is fixed.&lt;/EM&gt;&lt;/img&gt;
&lt;P data-line="194"&gt;The agent's remediation logic:&lt;/P&gt;
&lt;OL data-line="196"&gt;
&lt;LI data-line="196"&gt;&lt;STRONG&gt;Tags (Benign)&lt;/STRONG&gt;&amp;nbsp;→ Safe to revert anytime via&amp;nbsp;terraform apply -target&lt;/LI&gt;
&lt;LI data-line="197"&gt;&lt;STRONG&gt;TLS 1.0 (Risky)&lt;/STRONG&gt;&amp;nbsp;→&amp;nbsp;&lt;STRONG&gt;Revert immediately&lt;/STRONG&gt;&amp;nbsp;— the TLS downgrade is a security risk unrelated to the incident&lt;/LI&gt;
&lt;LI data-line="198"&gt;&lt;STRONG&gt;SKU S1 (Critical)&lt;/STRONG&gt;&amp;nbsp;→&amp;nbsp;&lt;STRONG&gt;DO NOT revert&lt;/STRONG&gt; until the&amp;nbsp;/api/data&amp;nbsp;performance root cause is fixed&lt;/LI&gt;
&lt;/OL&gt;
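&lt;P&gt;The ordering above can be sketched as a simple decision rule. The field names below are assumptions for illustration — the agent's actual internals aren't published — but the logic matches the three recommendations:&lt;/P&gt;

```javascript
// Sketch of context-aware remediation: severity alone does not decide the
// action -- drift that mitigates an active incident must not be reverted
// until the root cause is fixed.
function remediationFor(drift) {
  if (drift.securityRisk) return 'revert-immediately';
  if (drift.mitigatesActiveIncident) return 'hold-until-root-cause-fixed';
  if (drift.severity === 'benign') return 'revert-anytime';
  return 'review-manually';
}

console.log(remediationFor({ severity: 'benign' }));
// 'revert-anytime' (tags)
console.log(remediationFor({ severity: 'risky', securityRisk: true }));
// 'revert-immediately' (TLS 1.0)
console.log(remediationFor({ severity: 'critical', mitigatesActiveIncident: true }));
// 'hold-until-root-cause-fixed' (SKU S1)
```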
&lt;img&gt;&lt;EM&gt;&lt;STRONG&gt;Agent explaining "Do NOT revert the SKU from S1 back to B1 yet" with recommended action sequence and code fix. The agent explains why the SKU shouldn't be reverted: "Reverting to B1 while the /api/data blocking code is still deployed would worsen the performance incident." It then provides a 5-step action sequence and a suggested async code pattern.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/img&gt;
&lt;P data-line="203"&gt;This is the logic an experienced SRE would apply. Blindly running&amp;nbsp;terraform apply&amp;nbsp;to revert all drift would scale the app back down to B1 while the blocking code is still deployed — turning a mitigated incident into an active outage.&lt;/P&gt;
&lt;H3 data-line="205"&gt;Layer 4: Investigation Summary&lt;/H3&gt;
&lt;P data-line="207"&gt;The agent produces a complete summary tying everything together:&lt;/P&gt;
&lt;img&gt;&lt;EM&gt;&lt;STRONG&gt;Investigation summary showing drift table, key findings including actor and performance incident, remediation recommendations, and actions taken. The final summary includes: who made the changes (identified via the Activity Log), the performance incident details, the code-infrastructure mismatch finding, and three actions taken — Teams notification, skill improvement, and PR creation.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/img&gt;
&lt;P data-line="212"&gt;Key findings in the summary:&lt;/P&gt;
&lt;UL data-line="213"&gt;
&lt;LI data-line="213"&gt;&lt;STRONG&gt;Actor&lt;/STRONG&gt;:&amp;nbsp;surivineela@microsoft.com&amp;nbsp;made all changes via Azure Portal at ~23:19 UTC&lt;/LI&gt;
&lt;LI data-line="214"&gt;&lt;STRONG&gt;Performance incident&lt;/STRONG&gt;:&amp;nbsp;/api/data&amp;nbsp;averaging 25-57s latency, affecting 97.6% of requests&lt;/LI&gt;
&lt;LI data-line="215"&gt;&lt;STRONG&gt;Code-infrastructure mismatch&lt;/STRONG&gt;:&amp;nbsp;/api/data&amp;nbsp;exists in production but&amp;nbsp;&lt;STRONG&gt;not in the repository&lt;/STRONG&gt;&amp;nbsp;source code&lt;/LI&gt;
&lt;LI data-line="216"&gt;&lt;STRONG&gt;Root cause&lt;/STRONG&gt;: SKU scale-up was emergency incident response, not unauthorized drift&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3 data-line="218"&gt;Layer 5: Teams Notification&lt;/H3&gt;
&lt;P data-line="220"&gt;The agent posts a structured drift report to the team's Microsoft Teams channel:&lt;/P&gt;
&lt;img&gt;&lt;EM&gt;&lt;STRONG&gt;Teams channel showing "Terraform Drift Detected" notification with drift table, performance incident, root cause, and recommended actions. The Teams notification includes the severity-coded drift table, the performance incident context, the root cause explanation, and the 5-step recommended action sequence — all posted automatically, with a link back to the full SRE Agent investigation thread.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/img&gt;
&lt;P data-line="225"&gt;The on-call engineer opens Teams in the morning and sees everything they need: what drifted, why it drifted, and exactly what to do about it — without logging into any dashboard.&lt;/P&gt;
&lt;H2 data-line="229"&gt;The Payoff: A Self-Improving Agent&lt;/H2&gt;
&lt;P data-line="231"&gt;Here's where the demo surprised us. After completing the investigation, the agent did two things we didn't explicitly ask for.&lt;/P&gt;
&lt;H3 data-line="233"&gt;The Agent Improved Its Own Skill&lt;/H3&gt;
&lt;P data-line="235"&gt;The agent performed an&amp;nbsp;&lt;STRONG&gt;Execution Review&lt;/STRONG&gt; — analyzing what worked and what didn't during its investigation — and found 5 gaps in its own&amp;nbsp;terraform-drift-analysis.md&amp;nbsp;skill file:&lt;/P&gt;
&lt;img&gt;&lt;EM&gt;&lt;STRONG&gt;Execution Review showing what worked well, 5 gaps found in the skill, and the agent editing terraform-drift-analysis.md. The agent identified gaps including "No incident correlation guidance," "No smart remediation logic," and "No Activity Log integration" — then updated its own skill file with these learnings for next time.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/img&gt;
&lt;P data-line="240"&gt;What worked well:&lt;/P&gt;
&lt;UL data-line="241"&gt;
&lt;LI data-line="241"&gt;Drift detection via az CLI comparison against Terraform HCL was straightforward&lt;/LI&gt;
&lt;LI data-line="242"&gt;Activity Log correlation identified the actor and timing&lt;/LI&gt;
&lt;LI data-line="243"&gt;Application Insights telemetry revealed the performance incident driving the SKU change&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-line="245"&gt;Gaps it found and fixed:&lt;/P&gt;
&lt;OL data-line="246"&gt;
&lt;LI data-line="246"&gt;&lt;STRONG&gt;No incident correlation guidance&lt;/STRONG&gt;&amp;nbsp;— the skill didn't instruct checking App Insights&lt;/LI&gt;
&lt;LI data-line="247"&gt;&lt;STRONG&gt;No code-infrastructure mismatch detection&lt;/STRONG&gt;&amp;nbsp;— no guidance to verify deployed code matches the repository&lt;/LI&gt;
&lt;LI data-line="248"&gt;&lt;STRONG&gt;No smart remediation logic&lt;/STRONG&gt;&amp;nbsp;— didn't warn against reverting critical drift during active incidents&lt;/LI&gt;
&lt;LI data-line="249"&gt;&lt;STRONG&gt;Report template missing incident correlation column&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI data-line="250"&gt;&lt;STRONG&gt;No Activity Log integration guidance&lt;/STRONG&gt;&amp;nbsp;— didn't instruct checking who made changes and when&lt;/LI&gt;
&lt;/OL&gt;
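&lt;P&gt;For a sense of what such a skill update might look like, here is a hypothetical excerpt — the actual contents of &lt;EM&gt;terraform-drift-analysis.md&lt;/EM&gt; are in the linked repository; this is purely illustrative of the gaps listed above:&lt;/P&gt;

```markdown
## Incident correlation (added after execution review)

Before recommending remediation for any drifted resource:

1. Query Application Insights for latency/error anomalies in the drift window.
2. Check the Azure Activity Log for who made each change, and when.
3. Verify that deployed endpoints exist in the repository source code.
4. If a drifted value mitigates an active incident, mark it
   "DO NOT revert until the root cause is fixed."
```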
&lt;P data-line="252"&gt;The agent then&amp;nbsp;&lt;STRONG&gt;edited its own skill file&lt;/STRONG&gt;&amp;nbsp;to incorporate these learnings. Next time it runs a drift analysis, it will include incident correlation, code-infra mismatch checks, and smart remediation logic&amp;nbsp;&lt;EM&gt;by default&lt;/EM&gt;.&lt;/P&gt;
&lt;P data-line="254"&gt;This is a&amp;nbsp;&lt;STRONG&gt;learning loop&lt;/STRONG&gt;&amp;nbsp;— every investigation makes the agent better at future investigations.&lt;/P&gt;
&lt;H3 data-line="256"&gt;The Agent Created a PR&lt;/H3&gt;
&lt;P data-line="258"&gt;Without being asked, the agent identified the root cause code issue and&amp;nbsp;&lt;STRONG&gt;proactively created a pull request&lt;/STRONG&gt; to fix it:&lt;/P&gt;
&lt;img&gt;&lt;EM&gt;&lt;STRONG&gt;GitHub PR #1 "Improve terraform-drift-analysis skill with incident correlation and smart remediation" showing code changes to server.js. PR #1 — the agent modified both server.js (adding safety constants and capping delay values) and terraform-drift-analysis.md (incorporating the learnings from the investigation). Two commits, two files changed, +103/-10 lines.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/img&gt;
&lt;P data-line="263"&gt;The PR includes:&lt;/P&gt;
&lt;UL data-line="264"&gt;
&lt;LI data-line="264"&gt;&lt;STRONG&gt;App safety fixes&lt;/STRONG&gt;: Adding&amp;nbsp;MAX_DELAY_MS&amp;nbsp;and&amp;nbsp;SERVER_TIMEOUT_MS&amp;nbsp;constants to prevent unbounded latency&lt;/LI&gt;
&lt;LI data-line="265"&gt;&lt;STRONG&gt;Skill improvements&lt;/STRONG&gt;: Incorporating incident correlation, code-infra mismatch detection, and smart remediation logic&lt;/LI&gt;
&lt;/UL&gt;
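&lt;P&gt;The safety-fix idea is straightforward to sketch. The constant name comes from the PR description; the surrounding code and the ceiling value are assumptions:&lt;/P&gt;

```javascript
// Sketch of the latency cap described in the PR: clamp any requested
// delay so a single parameter can no longer produce 26-58s responses.
const MAX_DELAY_MS = 5000; // assumed ceiling for illustration

function cappedDelay(requestedMs) {
  // Math.min enforces the upper bound regardless of the caller's input
  return Math.min(Number(requestedMs) || 0, MAX_DELAY_MS);
}

console.log(cappedDelay(58000)); // 5000 -- the pathological case is bounded
console.log(cappedDelay(120));   // 120  -- normal values pass through
```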
&lt;BLOCKQUOTE&gt;
&lt;P data-line="267"&gt;From a single webhook: drift detected → incident correlated → root cause found → team notified → skill improved → fix shipped.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2 data-line="310"&gt;Key Takeaways&lt;/H2&gt;
&lt;OL data-line="312"&gt;
&lt;LI data-line="312"&gt;&lt;STRONG&gt;Drift detection is not enough.&lt;/STRONG&gt;&amp;nbsp;Knowing that B1 changed to S1 is table stakes. Knowing it changed&amp;nbsp;&lt;EM&gt;because&lt;/EM&gt;&amp;nbsp;of a latency incident, and that reverting it would&amp;nbsp;&lt;EM&gt;cause an outage&lt;/EM&gt;&amp;nbsp;— that's the insight that matters.&lt;/LI&gt;
&lt;LI data-line="314"&gt;&lt;STRONG&gt;Context-aware remediation prevents outages.&lt;/STRONG&gt;&amp;nbsp;Blindly running&amp;nbsp;terraform apply&amp;nbsp;after drift would have scaled the app back to B1 while blocking code was still deployed. The agent's "DO NOT revert SKU" recommendation is the difference between fixing drift and causing a P1.&lt;/LI&gt;
&lt;LI data-line="316"&gt;&lt;STRONG&gt;Skills create a learning loop.&lt;/STRONG&gt;&amp;nbsp;The agent's self-review and skill improvement means every investigation makes the next one better — without human intervention.&lt;/LI&gt;
&lt;LI data-line="318"&gt;&lt;STRONG&gt;HTTP Triggers connect any platform.&lt;/STRONG&gt;&amp;nbsp;The auth bridge pattern (Logic App + Managed Identity) works for Terraform Cloud, but the same architecture applies to any webhook source: GitHub Actions, Jenkins, Datadog, PagerDuty, custom internal tools.&lt;/LI&gt;
&lt;LI data-line="320"&gt;&lt;STRONG&gt;The agent acts, not just reports.&lt;/STRONG&gt; From a single webhook: drift detected, incident correlated, root cause identified, team notified via Teams, skill improved, and PR created. End-to-end in one autonomous session.&lt;/LI&gt;
&lt;/OL&gt;
&lt;H2 data-line="324"&gt;Getting Started&lt;/H2&gt;
&lt;P data-line="326"&gt;HTTP Triggers are available now in Azure SRE Agent:&lt;/P&gt;
&lt;OL data-line="328"&gt;
&lt;LI data-line="328"&gt;&lt;STRONG&gt;Create a Skill&lt;/STRONG&gt;&amp;nbsp;— Teach the agent your operational runbook (in this case, drift analysis with severity classification and smart remediation)&lt;/LI&gt;
&lt;LI data-line="329"&gt;&lt;STRONG&gt;Create an HTTP Trigger&lt;/STRONG&gt;&amp;nbsp;— Define your agent prompt with&amp;nbsp;{payload.X}&amp;nbsp;placeholders and connect it to a skill&lt;/LI&gt;
&lt;LI data-line="330"&gt;&lt;STRONG&gt;Set Up an Auth Bridge&lt;/STRONG&gt;&amp;nbsp;— Deploy a Logic App with Managed Identity to handle Azure AD token acquisition&lt;/LI&gt;
&lt;LI data-line="331"&gt;&lt;STRONG&gt;Connect Your Source&lt;/STRONG&gt;&amp;nbsp;— Point Terraform Cloud (or any webhook-capable platform) at the Logic App URL&lt;/LI&gt;
&lt;LI data-line="332"&gt;&lt;STRONG&gt;Connect GitHub + Teams&lt;/STRONG&gt;&amp;nbsp;— Give the agent access to source code and team notifications&lt;/LI&gt;
&lt;/OL&gt;
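&lt;P&gt;To illustrate step 2, here is one way the &lt;EM&gt;{payload.X}&lt;/EM&gt; substitution could work. This is a sketch — the agent performs the substitution server-side, and the payload field names shown are assumptions:&lt;/P&gt;

```javascript
// Sketch: fill {payload.X} placeholders in a prompt template from a
// webhook body. Unknown placeholders are left untouched.
function fillPrompt(template, payload) {
  return template.replace(/\{payload\.([\w.]+)\}/g, (m, key) =>
    key.split('.').reduce((o, k) => (o == null ? undefined : o[k]), payload) ?? m
  );
}

const prompt = fillPrompt(
  'Drift detected in workspace {payload.workspace}: run {payload.run_id}',
  { workspace: 'prod-app', run_id: 'run-abc123' } // hypothetical webhook fields
);
console.log(prompt);
// "Drift detected in workspace prod-app: run run-abc123"
```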
&lt;P data-line="334"&gt;Within minutes, you'll have an autonomous pipeline that turns infrastructure drift events into fully contextualized investigations — with incident correlation, root cause analysis, and smart remediation recommendations.&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;The full implementation guide, Terraform files, skill definitions, and demo scripts are available in &lt;A class="lia-external-url" href="https://github.com/microsoft/sre-agent/tree/main/samples/terraform-drift-detection" target="_blank"&gt;this&amp;nbsp;&lt;/A&gt;repository.&lt;/EM&gt;&lt;/P&gt;
&lt;P data-line="267"&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 17 Apr 2026 02:13:18 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/event-driven-iac-operations-with-azure-sre-agent-terraform-drift/ba-p/4512233</guid>
      <dc:creator>Vineela-Suri</dc:creator>
      <dc:date>2026-04-17T02:13:18Z</dc:date>
    </item>
    <item>
      <title>Explaining what GitHub Copilot Modernization can (and cannot do)</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/explaining-what-github-copilot-modernization-can-and-cannot-do/ba-p/4511739</link>
      <description>&lt;P&gt;&lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/what-ai-agents-for-modernization-look-like-in-practice/4506366" target="_blank" rel="noopener" data-lia-auto-title="In the last post" data-lia-auto-title-active="0"&gt;In the last post&lt;/A&gt;, we looked at the workflow: assess, plan, execute. You get reports you can review and the agent makes changes you can inspect.&lt;/P&gt;
&lt;P&gt;If you don’t know, &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/dotnet/core/porting/github-copilot-app-modernization/overview" target="_blank" rel="noopener"&gt;GitHub Copilot Modernization&lt;/A&gt; is the new agentic tool that supports you in modernizing older applications. Could it help with that old .NET Framework 4.8 app, or even that forgotten VB.NET script?&lt;/P&gt;
&lt;P&gt;You're probably not modernizing one small app. More likely it's a handful of projects, each with its own stack of blockers: different frameworks, different databases, different dependencies frozen in time because nobody wants to touch them.&lt;/P&gt;
&lt;P&gt;GitHub Copilot modernization handles two big categories: upgrading .NET projects to newer versions and migrating .NET apps to Azure. &lt;SPAN data-teams="true"&gt;But what does that look like&lt;/SPAN&gt;?&lt;/P&gt;
&lt;img /&gt;
&lt;H3&gt;Upgrading .NET Projects&lt;/H3&gt;
&lt;P&gt;Let’s say you've got an ASP.NET app running on .NET Framework 4.8, or a web API stuck on .NET Core 3.1. Unfortunately, getting it to .NET 9 or 10 isn't just a matter of updating a target framework property.&lt;/P&gt;
&lt;P&gt;Here's what the upgrade workflow handles in &lt;STRONG&gt;Visual Studio&lt;/STRONG&gt;:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Assessment first.&lt;/STRONG&gt; The agent examines your project structure, dependencies, and code patterns. It generates an Assessment Report UI, which shows both the application information used to create the plan and the cloud readiness for Azure deployment.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&lt;STRONG&gt;Then planning.&lt;/STRONG&gt;&amp;nbsp;Once you approve the assessment, it moves to planning. Here you get upgrade strategies, refactoring approaches, dependency upgrade paths, and risk mitigations documented in a &lt;EM&gt;plan.md&lt;/EM&gt; file at &lt;EM&gt;.appmod/.migration&lt;/EM&gt;. You can review and edit that Markdown before moving forward, or ask in the Copilot Chat window to change it.&lt;/P&gt;
&lt;LI-CODE lang="markdown"&gt;# .NET 10.0 Upgrade Plan

## Execution Steps

Execute steps below sequentially one by one in the order they are listed.

1. Validate that a .NET 10.0 SDK required for this upgrade is installed on the machine and if not, help to get it installed.
2. Ensure that the SDK version specified in global.json files is compatible with the .NET 10.0 upgrade.
3. Upgrade src\eShopLite.StoreFx\eShopLite.StoreFx.csproj

## Settings

This section contains settings and data used by execution steps.

### Excluded projects

No projects are excluded from this upgrade.

### Aggregate NuGet packages modifications across all projects

NuGet packages used across all selected projects or their dependencies that need version update in projects that reference them&lt;/LI-CODE&gt;
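&lt;P&gt;Step 2 of the plan checks &lt;EM&gt;global.json&lt;/EM&gt; compatibility. For reference, a &lt;EM&gt;global.json&lt;/EM&gt; that pins the SDK looks like this (the version shown is illustrative):&lt;/P&gt;

```json
{
  "sdk": {
    "version": "10.0.100",
    "rollForward": "latestFeature"
  }
}
```

&lt;P&gt;If the pinned version can't resolve to the target SDK and &lt;EM&gt;rollForward&lt;/EM&gt; doesn't allow a newer one, the build fails, which is why the plan validates this before upgrading any projects.&lt;/P&gt;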
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&lt;STRONG&gt;Then execution.&lt;/STRONG&gt;&amp;nbsp;After you approve the plan, the agent breaks it into discrete tasks in a &lt;EM&gt;tasks.md&lt;/EM&gt; file. Each task gets validation criteria. As it works, it updates the file with checkboxes and completion percentages so you can track progress. It makes code changes, verifies builds, and runs tests. If it hits a problem, it tries to identify the cause and apply a fix.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;Go to the GitHub Copilot Chat window and type:&amp;nbsp;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;The plan and progress tracker look good to me. Go ahead with the migration.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;It usually creates Git commits for each portion so you can review what changed or roll back if you need to. If you don’t want those commits, you can ask the agent at the start not to commit anything.&lt;/P&gt;
&lt;P&gt;The agent primarily focuses on ASP.NET, ASP.NET Core, Blazor, Razor Pages, MVC, and Web API. It can also handle Azure Functions, WPF, Windows Forms, console apps, class libraries, and test projects.&lt;/P&gt;
&lt;H3&gt;What It Handles Well (and What It Doesn't)&lt;/H3&gt;
&lt;img /&gt;
&lt;P&gt;The agent is good at code-level transformations: updating&amp;nbsp;&lt;EM&gt;TargetFramework&lt;/EM&gt;&amp;nbsp;in&amp;nbsp;&lt;STRONG&gt;.csproj &lt;/STRONG&gt;files, upgrading NuGet packages, replacing deprecated APIs with their modern equivalents, fixing breaking changes like removed &lt;EM&gt;BinaryFormatter &lt;/EM&gt;methods, running builds, and validating test suites. It can handle repetitive work across multiple projects in a solution without you needing to track every dependency manually.&lt;/P&gt;
&lt;P&gt;It's also solid at applying predefined Azure migration patterns, swapping plaintext credentials for managed identity, replacing file I/O with Azure Blob Storage calls, moving authentication from on-prem Active Directory to Microsoft Entra ID. These are structured transformations with clear before-and-after code patterns.&lt;/P&gt;
&lt;P&gt;But here's where you may need to pay closer attention:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Language and framework coverage&lt;/STRONG&gt;: It works mainly with C# projects. If your codebase includes complex Entity Framework migrations that rely on hand-tuned database scripts, the agent won't rewrite those for you. It also won't handle third-party UI framework patterns that don't map cleanly to ASP.NET Core conventions, or patterns affected by breaking changes between .NET Framework and later .NET versions. Web Forms migration is underway.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Configuration and infrastructure: &lt;/STRONG&gt;The agent doesn't migrate IIS-specific&amp;nbsp;&lt;EM&gt;web.config&lt;/EM&gt; settings that don't have direct equivalents in Kestrel or ASP.NET Core. It won't automatically set up a CI/CD pipeline or other modernization extras; you'll need to implement those yourself, with Copilot’s help. If you've got frontend frameworks bundled with ASP.NET (like an older Angular app served through MVC), you'll need to separate and upgrade that layer yourself.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Learning and memory: &lt;/STRONG&gt;The agent uses your code as context during the session, and if you correct a fix or update the plan, it tries to apply that learning within the same session. But those corrections don't persist across future upgrades. You can encode internal standards using custom skills, but that requires deliberate setup.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Offline and deployment:&lt;/STRONG&gt; There's no offline mode. The agent needs connectivity to run. And while it can help prepare your app for Azure deployment, it doesn't manage the actual infrastructure provisioning or ongoing operations; that's still on you.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Guarantees&lt;/STRONG&gt;: The suggestions aren't guaranteed to follow best practices. The agent won't always pick the best migration path. It won't catch every edge case. You're reviewing the work; pay attention to the results before putting it into production.&lt;/P&gt;
&lt;P&gt;What it does handle: the tedious parts. Reading dependency graphs. Finding all the places a deprecated API is used. Updating project files. Writing boilerplate for managed identity. Fixing compilation errors that follow a predictable pattern.&lt;/P&gt;
&lt;H3&gt;Where to Start&lt;/H3&gt;
&lt;P&gt;If you've been staring at a modernization backlog, pick one project. See what it comes up with! You don't have to commit to upgrading your entire portfolio. Try it on one project and see if it saves you time. Modernization at scale still happens application by application, repo by repo, and decision by decision. &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/dotnet/core/porting/github-copilot-app-modernization/overview" target="_blank" rel="noopener"&gt;GitHub Copilot modernization&lt;/A&gt; just makes each one a little less painful. Experiment with it!&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 16 Apr 2026 21:45:41 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/explaining-what-github-copilot-modernization-can-and-cannot-do/ba-p/4511739</guid>
      <dc:creator>PabloLopes</dc:creator>
      <dc:date>2026-04-16T21:45:41Z</dc:date>
    </item>
    <item>
      <title>Managing Multi‑Tenant Azure Resource with SRE Agent and Lighthouse</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/managing-multi-tenant-azure-resource-with-sre-agent-and/ba-p/4511789</link>
      <description>&lt;P&gt;&lt;A href="https://azure.microsoft.com/en-us/products/sre-agent" target="_blank"&gt;Azure SRE Agent&lt;/A&gt; is an AI‑powered reliability assistant that helps teams diagnose and resolve production issues faster while reducing operational toil. It analyzes logs, metrics, &lt;SPAN class="lia-text-color-21"&gt;alerts,&lt;/SPAN&gt; and deployment data to perform root cause analysis and recommend or execute mitigations with human approval. It’s capable of integrating with azure services across subscriptions and resource groups that you need to monitor and manage. Today’s enterprise customers live in a multi-tenant world, and there are multiple reasons to that due to acquisitions, complex corporate structures, managed service providers, or IT partners. Azure &lt;A href="https://learn.microsoft.com/en-us/azure/lighthouse/overview" target="_blank"&gt;Lighthouse&lt;/A&gt; enables enterprise IT teams and managed service providers to manage resources across multiple azure tenants from a single control plane.&lt;/P&gt;
&lt;P&gt;In this demo, I will walk you through how to set up the Azure SRE Agent to manage and monitor multi-tenant resources delegated through Azure Lighthouse.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Navigate to the Azure SRE agent and select&amp;nbsp;&lt;STRONG&gt;Create agent&lt;/STRONG&gt;. Fill in the required details along with the deployment region and deploy the SRE agent.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Once the deployment is complete, hit&amp;nbsp;&lt;STRONG&gt;Set up your agent&lt;/STRONG&gt;. Select the&amp;nbsp;&lt;STRONG&gt;Azure resources&lt;/STRONG&gt;&amp;nbsp;you would like your agent to analyze, such as resource groups or subscriptions.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;This opens a popup window where you can select the subscriptions and resource groups, under the same tenant, that you would like the SRE agent to monitor and manage. So far so good 👍&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As a Managed Service Provider (MSP), you may manage multiple tenants via Azure Lighthouse, and you need the SRE agent to have access to those.&lt;/P&gt;
&lt;P&gt;To demo this, we need to set up Azure Lighthouse with the correct set of roles and configuration to delegate access to the management subscription where the centralized SRE agent is running.&lt;/P&gt;
&lt;P&gt;From the Azure portal, search for Lighthouse. Navigate to the Lighthouse home page and select&amp;nbsp;&lt;STRONG&gt;Manage your customers&lt;/STRONG&gt;. On the My customers overview, select&amp;nbsp;&lt;STRONG&gt;Create ARM Template&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Provide a name and description, and select the subscriptions to delegate. Select&amp;nbsp;&lt;STRONG&gt;+ Add authorization&lt;/STRONG&gt;, which takes you to the Add authorization window. Select the principal type; I am selecting User for demo purposes. The pop-up window lets you &lt;STRONG&gt;Select users&lt;/STRONG&gt; from the list.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Select the checkbox next to the user to whom you want to delegate the subscription and hit &lt;STRONG&gt;Select&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;Then select the&amp;nbsp;&lt;STRONG&gt;Role&lt;/STRONG&gt;&amp;nbsp;you would like to assign the user from the managing tenant in the delegated tenant, and select&amp;nbsp;&lt;STRONG&gt;Add&lt;/STRONG&gt;. You can add multiple roles by adding additional authorizations for the selected user. This step is important: the delegation must carry the right roles for the SRE agent to add the subscription as an Azure source.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Azure SRE Agent requires an Owner or User Access Administrator RBAC role to add the subscription to its list of managed resources. If an appropriate role is not assigned, you will see an error when selecting the delegated subscriptions under the SRE agent's managed resources.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Per Lighthouse role support, the Owner role isn’t supported, and the User Access Administrator role is supported only for limited purposes. Refer to the Azure Lighthouse &lt;A href="https://docs.azure.cn/en-us/lighthouse/concepts/tenants-users-roles#role-support-for-azure-lighthouse" target="_blank"&gt;documentation&lt;/A&gt;&amp;nbsp;for additional information. If the role is not defined correctly, you might see an error stating: 🛑&lt;STRONG&gt;Failed to add Role assignment&lt;/STRONG&gt;&amp;nbsp;“The 'delegatedRoleDefinitionIds' property is required when using certain roleDefinitionIds for authorization.”&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;To allow a &lt;STRONG&gt;principalId&lt;/STRONG&gt;&amp;nbsp;to assign roles to a managed identity in the customer tenant, set its&amp;nbsp;&lt;STRONG&gt;roleDefinitionId&lt;/STRONG&gt;&amp;nbsp;to&amp;nbsp;&lt;STRONG&gt;User Access Administrator&lt;/STRONG&gt;. Download the ARM template and add the specific &lt;A href="https://docs.azure.cn/en-us/role-based-access-control/built-in-roles" target="_blank"&gt;Azure built-in roles&lt;/A&gt; that you want to grant in the&amp;nbsp;&lt;STRONG&gt;delegatedRoleDefinitionIds&lt;/STRONG&gt;&amp;nbsp;property. You can include any supported Azure built-in role except User Access Administrator or Owner. This example shows a principalId with the User Access Administrator role that can assign two built-in roles to managed identities in the customer tenant: Contributor and Log Analytics Contributor.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang=""&gt;{
    "principalId": "00000000-0000-0000-0000-000000000000",
    "principalIdDisplayName": "Policy Automation Account",
    "roleDefinitionId": "18d7d88d-d35e-4fb5-a5c3-7773c20a72d9",
    "delegatedRoleDefinitionIds": [
         "b24988ac-6180-42a0-ab88-20f7382dd24c",
         "92aaf0da-9dab-42b6-94a3-d43ce8d16293"
    ]
}&lt;/LI-CODE&gt;
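&lt;P&gt;As a quick sanity check before deploying (illustrative only, not part of the template): Lighthouse rejects Owner and User Access Administrator inside &lt;STRONG&gt;delegatedRoleDefinitionIds&lt;/STRONG&gt;, so a small validator over your authorizations can catch the error above early. The GUIDs below are the well-known built-in role IDs.&lt;/P&gt;

```javascript
// Sketch: reject delegatedRoleDefinitionIds values that Lighthouse
// disallows (Owner and User Access Administrator).
const FORBIDDEN_DELEGATED = [
  '8e3af657-a8ff-443c-a75c-2fe8c4bcb635', // Owner
  '18d7d88d-d35e-4fb5-a5c3-7773c20a72d9', // User Access Administrator
];

function delegatedIdsValid(authorization) {
  const ids = authorization.delegatedRoleDefinitionIds || [];
  return ids.every((id) => !FORBIDDEN_DELEGATED.includes(id));
}

console.log(delegatedIdsValid({
  delegatedRoleDefinitionIds: ['b24988ac-6180-42a0-ab88-20f7382dd24c'], // Contributor
})); // true
console.log(delegatedIdsValid({
  delegatedRoleDefinitionIds: ['8e3af657-a8ff-443c-a75c-2fe8c4bcb635'], // Owner
})); // false
```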
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In addition, the SRE agent requires certain roles at the managed identity level in order to access and operate on those services. Locate the SRE agent's user-assigned managed identity and add roles to the service principal. For demo purposes, I am assigning the Reader, Monitoring Reader, and Log Analytics Reader roles.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Here is the sample ARM template used for this demo.&lt;/P&gt;
&lt;LI-CODE lang=""&gt;{
  "$schema": "https://schema.management.azure.com/schemas/2019-08-01/subscriptionDeploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "mspOfferName": {
      "type": "string",
      "metadata": {
        "description": "Specify a unique name for your offer"
      },
      "defaultValue": "lighthouse-sre-demo"
    },
    "mspOfferDescription": {
      "type": "string",
      "metadata": {
        "description": "Name of the Managed Service Provider offering"
      },
      "defaultValue": "lighthouse-sre-demo"
    }
  },
  "variables": {
    "mspRegistrationName": "[guid(parameters('mspOfferName'))]",
    "mspAssignmentName": "[guid(parameters('mspOfferName'))]",
    "managedByTenantId": "6e03bca1-4300-400d-9e80-000000000000",
    "authorizations": [
      {
        "principalId": "504adfc5-da83-47d4-8709-000000000000",
        "roleDefinitionId": "e40ec5ca-96e0-45a2-b4ff-59039f2c2b59",
        "principalIdDisplayName": "Pranab Mandal"
      },
      {
        "principalId": "504adfc5-da83-47d4-8709-000000000000",
        "roleDefinitionId": "18d7d88d-d35e-4fb5-a5c3-7773c20a72d9",
        "delegatedRoleDefinitionIds": [
          "b24988ac-6180-42a0-ab88-20f7382dd24c",
          "92aaf0da-9dab-42b6-94a3-d43ce8d16293"
        ],
        "principalIdDisplayName": "Pranab Mandal"
      },
      {
        "principalId": "504adfc5-da83-47d4-8709-000000000000",
        "roleDefinitionId": "b24988ac-6180-42a0-ab88-20f7382dd24c",
        "principalIdDisplayName": "Pranab Mandal"
      },
      {
        "principalId": "0374ff5c-5272-49fa-878a-000000000000",
        "roleDefinitionId": "acdd72a7-3385-48ef-bd42-f606fba81ae7",
        "principalIdDisplayName": "sre-agent-ext-sub1-4n4y4v5jjdtuu"
      },
      {
        "principalId": "0374ff5c-5272-49fa-878a-000000000000",
        "roleDefinitionId": "43d0d8ad-25c7-4714-9337-8ba259a9fe05",
        "principalIdDisplayName": "sre-agent-ext-sub1-4n4y4v5jjdtuu"
      },
      {
        "principalId": "0374ff5c-5272-49fa-878a-000000000000",
        "roleDefinitionId": "73c42c96-874c-492b-b04d-ab87d138a893",
        "principalIdDisplayName": "sre-agent-ext-sub1-4n4y4v5jjdtuu"
      }
    ]
  },
  "resources": [
    {
      "type": "Microsoft.ManagedServices/registrationDefinitions",
      "apiVersion": "2022-10-01",
      "name": "[variables('mspRegistrationName')]",
      "properties": {
        "registrationDefinitionName": "[parameters('mspOfferName')]",
        "description": "[parameters('mspOfferDescription')]",
        "managedByTenantId": "[variables('managedByTenantId')]",
        "authorizations": "[variables('authorizations')]"
      }
    },
    {
      "type": "Microsoft.ManagedServices/registrationAssignments",
      "apiVersion": "2022-10-01",
      "name": "[variables('mspAssignmentName')]",
      "dependsOn": [
        "[resourceId('Microsoft.ManagedServices/registrationDefinitions/', variables('mspRegistrationName'))]"
      ],
      "properties": {
        "registrationDefinitionId": "[resourceId('Microsoft.ManagedServices/registrationDefinitions/', variables('mspRegistrationName'))]"
      }
    }
  ],
  "outputs": {
    "mspOfferName": {
      "type": "string",
      "value": "[concat('Managed by', ' ', parameters('mspOfferName'))]"
    },
    "authorizations": {
      "type": "array",
      "value": "[variables('authorizations')]"
    }
  }
}&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Log in to the customer’s tenant and navigate to &lt;STRONG&gt;Service providers&lt;/STRONG&gt; in the Azure Portal. From the Service providers overview screen, select &lt;STRONG&gt;Service provider offers&lt;/STRONG&gt;&amp;nbsp;from the left navigation pane. From the top menu, select the&amp;nbsp;&lt;STRONG&gt;Add offer&lt;/STRONG&gt;&amp;nbsp;drop-down and select &lt;STRONG&gt;Add via template&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;In the&amp;nbsp;&lt;STRONG&gt;Upload Offer Template&lt;/STRONG&gt;&amp;nbsp;window, drag and drop or upload the template file created in the earlier step and hit &lt;STRONG&gt;Upload&lt;/STRONG&gt;. Once the file is uploaded, select&amp;nbsp;&lt;STRONG&gt;Review + Create&lt;/STRONG&gt;. The template takes a few minutes to deploy, after which a successful deployment page should be displayed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Navigate to&amp;nbsp;&lt;STRONG&gt;Delegations&lt;/STRONG&gt; from the Lighthouse overview and validate that you see the delegated subscription and the assigned roles. Once the Lighthouse delegation is set up, sign in to the managing tenant and navigate to the deployed SRE agent. Open Azure resources from the top menu or via &lt;STRONG&gt;Settings&lt;/STRONG&gt; &lt;STRONG&gt;&amp;gt;&lt;/STRONG&gt;&amp;nbsp;&lt;STRONG&gt;Managed resources&lt;/STRONG&gt;. Then use&amp;nbsp;&lt;STRONG&gt;Add subscriptions&lt;/STRONG&gt; to select the customer subscriptions you need the SRE agent to manage.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Adding a subscription automatically adds the required permissions for the agent.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Once the appropriate roles are added, the subscriptions are ready for the agent to manage and monitor resources within them.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
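&lt;P&gt;As a quick sanity check, the delegation can also be listed from the command line. The sketch below is illustrative and assumes the Azure CLI is installed and signed in to the managing tenant:&lt;/P&gt;

```shell
# Quick check: list Lighthouse offers and active delegations from the managing tenant.
echo "Checking Lighthouse delegations..."
if command -v az >/dev/null 2>&1; then
  # "|| true" keeps the sketch from aborting when not signed in
  az managedservices definition list --output table || true   # published registration definitions
  az managedservices assignment list --output table || true   # active delegation assignments
fi
```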
&lt;H1&gt;Summary - Benefits&lt;/H1&gt;
&lt;P&gt;This blog post demonstrates how &lt;STRONG&gt;Azure SRE Agent&lt;/STRONG&gt; can be used to centrally monitor and manage Azure resources across multiple tenants by integrating it with &lt;STRONG&gt;Azure Lighthouse&lt;/STRONG&gt;, a common requirement for enterprises and managed service providers operating in complex, multi-tenant environments. Key benefits of this approach include:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Centralized SRE operations across multiple Azure tenants&lt;/LI&gt;
&lt;LI&gt;Secure, role-based access using delegated resource management&lt;/LI&gt;
&lt;LI&gt;Reduced operational overhead for MSPs and enterprise IT teams&lt;/LI&gt;
&lt;LI&gt;Unified visibility into resource health and reliability across customer environments&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Thu, 16 Apr 2026 04:58:53 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/managing-multi-tenant-azure-resource-with-sre-agent-and/ba-p/4511789</guid>
      <dc:creator>Pranab_Mandal</dc:creator>
      <dc:date>2026-04-16T04:58:53Z</dc:date>
    </item>
    <item>
      <title>Using an AI Agent to Troubleshoot and Fix Azure Function App Issues</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/using-an-ai-agent-to-troubleshoot-and-fix-azure-function-app/ba-p/4511781</link>
      <description>&lt;P data-start="107" data-end="116"&gt;&lt;STRONG data-start="107" data-end="114"&gt;TOC&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL data-start="117" data-end="179"&gt;
&lt;LI data-section-id="1909l2b" data-start="117" data-end="133"&gt;Preparation&lt;/LI&gt;
&lt;LI data-section-id="1nc0tbm" data-start="134" data-end="163"&gt;Troubleshooting Workflow&lt;/LI&gt;
&lt;LI data-section-id="ljknmj" data-start="164" data-end="179"&gt;Conclusion&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Preparation&lt;/H3&gt;
&lt;P data-start="200" data-end="225"&gt;&lt;STRONG data-start="200" data-end="225"&gt;Topic: Required tools&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL data-start="226" data-end="497"&gt;
&lt;LI data-section-id="1lzpisp" data-start="226" data-end="336"&gt;AI agent: for example, Copilot CLI / OpenCode / Hermes / OpenClaw, etc. In this example, we use Copilot CLI.&lt;/LI&gt;
&lt;LI data-section-id="w8sxy7" data-start="337" data-end="388"&gt;Model access: for example, Anthropic Claude Opus.&lt;/LI&gt;
&lt;LI data-section-id="15ko278" data-start="389" data-end="497"&gt;Relevant skills: this example does not use skills, but using relevant skills can speed up troubleshooting.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="499" data-end="674"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="499" data-end="674"&gt;&lt;STRONG data-start="499" data-end="674"&gt;Topic: Compliant with your organization&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL data-start="675" data-end="867"&gt;
&lt;LI data-section-id="17c1u0u" data-start="675" data-end="790"&gt;Enterprise-level projects are sensitive, so you must confirm with the appropriate stakeholders before using them.&lt;/LI&gt;
&lt;LI data-section-id="1m0x6tj" data-start="791" data-end="867"&gt;Enterprise environments may also have strict standards for AI agent usage.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="869" data-end="899"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="869" data-end="899"&gt;&lt;STRONG data-start="869" data-end="899"&gt;Topic: Network limitations&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL data-start="900" data-end="1309"&gt;
&lt;LI data-section-id="o9qxws" data-start="900" data-end="1096"&gt;If the process involves restarting the Function App container or restarting related settings, communication between the user and the agent may be interrupted, and you will need to use /resume.&lt;/LI&gt;
&lt;LI data-section-id="uceb18" data-start="1097" data-end="1193"&gt;If the agent needs internet access for investigation, the app must have outbound connectivity.&lt;/LI&gt;
&lt;LI data-section-id="69kpej" data-start="1194" data-end="1309"&gt;If the Kudu container cannot be used because of network issues, this type of investigation cannot be carried out.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="1311" data-end="1344"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="1311" data-end="1344"&gt;&lt;STRONG data-start="1311" data-end="1344"&gt;Topic: Permission limitations&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL data-start="1345" data-end="1717"&gt;
&lt;LI data-section-id="ztwgp" data-start="1345" data-end="1574"&gt;If you are using Azure blessed images, according to the official documentation, the containers use the fixed password Docker!. However, if you are using a custom container, you will need to provide an additional login method.&lt;/LI&gt;
&lt;LI data-section-id="8aguhn" data-start="1575" data-end="1717"&gt;For resources the agent does not already have permission to investigate, you will need to enable SAMI and assign the appropriate RBAC roles.&lt;/LI&gt;
&lt;/UL&gt;
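&lt;P&gt;As a sketch of that permission setup (the resource names and the &lt;CODE&gt;Reader&lt;/CODE&gt; role below are placeholders; pick the narrowest role your investigation actually needs), enabling SAMI and scoping an RBAC role with the Azure CLI looks roughly like this:&lt;/P&gt;

```shell
# Placeholder names for illustration
SUB="00000000-0000-0000-0000-000000000000"
RG="my-func-rg"
APP="my-func-app"

# RBAC scope limited to a single resource group (least privilege)
SCOPE="/subscriptions/$SUB/resourceGroups/$RG"
echo "Role scope: $SCOPE"

if command -v az >/dev/null 2>&1; then
  # Enable the system-assigned managed identity and capture its principal ID
  PRINCIPAL_ID=$(az functionapp identity assign -g "$RG" -n "$APP" --query principalId -o tsv)
  # Assign a narrowly scoped role instead of Owner
  az role assignment create --assignee "$PRINCIPAL_ID" --role "Reader" --scope "$SCOPE"
fi
echo "Review the assignment in the portal under Access control (IAM)."
```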
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Troubleshooting Workflow&lt;/H3&gt;
&lt;P data-start="1751" data-end="1927"&gt;Let’s use a classic case where an HTTP trigger cannot be tested from the Azure Portal. As you can see, when clicking &lt;STRONG data-start="1868" data-end="1880"&gt;Test/Run&lt;/STRONG&gt; in the Azure Portal, an error message appears.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both" data-start="1751" data-end="1927"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="1929" data-end="2004"&gt;At the same time, however, the home page does not show any abnormal status.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both" data-start="1929" data-end="2004"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="2006" data-end="2330"&gt;At this point, we first obtain the Function App’s SAMI and assign it the &lt;STRONG data-start="2079" data-end="2088"&gt;Owner&lt;/STRONG&gt; role for the entire resource group. This is only for demonstration purposes. In practice, you should follow the principle of least privilege and scope permissions down to only the specific resources and operations that are actually required.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both" data-start="2006" data-end="2330"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="2332" data-end="2430"&gt;Next, go to the Kudu container, which is the always-on maintenance container dedicated to the app.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both" data-start="2332" data-end="2430"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="2432" data-end="2463"&gt;Install and enable Copilot CLI.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both" data-start="2432" data-end="2463"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="2465" data-end="2518"&gt;Then we can describe the problem we are encountering.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both" data-start="2465" data-end="2518"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="2520" data-end="2823"&gt;After the agent processes the issue and interacts with you further, it can generate a reasonable investigation report. In this example, it appears that the Function App’s Storage Account access key had been rotated previously, but the Function App had not updated the corresponding environment variable.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both" data-start="2520" data-end="2823"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="2825" data-end="3080"&gt;Once we understand the issue, we could perform the follow-up actions ourselves. However, to demonstrate the agent’s capabilities, you can also allow it to fix the problem directly, provided that you have granted the corresponding permissions through SAMI.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both" data-start="2825" data-end="3080"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="3082" data-end="3253"&gt;During the process, the container restart will disconnect the session, so you will need to return to the Kudu container and resume the previous session so it can continue.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both" data-start="3082" data-end="3253"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="3255" data-end="3351"&gt;Finally, it will inform you that the issue has been fixed, and then you can validate the result.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both" data-start="3255" data-end="3351"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-start="3353" data-end="3428"&gt;This is the validation result, and it looks like the repair was successful.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both" data-start="3353" data-end="3428"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Conclusion&lt;/H3&gt;
&lt;P&gt;After each repair, we can even extract the experience from that case into a skill and store it in a Storage Account for future reuse. In this way, we can not only reduce the agent’s initial investigation time for similar issues, but also save tokens. This makes both time and cost management more efficient.&lt;/P&gt;</description>
      <pubDate>Thu, 16 Apr 2026 03:11:32 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/using-an-ai-agent-to-troubleshoot-and-fix-azure-function-app/ba-p/4511781</guid>
      <dc:creator>theringe</dc:creator>
      <dc:date>2026-04-16T03:11:32Z</dc:date>
    </item>
    <item>
      <title>Gemma 4 on Azure Container Apps Serverless GPU</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/gemma-4-on-azure-container-apps-serverless-gpu/ba-p/4511671</link>
      <description>&lt;P&gt;Every prompt you send to a hosted AI service leaves your tenant. Your code, your architecture decisions, your proprietary logic — all of it crosses a network boundary you don't control. For teams building in regulated industries or handling sensitive IP, that's not a philosophical concern. It's a compliance blocker.&lt;/P&gt;
&lt;P&gt;What if you could spin up a fully private AI coding agent — running on your own GPU, in your own Azure subscription — with a single command?&lt;/P&gt;
&lt;P&gt;That's exactly what this template does. &lt;STRONG&gt;One &lt;CODE&gt;azd up&lt;/CODE&gt;, 15 minutes, and you have Google's Gemma 4 running on Azure Container Apps serverless GPU with an OpenAI-compatible API, protected by auth, and ready to power OpenCode as your terminal-based coding agent.&lt;/STRONG&gt; No data leaves your environment. No third-party model provider sees your code. Full control.&lt;/P&gt;
&lt;H2&gt;Why Self-Hosted AI on ACA?&lt;/H2&gt;
&lt;P&gt;Azure Container Apps serverless GPU gives you on-demand GPU compute without managing VMs, Kubernetes clusters, or GPU drivers. You get a container, a GPU, and an HTTPS endpoint — Azure handles the rest.&lt;/P&gt;
&lt;P&gt;Here's what makes this approach different from calling a hosted model API:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Complete data privacy&lt;/STRONG&gt; — your code and prompts never leave your Azure subscription. No PII exposure, no data leakage, no third-party processing. For teams navigating HIPAA, SOC 2, or internal IP policies, this is the simplest path to compliant AI-assisted development.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Predictable costs&lt;/STRONG&gt; — you pay for GPU compute time, not per-token. Run as many prompts as you want against your deployed model.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;No rate limits&lt;/STRONG&gt; — the GPU is yours. No throttling, no queue, no waiting for capacity.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Model flexibility&lt;/STRONG&gt; — swap models in minutes. Start with the 4B parameter Gemma 4 for fast iteration, scale up to 26B for complex reasoning tasks.&lt;/LI&gt;
&lt;/UL&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;This isn't a tradeoff between convenience and privacy.&lt;/STRONG&gt; ACA serverless GPU makes self-hosted AI as easy to deploy as any SaaS endpoint — but the data stays yours.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2&gt;What You're Building&lt;/H2&gt;
&lt;img alt="What the configuration looks like to run Gemma 4 + Ollama securely on ACA serverless GPU" /&gt;
&lt;P&gt;The template deploys two containers into an Azure Container Apps environment:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Ollama + Gemma 4&lt;/STRONG&gt; — running on a serverless GPU (NVIDIA T4 or A100), serving an OpenAI-compatible API&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Nginx auth proxy&lt;/STRONG&gt; — a lightweight reverse proxy that adds basic authentication and exposes the endpoint over HTTPS&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;The Ollama container pulls the Gemma 4 model on first start, so there's nothing to pre-build or upload. The nginx proxy runs on the free Consumption profile — only the Ollama container needs GPU.&lt;/P&gt;
&lt;P&gt;After deployment, you get a single HTTPS endpoint that works with &lt;CODE&gt;curl&lt;/CODE&gt;, any OpenAI-compatible SDK, or &lt;STRONG&gt;OpenCode&lt;/STRONG&gt; — a terminal-based AI coding agent that turns the whole thing into a private GitHub Copilot alternative.&lt;/P&gt;
&lt;H2&gt;Step 1: Deploy with &lt;CODE&gt;azd up&lt;/CODE&gt;&lt;/H2&gt;
&lt;P&gt;You need the &lt;A href="https://docs.microsoft.com/en-us/cli/azure/install-azure-cli" target="_blank" rel="noopener"&gt;Azure CLI&lt;/A&gt; and &lt;A href="https://learn.microsoft.com/en-us/azure/developer/azure-developer-cli/" target="_blank" rel="noopener"&gt;Azure Developer CLI (azd)&lt;/A&gt; installed.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;git clone https://github.com/simonjj/gemma4-on-aca.git
cd gemma4-on-aca
azd up&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The setup walks you through three choices:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;GPU selection&lt;/STRONG&gt; — T4 (16 GB VRAM) for smaller models, or A100 (80 GB VRAM) for the full Gemma 4 lineup.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Model selection&lt;/STRONG&gt; — depends on your GPU choice. The defaults are tuned for the best quality-to-speed ratio on each GPU tier.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Proxy password&lt;/STRONG&gt; — protects your endpoint with basic auth.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Region availability:&lt;/STRONG&gt; Serverless GPUs are available in &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/container-apps/gpu-serverless-overview#supported-regions" target="_blank" rel="noopener"&gt;various regions&lt;/A&gt; such as &lt;CODE&gt;australiaeast&lt;/CODE&gt;, &lt;CODE&gt;brazilsouth&lt;/CODE&gt;, &lt;CODE&gt;canadacentral&lt;/CODE&gt;, &lt;CODE&gt;eastus&lt;/CODE&gt;, &lt;CODE&gt;italynorth&lt;/CODE&gt;, &lt;CODE&gt;swedencentral&lt;/CODE&gt;, &lt;CODE&gt;uksouth&lt;/CODE&gt;, &lt;CODE&gt;westus&lt;/CODE&gt;, and &lt;CODE&gt;westus3&lt;/CODE&gt;. Pick one of these when prompted for location.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;That's it. Provisioning takes about 10 minutes — mostly waiting for the ACA environment to create and the model to download.&lt;/P&gt;
&lt;img alt="The deployment output" /&gt;
&lt;H2&gt;Choose Your Model&lt;/H2&gt;
&lt;P&gt;Gemma 4 ships in four sizes. The right choice depends on your GPU and workload:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;&lt;STRONG&gt;Model&lt;/STRONG&gt;&lt;/th&gt;&lt;th&gt;&lt;STRONG&gt;Params&lt;/STRONG&gt;&lt;/th&gt;&lt;th&gt;&lt;STRONG&gt;Architecture&lt;/STRONG&gt;&lt;/th&gt;&lt;th&gt;&lt;STRONG&gt;Context&lt;/STRONG&gt;&lt;/th&gt;&lt;th&gt;&lt;STRONG&gt;Modalities&lt;/STRONG&gt;&lt;/th&gt;&lt;th&gt;&lt;STRONG&gt;Disk Size&lt;/STRONG&gt;&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;gemma4:e2b&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;~2B&lt;/td&gt;&lt;td&gt;Dense&lt;/td&gt;&lt;td&gt;128K&lt;/td&gt;&lt;td&gt;Text, Image, Audio&lt;/td&gt;&lt;td&gt;~7 GB&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;gemma4:e4b&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;~4B&lt;/td&gt;&lt;td&gt;Dense&lt;/td&gt;&lt;td&gt;128K&lt;/td&gt;&lt;td&gt;Text, Image, Audio&lt;/td&gt;&lt;td&gt;~10 GB&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;gemma4:26b&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;26B&lt;/td&gt;&lt;td&gt;MoE (4B active)&lt;/td&gt;&lt;td&gt;256K&lt;/td&gt;&lt;td&gt;Text, Image&lt;/td&gt;&lt;td&gt;~18 GB&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;gemma4:31b&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;31B&lt;/td&gt;&lt;td&gt;Dense&lt;/td&gt;&lt;td&gt;256K&lt;/td&gt;&lt;td&gt;Text, Image&lt;/td&gt;&lt;td&gt;~20 GB&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 16.67%" /&gt;&lt;col style="width: 16.67%" /&gt;&lt;col style="width: 16.67%" /&gt;&lt;col style="width: 16.67%" /&gt;&lt;col style="width: 16.67%" /&gt;&lt;col style="width: 16.67%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H3&gt;Real-World Performance on ACA&lt;/H3&gt;
&lt;P&gt;We benchmarked every model on both GPU tiers using Ollama v0.20 with Q4_K_M quantization and 32K context in Sweden Central:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;&lt;STRONG&gt;Model&lt;/STRONG&gt;&lt;/th&gt;&lt;th&gt;&lt;STRONG&gt;GPU&lt;/STRONG&gt;&lt;/th&gt;&lt;th&gt;&lt;STRONG&gt;Tokens/sec&lt;/STRONG&gt;&lt;/th&gt;&lt;th&gt;&lt;STRONG&gt;TTFT&lt;/STRONG&gt;&lt;/th&gt;&lt;th&gt;&lt;STRONG&gt;Notes&lt;/STRONG&gt;&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;gemma4:e2b&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;T4&lt;/td&gt;&lt;td&gt;~81&lt;/td&gt;&lt;td&gt;~15ms&lt;/td&gt;&lt;td&gt;Fastest on T4&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;gemma4:e4b&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;T4&lt;/td&gt;&lt;td&gt;~51&lt;/td&gt;&lt;td&gt;~17ms&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Default T4 choice&lt;/STRONG&gt; — best quality/speed&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;gemma4:e2b&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;A100&lt;/td&gt;&lt;td&gt;~184&lt;/td&gt;&lt;td&gt;~9ms&lt;/td&gt;&lt;td&gt;Ultra-fast&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;gemma4:e4b&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;A100&lt;/td&gt;&lt;td&gt;~129&lt;/td&gt;&lt;td&gt;~12ms&lt;/td&gt;&lt;td&gt;Great for lighter workloads&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;gemma4:26b&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;A100&lt;/td&gt;&lt;td&gt;~113&lt;/td&gt;&lt;td&gt;~14ms&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Default A100 choice&lt;/STRONG&gt; — strong reasoning&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;gemma4:31b&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;A100&lt;/td&gt;&lt;td&gt;~40&lt;/td&gt;&lt;td&gt;~30ms&lt;/td&gt;&lt;td&gt;Highest quality, slower&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 20.00%" /&gt;&lt;col style="width: 20.00%" /&gt;&lt;col style="width: 20.00%" /&gt;&lt;col style="width: 20.00%" /&gt;&lt;col style="width: 20.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;STRONG&gt;51 tokens/second on a T4 with the 4B model&lt;/STRONG&gt; is fast enough for interactive coding assistance. The 26B model on A100 delivers &lt;STRONG&gt;113 tokens/second&lt;/STRONG&gt; with noticeably better reasoning — ideal for complex refactoring, architecture questions, and multi-file changes.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;The 26B and 31B models require A100 — they don't fit in T4's 16 GB VRAM.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2&gt;Step 2: Verify Your Endpoint&lt;/H2&gt;
&lt;P&gt;After &lt;CODE&gt;azd up&lt;/CODE&gt; completes, the post-provision hook prints your endpoint URL. Test it:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;curl -u admin:&amp;lt;YOUR_PASSWORD&amp;gt; \
  https://&amp;lt;YOUR_PROXY_ENDPOINT&amp;gt;/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma4:e4b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You should get a JSON response with Gemma 4's reply. The endpoint is fully OpenAI-compatible — it works with any tool or SDK that speaks the OpenAI API format.&lt;/P&gt;
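&lt;P&gt;For scripting, it's handy to pull just the reply text out of that JSON. A small sketch (the response below is trimmed to the fields used; real responses carry more):&lt;/P&gt;

```shell
# Trimmed sample of an OpenAI-compatible chat completion response
RESPONSE='{"choices":[{"message":{"role":"assistant","content":"Hello! How can I help?"}}]}'

# Extract just the assistant's reply (python3 used here; jq works equally well)
echo "$RESPONSE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["choices"][0]["message"]["content"])'
# prints: Hello! How can I help?
```

&lt;P&gt;In a real pipeline, replace the sample variable with the output of the &lt;CODE&gt;curl&lt;/CODE&gt; call.&lt;/P&gt;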
&lt;H2&gt;Step 3: Connect OpenCode&lt;/H2&gt;
&lt;P&gt;Here's where it gets powerful. &lt;A href="https://opencode.ai" target="_blank" rel="noopener"&gt;OpenCode&lt;/A&gt; is a terminal-based AI coding agent — think GitHub Copilot, but running in your terminal and pointing at whatever model backend you choose.&lt;/P&gt;
&lt;P&gt;The &lt;CODE&gt;azd up&lt;/CODE&gt; post-provision hook automatically generates an &lt;CODE&gt;opencode.json&lt;/CODE&gt; in your project directory with the correct endpoint and credentials. If you need to create it manually:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "gemma4-aca": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Gemma 4 on ACA",
      "options": {
        "baseURL": "https://&amp;lt;YOUR_PROXY_ENDPOINT&amp;gt;/v1",
        "headers": {
          "Authorization": "Basic &amp;lt;BASE64_OF_admin:YOUR_PASSWORD&amp;gt;"
        }
      },
      "models": {
        "gemma4:e4b": {
          "name": "Gemma 4 e4b (4B)"
        }
      }
    }
  }
}&lt;/CODE&gt;&lt;/PRE&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;Generate the Base64 value: &lt;CODE&gt;echo -n "admin:YOUR_PASSWORD" | base64&lt;/CODE&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
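&lt;P&gt;To avoid shell-quoting mistakes, the header value can be built and sanity-checked in one short script (the password here is a placeholder):&lt;/P&gt;

```shell
PASSWORD='YOUR_PASSWORD'   # placeholder -- use the proxy password you chose during azd up

# printf avoids the trailing newline that echo would sneak into the encoding
AUTH_B64=$(printf 'admin:%s' "$PASSWORD" | base64)
echo "Authorization: Basic $AUTH_B64"
```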
&lt;P&gt;Now run it:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;opencode run -m "gemma4-aca/gemma4:e4b" "Write a binary search in Rust"&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;That command sends your prompt to Gemma 4 running on your ACA GPU, and streams the response back to your terminal. &lt;STRONG&gt;Every token is generated on your infrastructure. Nothing leaves your subscription.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;For interactive sessions, launch the TUI:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;opencode&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Select your model with &lt;CODE&gt;/models&lt;/CODE&gt;, pick Gemma 4, and start coding. OpenCode supports file editing, code generation, refactoring, and multi-turn conversations — all powered by your private Gemma 4 instance.&lt;/P&gt;
&lt;H2&gt;The Privacy Case&lt;/H2&gt;
&lt;P&gt;This matters most for teams that can't send code to external APIs:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;HIPAA-regulated healthcare apps&lt;/STRONG&gt; — patient data in code, schema definitions, and test fixtures stays in your Azure subscription&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Financial services&lt;/STRONG&gt; — proprietary trading algorithms and risk models never leave your network boundary&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Defense and government&lt;/STRONG&gt; — classified or CUI-adjacent codebases get AI assistance without external data processing agreements&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Startups with sensitive IP&lt;/STRONG&gt; — your secret sauce stays secret, even while you use AI to build faster&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;With ACA serverless GPU, you're not running a VM or managing a Kubernetes cluster to get this privacy. It's a managed container with a GPU attached. Azure handles the infrastructure, you own the data boundary.&lt;/P&gt;
&lt;H2&gt;Clean Up&lt;/H2&gt;
&lt;P&gt;When you're done:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;azd down&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This tears down all Azure resources. Since ACA serverless GPU bills only while your containers are running, you can also scale to zero replicas to pause costs without destroying the environment.&lt;/P&gt;
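&lt;P&gt;To pause without tearing down, scale the GPU container app to zero replicas. A sketch (resource names are placeholders; the actual app name comes from the &lt;CODE&gt;azd&lt;/CODE&gt; deployment output):&lt;/P&gt;

```shell
# Placeholder names for illustration
RG="my-rg"
APP="gemma-ollama"

echo "Scaling $APP to zero replicas"
if command -v az >/dev/null 2>&1; then
  # Min replicas 0 lets the app scale to zero between requests, pausing GPU billing
  az containerapp update -g "$RG" -n "$APP" --min-replicas 0 || true
fi
```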
&lt;H2&gt;Get Started&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;📖 &lt;A href="https://github.com/simonjj/gemma4-on-aca" target="_blank" rel="noopener"&gt;&lt;STRONG&gt;gemma4-on-aca on GitHub&lt;/STRONG&gt;&lt;/A&gt; — clone it, run &lt;CODE&gt;azd up&lt;/CODE&gt;, and you're live&lt;/LI&gt;
&lt;LI&gt;🤖 &lt;A href="https://opencode.ai" target="_blank" rel="noopener"&gt;&lt;STRONG&gt;OpenCode&lt;/STRONG&gt;&lt;/A&gt; — the terminal AI agent that connects to your Gemma 4 endpoint&lt;/LI&gt;
&lt;LI&gt;📌 &lt;A href="https://ai.google.dev/gemma/docs/core" target="_blank" rel="noopener"&gt;&lt;STRONG&gt;Gemma 4 docs&lt;/STRONG&gt;&lt;/A&gt; — model architecture and capabilities&lt;/LI&gt;
&lt;LI&gt;📌 &lt;A href="https://learn.microsoft.com/en-us/azure/container-apps/workload-profiles-overview#gpu-workload-profiles" target="_blank" rel="noopener"&gt;&lt;STRONG&gt;ACA serverless GPU&lt;/STRONG&gt;&lt;/A&gt; — GPU regions and workload profile details&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Wed, 15 Apr 2026 16:20:22 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/gemma-4-on-azure-container-apps-serverless-gpu/ba-p/4511671</guid>
      <dc:creator>simonjj</dc:creator>
      <dc:date>2026-04-15T16:20:22Z</dc:date>
    </item>
    <item>
      <title>Govern AI Agents on App Service with the Microsoft Agent Governance Toolkit</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/govern-ai-agents-on-app-service-with-the-microsoft-agent/ba-p/4510962</link>
      <description>&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Part 3 of 3 — Multi-Agent AI on Azure App Service&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;In &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/build-multi-agent-ai-apps-on-azure-app-service-with-microsoft-agent-framework-1-/4510017" data-lia-auto-title="Blog 1" data-lia-auto-title-active="0" target="_blank"&gt;Blog 1&lt;/A&gt;, we built a multi-agent travel planner with Microsoft Agent Framework 1.0 on App Service. In &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/monitor-ai-agents-on-app-service-with-opentelemetry-and-the-new-application-insi/4510023" data-lia-auto-title="Blog 2" data-lia-auto-title-active="0" target="_blank"&gt;Blog 2&lt;/A&gt;, we added observability with OpenTelemetry and the new Application Insights Agents view. Now in Part 3, we secure those agents for production with the &lt;STRONG&gt;Microsoft Agent Governance Toolkit&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;This post assumes you've followed the guidance in&amp;nbsp;&lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/build-multi-agent-ai-apps-on-azure-app-service-with-microsoft-agent-framework-1-/4510017" data-lia-auto-title="Blog 1" data-lia-auto-title-active="0" target="_blank"&gt;Blog 1&lt;/A&gt; to deploy the multi-agent travel planner to Azure App Service. If you haven't deployed the app yet, start there first.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2&gt;The governance gap&lt;/H2&gt;
&lt;P&gt;Our travel planner works. It's observable. But here's the question I'm hearing from customers: &lt;EM&gt;"How do I make sure my agents don't do something they shouldn't?"&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;It's a fair question. Our six agents — Coordinator, Currency Converter, Weather Advisor, Local Knowledge, Itinerary Planner, and Budget Optimizer — can call external APIs, process user data, and make autonomous decisions. In a demo, that's impressive. In production, that's a risk surface.&lt;/P&gt;
&lt;P&gt;Consider what can go wrong with ungoverned agents:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Unauthorized API calls&lt;/STRONG&gt; — An agent calls an external API it was never intended to use, leaking data or incurring costs&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Sensitive data exposure&lt;/STRONG&gt; — An agent passes PII to a third-party service without consent controls&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Runaway token spend&lt;/STRONG&gt; — A recursive agent loop burns through your OpenAI budget in minutes&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Tool misuse&lt;/STRONG&gt; — A prompt injection tricks an agent into executing a tool it shouldn't&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Cascading failures&lt;/STRONG&gt; — One agent's error propagates through the entire multi-agent workflow&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;These aren't theoretical. In December 2025, &lt;A class="lia-external-url" href="https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/" target="_blank"&gt;OWASP published the Top 10 for Agentic Applications&lt;/A&gt; — the first formal taxonomy of risks specific to autonomous AI agents, including goal hijacking, tool misuse, identity abuse, memory poisoning, and rogue agents. Regulators are paying attention too: the &lt;STRONG&gt;EU AI Act's&lt;/STRONG&gt; high-risk AI obligations take effect in &lt;STRONG&gt;August 2026&lt;/STRONG&gt;, and the &lt;STRONG&gt;Colorado AI Act&lt;/STRONG&gt; becomes enforceable in &lt;STRONG&gt;June 2026&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;The bottom line: if you're running agents in production, you need governance. Not eventually — now.&lt;/P&gt;
&lt;H2&gt;What the Agent Governance Toolkit does&lt;/H2&gt;
&lt;P&gt;The &lt;A class="lia-external-url" href="https://github.com/microsoft/agent-governance-toolkit" target="_blank"&gt;Agent Governance Toolkit&lt;/A&gt; is an open-source project (MIT license) from Microsoft that brings runtime security governance to autonomous AI agents. It's the first toolkit to address &lt;STRONG&gt;all 10 OWASP agentic AI risks&lt;/STRONG&gt; with deterministic, sub-millisecond policy enforcement.&lt;/P&gt;
&lt;P&gt;The toolkit is organized into &lt;STRONG&gt;7 packages&lt;/STRONG&gt;:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="width: 100%; border-width: 1px;"&gt;&lt;thead&gt;&lt;tr class="lia-background-color-custom-f0f0f0"&gt;&lt;th&gt;Package&lt;/th&gt;&lt;th&gt;What it does&lt;/th&gt;&lt;th&gt;Think of it as...&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Agent OS&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Stateless policy engine, intercepts every action before execution (&amp;lt;0.1ms p99)&lt;/td&gt;&lt;td&gt;The kernel for AI agents&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Agent Mesh&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Cryptographic identity (DIDs), inter-agent trust protocol, dynamic trust scoring&lt;/td&gt;&lt;td&gt;mTLS for agents&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Agent Runtime&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Execution rings (like CPU privilege levels), saga orchestration, kill switch&lt;/td&gt;&lt;td&gt;Process isolation for agents&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Agent SRE&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;SLOs, error budgets, circuit breakers, chaos engineering&lt;/td&gt;&lt;td&gt;SRE practices for agents&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Agent Compliance&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Automated governance verification, regulatory mapping (EU AI Act, HIPAA, SOC2)&lt;/td&gt;&lt;td&gt;Compliance-as-code&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Agent Marketplace&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Plugin lifecycle management, Ed25519 signing, supply-chain security&lt;/td&gt;&lt;td&gt;Package manager security&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Agent Lightning&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;RL training governance with policy-enforced runners&lt;/td&gt;&lt;td&gt;Safe training guardrails&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 
33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;!-- SCREENSHOT: Agent Governance Toolkit GitHub repo page showing the 7 packages --&gt;
&lt;P&gt;The toolkit is available in &lt;STRONG&gt;Python, TypeScript, Rust, Go, and .NET&lt;/STRONG&gt;. It's framework-agnostic — it works with Microsoft Agent Framework (MAF), LangChain, CrewAI, Google ADK, and more. For our ASP.NET Core travel planner, we'll use the &lt;STRONG&gt;.NET SDK&lt;/STRONG&gt; via NuGet (&lt;CODE&gt;Microsoft.AgentGovernance&lt;/CODE&gt;).&lt;/P&gt;
&lt;P&gt;For this blog, we're focusing on &lt;STRONG&gt;three packages&lt;/STRONG&gt;:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Agent OS&lt;/STRONG&gt; — the policy engine that intercepts and evaluates every agent action&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Agent Compliance&lt;/STRONG&gt; — regulatory mapping and audit trail generation&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Agent SRE&lt;/STRONG&gt; — SLOs and circuit breakers for agent reliability&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;How easy it was to add governance&lt;/H2&gt;
&lt;P&gt;Here's the part that surprised me. I expected adding governance to a production agent system to be a multi-hour effort — new infrastructure, complex configuration, extensive refactoring. Instead, it took about &lt;STRONG&gt;30 minutes&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;Here's exactly what we changed:&lt;/P&gt;
&lt;H3&gt;Step 1: Add NuGet packages&lt;/H3&gt;
&lt;P&gt;One new package added to &lt;CODE&gt;TravelPlanner.Shared.csproj&lt;/CODE&gt;, alongside the two existing references:&lt;/P&gt;
&lt;LI-CODE lang="xml"&gt;&amp;lt;itemgroup&amp;gt; &amp;lt;!-- Existing packages --&amp;gt; &amp;lt;packagereference include="Azure.Monitor.OpenTelemetry.AspNetCore" version="1.3.0"&amp;gt; &amp;lt;packagereference include="Microsoft.Agents.AI" version="1.0.0"&amp;gt; &amp;lt;!-- NEW: Agent Governance Toolkit (single package, all features included) --&amp;gt; &amp;lt;packagereference include="Microsoft.AgentGovernance" version="3.0.2"&amp;gt; &amp;lt;/packagereference&amp;gt;&amp;lt;/packagereference&amp;gt;&amp;lt;/packagereference&amp;gt;&amp;lt;/itemgroup&amp;gt;&lt;/LI-CODE&gt;
&lt;H3&gt;Step 2: Create the policy file&lt;/H3&gt;
&lt;P&gt;One new file: &lt;CODE&gt;governance-policies.yaml&lt;/CODE&gt; in the project root. This is where all your governance rules live:&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;apiVersion: governance.toolkit/v1 name: travel-planner-governance description: Policy enforcement for the multi-agent travel planner on App Service scope: global defaultAction: deny rules: - name: allow-currency-conversion condition: "tool == 'ConvertCurrency'" action: allow priority: 10 description: Allow Currency Converter agent to call Frankfurter exchange rate API - name: allow-weather-forecast condition: "tool == 'GetWeatherForecast'" action: allow priority: 10 description: Allow Weather Advisor agent to call NWS forecast API - name: allow-weather-alerts condition: "tool == 'GetWeatherAlerts'" action: allow priority: 10 description: Allow Weather Advisor agent to check NWS weather alerts&lt;/LI-CODE&gt;&lt;!-- SCREENSHOT: The complete governance-policies.yaml file --&gt;
&lt;H3&gt;Step 3: One line in BaseAgent.cs&lt;/H3&gt;
&lt;P&gt;This is the moment. Here's our &lt;CODE&gt;BaseAgent.cs&lt;/CODE&gt; &lt;STRONG&gt;before&lt;/STRONG&gt;:&lt;/P&gt;
&lt;PRE class="language-csharp" tabindex="0" contenteditable="false" data-lia-code-value="Agent = new ChatClientAgent(
    chatClient, instructions: Instructions, 
    name: AgentName, description: Description)
    .AsBuilder()
    .UseOpenTelemetry(sourceName: AgentName)
    .Build();
"&gt;&lt;CODE&gt;Agent = new ChatClientAgent(
    chatClient, instructions: Instructions, 
    name: AgentName, description: Description)
    .AsBuilder()
    .UseOpenTelemetry(sourceName: AgentName)
    .Build();
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;And &lt;STRONG&gt;after&lt;/STRONG&gt;:&lt;/P&gt;
&lt;PRE class="language-csharp" tabindex="0" contenteditable="false" data-lia-code-value="var kernel = serviceProvider.GetService&amp;lt;GovernanceKernel&amp;gt;();
if (kernel is not null)
    builder.UseGovernance(kernel, AgentName);

Agent = builder.Build();
"&gt;&lt;CODE&gt;var kernel = serviceProvider.GetService&amp;lt;GovernanceKernel&amp;gt;();
if (kernel is not null)
    builder.UseGovernance(kernel, AgentName);

Agent = builder.Build();
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;STRONG&gt;One line of intent, two lines of null-safety.&lt;/STRONG&gt; The &lt;CODE&gt;.UseGovernance(kernel, AgentName)&lt;/CODE&gt; call intercepts every tool/function invocation in the agent's pipeline, evaluating it against the loaded policies before execution. If the &lt;CODE&gt;GovernanceKernel&lt;/CODE&gt; isn't registered (governance disabled), agents work exactly as before — no crash, no code change needed.&lt;/P&gt;
&lt;!-- SCREENSHOT: Side-by-side diff of BaseAgent.cs before/after governance (highlight the one-line change) --&gt;
&lt;P&gt;Here's the full updated constructor using &lt;CODE&gt;IServiceProvider&lt;/CODE&gt; to optionally resolve governance:&lt;/P&gt;
&lt;PRE class="language-csharp" tabindex="0" contenteditable="false" data-lia-code-value="using AgentGovernance;
using Microsoft.Extensions.DependencyInjection;

public abstract class BaseAgent : IAgent
{
    protected readonly ILogger Logger;
    protected readonly AgentOptions Options;
    protected readonly AIAgent Agent;

    // Constructor for simple agents without tools
    protected BaseAgent(
        ILogger logger,
        IOptions&amp;lt;AgentOptions&amp;gt; options,
        IChatClient chatClient,
        IServiceProvider serviceProvider)
    {
        Logger = logger;
        Options = options.Value;

        var builder = new ChatClientAgent(
            chatClient, instructions: Instructions,
            name: AgentName, description: Description)
            .AsBuilder()
            .UseOpenTelemetry(sourceName: AgentName);

        var kernel = serviceProvider.GetService&amp;lt;GovernanceKernel&amp;gt;();
        if (kernel is not null)
            builder.UseGovernance(kernel, AgentName);

        Agent = builder.Build();
    }

    // Constructor for agents with tools
    protected BaseAgent(
        ILogger logger,
        IOptions&amp;lt;AgentOptions&amp;gt; options,
        IChatClient chatClient,
        ChatOptions chatOptions,
        IServiceProvider serviceProvider)
    {
        Logger = logger;
        Options = options.Value;

        var builder = new ChatClientAgent(
            chatClient, instructions: Instructions,
            name: AgentName, description: Description,
            tools: chatOptions.Tools?.ToList())
            .AsBuilder()
            .UseOpenTelemetry(sourceName: AgentName);

        var kernel = serviceProvider.GetService&amp;lt;GovernanceKernel&amp;gt;();
        if (kernel is not null)
            builder.UseGovernance(kernel, AgentName);

        Agent = builder.Build();
    }
    
    // ... rest unchanged
}
"&gt;&lt;CODE&gt;using AgentGovernance;
using Microsoft.Extensions.DependencyInjection;

public abstract class BaseAgent : IAgent
{
    protected readonly ILogger Logger;
    protected readonly AgentOptions Options;
    protected readonly AIAgent Agent;

    // Constructor for simple agents without tools
    protected BaseAgent(
        ILogger logger,
        IOptions&amp;lt;AgentOptions&amp;gt; options,
        IChatClient chatClient,
        IServiceProvider serviceProvider)
    {
        Logger = logger;
        Options = options.Value;

        var builder = new ChatClientAgent(
            chatClient, instructions: Instructions,
            name: AgentName, description: Description)
            .AsBuilder()
            .UseOpenTelemetry(sourceName: AgentName);

        var kernel = serviceProvider.GetService&amp;lt;GovernanceKernel&amp;gt;();
        if (kernel is not null)
            builder.UseGovernance(kernel, AgentName);

        Agent = builder.Build();
    }

    // Constructor for agents with tools
    protected BaseAgent(
        ILogger logger,
        IOptions&amp;lt;AgentOptions&amp;gt; options,
        IChatClient chatClient,
        ChatOptions chatOptions,
        IServiceProvider serviceProvider)
    {
        Logger = logger;
        Options = options.Value;

        var builder = new ChatClientAgent(
            chatClient, instructions: Instructions,
            name: AgentName, description: Description,
            tools: chatOptions.Tools?.ToList())
            .AsBuilder()
            .UseOpenTelemetry(sourceName: AgentName);

        var kernel = serviceProvider.GetService&amp;lt;GovernanceKernel&amp;gt;();
        if (kernel is not null)
            builder.UseGovernance(kernel, AgentName);

        Agent = builder.Build();
    }
    
    // ... rest unchanged
}
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H3&gt;Step 4: DI registrations in Program.cs&lt;/H3&gt;
&lt;P&gt;A few lines to wire up governance in the dependency injection container:&lt;/P&gt;
&lt;PRE class="language-csharp" tabindex="0" contenteditable="false" data-lia-code-value="using AgentGovernance;

// ... existing builder setup ...

// Configure OpenTelemetry with Azure Monitor (existing — from Blog 2)
builder.Services.AddOpenTelemetry().UseAzureMonitor();

// NEW: Configure Agent Governance Toolkit
// Load policy from YAML, register as singleton. Agents resolve via IServiceProvider.
var policyPath = Path.Combine(builder.Environment.ContentRootPath, &amp;quot;governance-policies.yaml&amp;quot;);
if (File.Exists(policyPath))
{
    try
    {
        var yaml = File.ReadAllText(policyPath);
        var kernel = new GovernanceKernel(new GovernanceOptions 
        { 
            EnableAudit = true, 
            EnableMetrics = true 
        });
        kernel.LoadPolicyFromYaml(yaml);
        builder.Services.AddSingleton(kernel);
        Console.WriteLine($&amp;quot;[Governance] Loaded policies from {policyPath}&amp;quot;);
    }
    catch (Exception ex)
    {
        Console.WriteLine($&amp;quot;[Governance] Failed to load: {ex.Message}. Running without governance.&amp;quot;);
    }
}
"&gt;&lt;CODE&gt;using AgentGovernance;

// ... existing builder setup ...

// Configure OpenTelemetry with Azure Monitor (existing — from Blog 2)
builder.Services.AddOpenTelemetry().UseAzureMonitor();

// NEW: Configure Agent Governance Toolkit
// Load policy from YAML, register as singleton. Agents resolve via IServiceProvider.
var policyPath = Path.Combine(builder.Environment.ContentRootPath, "governance-policies.yaml");
if (File.Exists(policyPath))
{
    try
    {
        var yaml = File.ReadAllText(policyPath);
        var kernel = new GovernanceKernel(new GovernanceOptions 
        { 
            EnableAudit = true, 
            EnableMetrics = true 
        });
        kernel.LoadPolicyFromYaml(yaml);
        builder.Services.AddSingleton(kernel);
        Console.WriteLine($"[Governance] Loaded policies from {policyPath}");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"[Governance] Failed to load: {ex.Message}. Running without governance.");
    }
}
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;STRONG&gt;That's it. Your agents are now governed.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Let me repeat that because it's the core message of this blog: we added production governance to a six-agent system by adding one NuGet package, creating one YAML policy file, adding a few lines to our base agent class, and registering the governance kernel in DI. No new infrastructure. No complex rewiring. No multi-sprint project. If you followed Blog 1 and Blog 2, you can do this in 30 minutes.&lt;/P&gt;
&lt;H2&gt;Policy flexibility deep-dive&lt;/H2&gt;
&lt;P&gt;The YAML policy language is intentionally simple to start with, but it supports real complexity when you need it. Let's walk through what each policy in our file does.&lt;/P&gt;
&lt;H3&gt;API allowlists and blocklists&lt;/H3&gt;
&lt;P&gt;Our travel planner calls two external APIs: Frankfurter (currency exchange) and the National Weather Service. The &lt;CODE&gt;defaultAction: deny&lt;/CODE&gt; combined with explicit &lt;CODE&gt;allow&lt;/CODE&gt; rules ensures agents can &lt;EM&gt;only&lt;/EM&gt; call these approved tools. If an agent attempts to call any other function — whether through a prompt injection or a bug — the call is blocked before it executes:&lt;/P&gt;
&lt;PRE class="language-yaml" tabindex="0" contenteditable="false" data-lia-code-value="defaultAction: deny
rules:
  - name: allow-currency-conversion
    condition: &amp;quot;tool == 'ConvertCurrency'&amp;quot;
    action: allow
    priority: 10
  - name: allow-weather-forecast
    condition: &amp;quot;tool == 'GetWeatherForecast'&amp;quot;
    action: allow
    priority: 10
"&gt;&lt;CODE&gt;defaultAction: deny
rules:
  - name: allow-currency-conversion
    condition: "tool == 'ConvertCurrency'"
    action: allow
    priority: 10
  - name: allow-weather-forecast
    condition: "tool == 'GetWeatherForecast'"
    action: allow
    priority: 10
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;When a blocked call happens, you'll see output like this in your logs:&lt;/P&gt;
&lt;PRE class="language-text" tabindex="0" contenteditable="false" data-lia-code-value="[Governance] Tool call 'DeleteDatabase' blocked for agent 'LocalKnowledgeAgent': 
  No matching rules; default action is deny.
"&gt;&lt;CODE&gt;[Governance] Tool call 'DeleteDatabase' blocked for agent 'LocalKnowledgeAgent': 
  No matching rules; default action is deny.
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;!-- SCREENSHOT: Terminal output showing a blocked tool call with policy violation message --&gt;
&lt;H3&gt;Condition language&lt;/H3&gt;
&lt;P&gt;The &lt;CODE&gt;condition&lt;/CODE&gt; field supports equality checks, pattern matching, and boolean logic. You can match on tool name, agent ID, or any key in the evaluation context:&lt;/P&gt;
&lt;PRE class="language-yaml" tabindex="0" contenteditable="false" data-lia-code-value="# Match a specific tool
condition: &amp;quot;tool == 'ConvertCurrency'&amp;quot;

# Match multiple tools with OR
condition: &amp;quot;tool == 'GetWeatherForecast' or tool == 'GetWeatherAlerts'&amp;quot;

# Match by agent
condition: &amp;quot;agent == 'CurrencyConverterAgent' and tool == 'ConvertCurrency'&amp;quot;
"&gt;&lt;CODE&gt;# Match a specific tool
condition: "tool == 'ConvertCurrency'"

# Match multiple tools with OR
condition: "tool == 'GetWeatherForecast' or tool == 'GetWeatherAlerts'"

# Match by agent
condition: "agent == 'CurrencyConverterAgent' and tool == 'ConvertCurrency'"
&lt;/CODE&gt;&lt;/PRE&gt;
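&lt;P&gt;To make the matching semantics concrete, here's a minimal Python sketch of how conditions like these resolve against an evaluation context. It's illustrative only (the toolkit's actual parser is richer), but it shows equality checks combined with &lt;CODE&gt;and&lt;/CODE&gt;/&lt;CODE&gt;or&lt;/CODE&gt; in action:&lt;/P&gt;

```python
# Illustrative only: a tiny evaluator for the flat condition grammar shown
# above (equality checks joined by 'and'/'or'). Not the toolkit's parser.

def eval_condition(condition: str, context: dict) -> bool:
    """Evaluate e.g. "agent == 'X' and tool == 'Y'" against a context dict."""
    def eval_clause(clause: str) -> bool:
        key, value = clause.split("==")
        return context.get(key.strip()) == value.strip().strip("'")

    # 'or' binds looser than 'and', mirroring typical expression grammars
    return any(
        all(eval_clause(c) for c in term.split(" and "))
        for term in condition.split(" or ")
    )

ctx = {"agent": "CurrencyConverterAgent", "tool": "ConvertCurrency"}
print(eval_condition("tool == 'ConvertCurrency'", ctx))                                   # True
print(eval_condition("tool == 'GetWeatherForecast' or tool == 'GetWeatherAlerts'", ctx))  # False
```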
&lt;H3&gt;Priority and conflict resolution&lt;/H3&gt;
&lt;P&gt;When multiple rules match, the toolkit evaluates by priority (higher number = higher priority). A deny rule at priority 100 will override an allow rule at priority 10. This lets you layer broad allows with specific denies:&lt;/P&gt;
&lt;PRE class="language-yaml" tabindex="0" contenteditable="false" data-lia-code-value="rules:
  - name: allow-all-weather-tools
    condition: &amp;quot;tool == 'GetWeatherForecast' or tool == 'GetWeatherAlerts'&amp;quot;
    action: allow
    priority: 10
  - name: block-during-maintenance
    condition: &amp;quot;tool == 'GetWeatherForecast'&amp;quot;
    action: deny
    priority: 100
    description: Temporarily block NWS calls during API maintenance
"&gt;&lt;CODE&gt;rules:
  - name: allow-all-weather-tools
    condition: "tool == 'GetWeatherForecast' or tool == 'GetWeatherAlerts'"
    action: allow
    priority: 10
  - name: block-during-maintenance
    condition: "tool == 'GetWeatherForecast'"
    action: deny
    priority: 100
    description: Temporarily block NWS calls during API maintenance
&lt;/CODE&gt;&lt;/PRE&gt;
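&lt;P&gt;The resolution logic itself is simple enough to sketch in a few lines of Python. This isn't the toolkit's code (the types here are hypothetical, and conditions are simplified to tool sets), just the semantics made concrete: among matching rules the highest priority wins, and with no match &lt;CODE&gt;defaultAction&lt;/CODE&gt; applies:&lt;/P&gt;

```python
# Illustrative sketch of priority-based conflict resolution. Hypothetical
# types: rule conditions are reduced to the set of tools they match.
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    tools: set      # tools this rule's condition matches
    action: str     # "allow" or "deny"
    priority: int   # higher number = higher priority

def decide(tool: str, rules: list, default_action: str = "deny") -> str:
    matching = [r for r in rules if tool in r.tools]
    if not matching:
        return default_action
    return max(matching, key=lambda r: r.priority).action

rules = [
    Rule("allow-all-weather-tools", {"GetWeatherForecast", "GetWeatherAlerts"}, "allow", 10),
    Rule("block-during-maintenance", {"GetWeatherForecast"}, "deny", 100),
]

print(decide("GetWeatherAlerts", rules))    # allow (only the priority-10 rule matches)
print(decide("GetWeatherForecast", rules))  # deny  (priority 100 overrides priority 10)
print(decide("DeleteDatabase", rules))      # deny  (no match, so defaultAction applies)
```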
&lt;H3&gt;Advanced: OPA Rego and Cedar&lt;/H3&gt;
&lt;P&gt;The YAML policy language handles most scenarios, but for teams with advanced needs, the toolkit also supports &lt;STRONG&gt;OPA Rego&lt;/STRONG&gt; and &lt;STRONG&gt;Cedar&lt;/STRONG&gt; policy languages. You can mix them — use YAML for simple rules and Rego for complex conditional logic:&lt;/P&gt;
&lt;PRE class="language-rego" tabindex="0" contenteditable="false" data-lia-code-value="# policies/advanced.rego — Example: time-based access control
package travel_planner.governance

default allow_tool_call = false

allow_tool_call {
    input.agent == &amp;quot;CurrencyConverterAgent&amp;quot;
    input.tool == &amp;quot;get_exchange_rate&amp;quot;
    time.weekday(time.now_ns()) != &amp;quot;Sunday&amp;quot;  # Markets closed
}
"&gt;&lt;CODE&gt;# policies/advanced.rego — Example: time-based access control
package travel_planner.governance

default allow_tool_call = false

allow_tool_call {
    input.agent == "CurrencyConverterAgent"
    input.tool == "get_exchange_rate"
    time.weekday(time.now_ns()) != "Sunday"  # Markets closed
}
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Start simple with YAML. Add complexity only when you need it.&lt;/P&gt;
&lt;H2&gt;Why App Service for governed agent workloads&lt;/H2&gt;
&lt;P&gt;You might be wondering: why does the hosting platform matter for governance? It matters a lot. The governance toolkit handles the &lt;EM&gt;application-level&lt;/EM&gt; policies, but a production agent system also needs &lt;EM&gt;platform-level&lt;/EM&gt; security, networking, identity, and deployment controls. App Service gives you these out of the box.&lt;/P&gt;
&lt;H3&gt;Managed Identity&lt;/H3&gt;
&lt;P&gt;Governance policies enforce &lt;EM&gt;what&lt;/EM&gt; agents can access. Managed Identity handles &lt;EM&gt;how&lt;/EM&gt; they authenticate — without secrets to manage, rotate, or leak. Our travel planner already uses &lt;CODE&gt;DefaultAzureCredential&lt;/CODE&gt; for Azure OpenAI, Cosmos DB, and Service Bus. Governance layers on top of this identity foundation.&lt;/P&gt;
&lt;H3&gt;VNet Integration + Private Endpoints&lt;/H3&gt;
&lt;P&gt;The governance toolkit enforces API allowlists at the application level. App Service's &lt;STRONG&gt;VNet integration and private endpoints&lt;/STRONG&gt; enforce network boundaries at the infrastructure level. This is defense in depth: even if a governance policy is misconfigured, the network layer prevents unauthorized egress. Your agents can only reach the networks you've explicitly allowed.&lt;/P&gt;
&lt;H3&gt;Easy Auth&lt;/H3&gt;
&lt;P&gt;App Service's built-in authentication (Easy Auth) protects your agent APIs without custom code. Before a request even reaches your governance engine, App Service has already validated the caller's identity. No custom auth middleware. No JWT parsing. Just toggle it on.&lt;/P&gt;
&lt;H3&gt;Deployment Slots&lt;/H3&gt;
&lt;P&gt;This is underrated for governance. With deployment slots, you can test new governance policies in a &lt;STRONG&gt;staging slot&lt;/STRONG&gt; before swapping to production. Deploy updated &lt;CODE&gt;governance-policies.yaml&lt;/CODE&gt; to staging, run your test suite, verify the policies work as expected, and &lt;EM&gt;then&lt;/EM&gt; swap. Zero-downtime policy updates with full rollback capability.&lt;/P&gt;
&lt;H3&gt;App Insights integration&lt;/H3&gt;
&lt;P&gt;Governance audit events flow into the &lt;STRONG&gt;same Application Insights&lt;/STRONG&gt; instance we configured in Blog 2. This means your governance decisions appear alongside your OTel traces in the Agents view. One pane of glass for agent behavior &lt;EM&gt;and&lt;/EM&gt; governance enforcement.&lt;/P&gt;
&lt;H3&gt;Always On + WebJobs&lt;/H3&gt;
&lt;P&gt;Our travel planner uses WebJobs for long-running agent workflows. With App Service's Always On feature, those workflows stay warm, and governance is continuous — no cold-start gaps where agents run unmonitored.&lt;/P&gt;
&lt;H3&gt;azd deployment&lt;/H3&gt;
&lt;P&gt;One command deploys the full governed stack — application code, governance policies, infrastructure, and monitoring:&lt;/P&gt;
&lt;PRE class="language-bash" tabindex="0" contenteditable="false" data-lia-code-value="azd up
"&gt;&lt;CODE&gt;azd up
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;App Service gives you the enterprise production features governance needs — identity, networking, observability, safe deployment — out of the box. The governance toolkit handles agent-level policy enforcement; App Service handles platform-level security. Together, they're a complete governed agent platform.&lt;/P&gt;
&lt;H2&gt;Governance audit events in App Insights&lt;/H2&gt;
&lt;P&gt;In Blog 2, we set up OpenTelemetry and the Application Insights Agents view to monitor agent behavior. With the governance toolkit, those same traces now include &lt;STRONG&gt;governance audit events&lt;/STRONG&gt; — every policy decision is recorded as a span attribute on the agent's trace.&lt;/P&gt;
&lt;!-- SCREENSHOT: App Insights Agents view showing governance audit events alongside OTel traces --&gt;
&lt;P&gt;When you open a trace in the Agents view, you'll see governance events inline:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Policy: api-allowlist → ALLOWED&lt;/STRONG&gt; — CurrencyConverterAgent called Frankfurter API, permitted&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Policy: token-budget → ALLOWED&lt;/STRONG&gt; — Request used 3,200 tokens, within per-request limit of 8,000&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Policy: rate-limit → THROTTLED&lt;/STRONG&gt; — WeatherAdvisorAgent exceeded 60 calls/min, request delayed&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;For deeper analysis, use KQL to query governance events directly. Here's a query that finds all policy violations in the last 24 hours:&lt;/P&gt;
&lt;PRE class="language-sql" tabindex="0" contenteditable="false" data-lia-code-value="// Find all governance policy violations in the last 24 hours
traces
| where timestamp &amp;gt; ago(24h)
| where customDimensions[&amp;quot;governance.decision&amp;quot;] != &amp;quot;ALLOWED&amp;quot;
| extend 
    agentName = tostring(customDimensions[&amp;quot;agent.name&amp;quot;]),
    policyName = tostring(customDimensions[&amp;quot;governance.policy&amp;quot;]),
    decision = tostring(customDimensions[&amp;quot;governance.decision&amp;quot;]),
    violationReason = tostring(customDimensions[&amp;quot;governance.reason&amp;quot;]),
    targetUrl = tostring(customDimensions[&amp;quot;tool.target_url&amp;quot;])
| project timestamp, agentName, policyName, decision, violationReason, targetUrl
| order by timestamp desc
"&gt;&lt;CODE&gt;// Find all governance policy violations in the last 24 hours
traces
| where timestamp &amp;gt; ago(24h)
| where customDimensions["governance.decision"] != "ALLOWED"
| extend 
    agentName = tostring(customDimensions["agent.name"]),
    policyName = tostring(customDimensions["governance.policy"]),
    decision = tostring(customDimensions["governance.decision"]),
    violationReason = tostring(customDimensions["governance.reason"]),
    targetUrl = tostring(customDimensions["tool.target_url"])
| project timestamp, agentName, policyName, decision, violationReason, targetUrl
| order by timestamp desc
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;!-- SCREENSHOT: KQL query for policy violations with results --&gt;
&lt;P&gt;And here's one for tracking token budget consumption across agents:&lt;/P&gt;
&lt;PRE class="language-sql" tabindex="0" contenteditable="false" data-lia-code-value="// Token budget consumption by agent over the last hour
customMetrics
| where timestamp &amp;gt; ago(1h)
| where name == &amp;quot;governance.tokens.consumed&amp;quot;
| extend agentName = tostring(customDimensions[&amp;quot;agent.name&amp;quot;])
| summarize 
    totalTokens = sum(value),
    avgTokensPerRequest = avg(value),
    maxTokensPerRequest = max(value)
    by agentName, bin(timestamp, 5m)
| order by totalTokens desc
"&gt;&lt;CODE&gt;// Token budget consumption by agent over the last hour
customMetrics
| where timestamp &amp;gt; ago(1h)
| where name == "governance.tokens.consumed"
| extend agentName = tostring(customDimensions["agent.name"])
| summarize 
    totalTokens = sum(value),
    avgTokensPerRequest = avg(value),
    maxTokensPerRequest = max(value)
    by agentName, bin(timestamp, 5m)
| order by totalTokens desc
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This is the power of integrating governance with your existing observability stack. You don't need a separate governance dashboard — everything lives in the same App Insights workspace you already know.&lt;/P&gt;
&lt;H2&gt;SRE for agents&lt;/H2&gt;
&lt;P&gt;The Agent SRE package brings Site Reliability Engineering practices to agent systems. This was the part that got me most excited, because it addresses a question I hear constantly: &lt;EM&gt;"How do I know my agents are actually reliable?"&lt;/EM&gt;&lt;/P&gt;
&lt;H3&gt;Service Level Objectives (SLOs)&lt;/H3&gt;
&lt;P&gt;We defined SLOs in our policy file:&lt;/P&gt;
&lt;PRE class="language-yaml" tabindex="0" contenteditable="false" data-lia-code-value="slos:
  - name: weather-agent-latency
    agent: &amp;quot;WeatherAdvisorAgent&amp;quot;
    metric: latency-p99
    target: 5000ms
    window: 5m
"&gt;&lt;CODE&gt;slos:
  - name: weather-agent-latency
    agent: "WeatherAdvisorAgent"
    metric: latency-p99
    target: 5000ms
    window: 5m
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This says: "The Weather Advisor Agent must respond within 5 seconds at the 99th percentile, measured over a 5-minute rolling window." When the SLO is breached, the toolkit emits an alert event and can trigger automated responses.&lt;/P&gt;
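&lt;P&gt;Under the hood, an SLO check like this amounts to a rolling window of latency samples plus a percentile comparison. Here's an illustrative Python sketch of those semantics (a nearest-rank p99 over a 5-minute window); it's not the toolkit's implementation, just the contract made explicit:&lt;/P&gt;

```python
# Illustrative sketch of a latency SLO: keep samples from a rolling window
# and compare the nearest-rank 99th percentile against the target.
import time

class LatencySlo:
    def __init__(self, target_ms: float, window_s: float = 300.0):
        self.target_ms = target_ms
        self.window_s = window_s
        self.samples = []  # list of (timestamp_s, latency_ms)

    def record(self, latency_ms: float, now: float = None):
        self.samples.append((now if now is not None else time.time(), latency_ms))

    def p99(self, now: float = None) -> float:
        now = now if now is not None else time.time()
        window = sorted(l for t, l in self.samples if now - t <= self.window_s)
        if not window:
            return 0.0
        # nearest-rank percentile: rank = ceil(0.99 * n), index = rank - 1
        idx = max(0, -(-99 * len(window) // 100) - 1)
        return window[idx]

    def breached(self, now: float = None) -> bool:
        return self.p99(now) > self.target_ms

slo = LatencySlo(target_ms=5000)
for _ in range(99):
    slo.record(1000, now=0.0)
slo.record(6000, now=0.0)
print(slo.p99(now=10.0), slo.breached(now=10.0))  # one slow call still fits in the 1% tail
```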
&lt;H3&gt;Circuit breakers&lt;/H3&gt;
&lt;P&gt;Circuit breakers prevent cascading failures. If an agent fails 5 times in a row, the circuit opens, and subsequent requests get a fast failure response instead of waiting for another timeout:&lt;/P&gt;
&lt;PRE class="language-yaml" tabindex="0" contenteditable="false" data-lia-code-value="circuit-breakers:
  - agent: &amp;quot;*&amp;quot;
    failure-threshold: 5
    recovery-timeout: 30s
    half-open-max-calls: 2
"&gt;&lt;CODE&gt;circuit-breakers:
  - agent: "*"
    failure-threshold: 5
    recovery-timeout: 30s
    half-open-max-calls: 2
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;After 30 seconds, the circuit enters a half-open state, allowing 2 test calls through. If those succeed, the circuit closes and normal operation resumes. If they fail, the circuit opens again. This pattern is battle-tested in microservices — now it protects your agents too.&lt;/P&gt;
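&lt;P&gt;The closed/open/half-open state machine described above is easy to sketch. This illustrative Python version mirrors the YAML settings (5 failures to open, 30-second recovery, 2 half-open probe calls); it's a single-threaded sketch of the pattern, not the toolkit's code:&lt;/P&gt;

```python
# Illustrative circuit-breaker state machine matching the YAML above.
# Simplified: one successful probe closes the circuit, and there is no
# locking, which a real concurrent implementation would need.

class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=30.0, half_open_max_calls=2):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.half_open_max_calls = half_open_max_calls
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0
        self.half_open_calls = 0

    def allow_request(self, now: float) -> bool:
        if self.state == "open":
            if now - self.opened_at >= self.recovery_timeout:
                self.state = "half-open"   # recovery window elapsed: try probes
                self.half_open_calls = 0
            else:
                return False               # fast failure instead of another timeout
        if self.state == "half-open":
            if self.half_open_calls >= self.half_open_max_calls:
                return False               # probe quota used up
            self.half_open_calls += 1
        return True

    def record_success(self):
        if self.state == "half-open":
            self.state = "closed"          # probe succeeded: resume normal operation
        self.failures = 0

    def record_failure(self, now: float):
        self.failures += 1
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = now
            self.failures = 0
```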
&lt;H3&gt;Error budgets&lt;/H3&gt;
&lt;P&gt;Error budgets tie SLOs to business decisions. If your Coordinator Agent's success rate target is 99.5% over a 15-minute window, that means you have an error budget of 0.5%. When the budget is consumed, the toolkit can automatically reduce agent autonomy — for example, requiring human approval for high-risk actions until the error budget recovers.&lt;/P&gt;
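&lt;P&gt;The arithmetic is worth making explicit. With a 99.5% success target, 0.5% of requests are your budget; dividing the observed failure rate by that budget tells you how much is left. A quick illustrative sketch:&lt;/P&gt;

```python
# Illustrative error-budget arithmetic for the example above. A 99.5%
# success-rate target leaves a 0.5% failure budget per window.

def error_budget_remaining(successes: int, failures: int, slo_target: float = 0.995) -> float:
    """Fraction of the window's error budget still unspent (negative = overspent)."""
    total = successes + failures
    if total == 0:
        return 1.0                        # nothing observed yet: full budget available
    budget = 1.0 - slo_target             # e.g. 0.5% of requests may fail
    failure_rate = failures / total
    return 1.0 - failure_rate / budget

# 1000 requests, 2 failures: 0.2% failure rate against a 0.5% budget
print(error_budget_remaining(998, 2))     # ~0.6: 60% of the budget remains
# 1000 requests, 6 failures: budget overspent, time to require human approval
print(error_budget_remaining(994, 6))     # ~-0.2
```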
&lt;!-- SCREENSHOT: App Insights dashboard showing agent SLO compliance --&gt;
&lt;P&gt;SRE practices turn agent reliability from a hope into a measurable, enforceable contract.&lt;/P&gt;
&lt;H2&gt;Architecture&lt;/H2&gt;
&lt;P&gt;Here's how everything fits together after adding governance:&lt;/P&gt;
&lt;!-- SCREENSHOT: Architecture diagram showing User → Agent → Governance Policy Engine → Approved Actions → External APIs, with App Service features (Managed Identity, VNet, App Insights) called out --&gt;
&lt;PRE class="language-text" tabindex="0" contenteditable="false" data-lia-code-value="
┌─────────────────────────────────────────────────────────────────┐
│                     Azure App Service                           │
│  ┌──────────────┐    ┌─────────────────────────────────────┐    │
│  │   Frontend   │───▶│           ASP.NET Core API          │    │
│  │   (Static)   │    │                                     │    │
│  └──────────────┘    │  ┌─────────────────────────────┐    │    │
│                      │  │     Coordinator Agent       │    │    │
│                      │  │  ┌───────┐  ┌────────────┐  │    │    │
│                      │  │  │ OTel  │─▶│ Governance │  │    │    │
│                      │  │  └───────┘  │   Engine   │  │    │    │
│                      │  │             │ ┌────────┐ │  │    │    │
│                      │  │             │ │Policies│ │  │    │    │
│                      │  │             │ └────────┘ │  │    │    │
│                      │  │             └─────┬──────┘  │    │    │
│                      │  └───────────────────┼─────────┘    │    │
│                      │  ┌───────────────────┼──────────┐   │    │
│                      │  │  Specialist Agents │         │   │    │
│                      │  │  (Currency, Weather, etc.)   │   │    │
│                      │  │  Each with OTel + Governance │   │    │
│                      │  └───────────────────┼──────────┘   │    │
│                      └──────────────────────┼──────────────┘    │
│                                             │                   │
│  ┌────────────┐  ┌───────────┐  ┌───────────┼─────────┐         │
│  │  Managed   │  │   VNet    │  │ App Insights        │         │
│  │  Identity  │  │Integration│  │ (Traces +           │         │
│  │ (no keys)  │  │(network   │  │  Governance Audit)  │         │
│  │            │  │ boundary) │  │                     │         │
│  └────────────┘  └───────────┘  └─────────────────────┘         │
└──────────────────────────────┬──────────────────────────────────┘
                               │ Only allowed APIs
                               ▼
                    ┌──────────────────────┐
                    │   External APIs      │
                    │  ✅ Frankfurter API  │
                    │  ✅ NWS Weather API  │
                    │  ❌ Everything else  │
                    └──────────────────────┘
"&gt;&lt;CODE&gt;
┌─────────────────────────────────────────────────────────────────┐
│                     Azure App Service                           │
│  ┌──────────────┐    ┌─────────────────────────────────────┐    │
│  │   Frontend   │───▶│           ASP.NET Core API          │    │
│  │   (Static)   │    │                                     │    │
│  └──────────────┘    │  ┌─────────────────────────────┐    │    │
│                      │  │     Coordinator Agent       │    │    │
│                      │  │  ┌───────┐  ┌────────────┐  │    │    │
│                      │  │  │ OTel  │─▶│ Governance │  │    │    │
│                      │  │  └───────┘  │   Engine   │  │    │    │
│                      │  │             │ ┌────────┐ │  │    │    │
│                      │  │             │ │Policies│ │  │    │    │
│                      │  │             │ └────────┘ │  │    │    │
│                      │  │             └─────┬──────┘  │    │    │
│                      │  └───────────────────┼─────────┘    │    │
│                      │  ┌───────────────────┼──────────┐   │    │
│                      │  │  Specialist Agents │         │   │    │
│                      │  │  (Currency, Weather, etc.)   │   │    │
│                      │  │  Each with OTel + Governance │   │    │
│                      │  └───────────────────┼──────────┘   │    │
│                      └──────────────────────┼──────────────┘    │
│                                             │                   │
│  ┌────────────┐  ┌───────────┐  ┌───────────┼─────────┐         │
│  │  Managed   │  │   VNet    │  │ App Insights        │         │
│  │  Identity  │  │Integration│  │ (Traces +           │         │
│  │ (no keys)  │  │(network   │  │  Governance Audit)  │         │
│  │            │  │ boundary) │  │                     │         │
│  └────────────┘  └───────────┘  └─────────────────────┘         │
└──────────────────────────────┬──────────────────────────────────┘
                               │ Only allowed APIs
                               ▼
                    ┌──────────────────────┐
                    │   External APIs      │
                    │  ✅ Frankfurter API  │
                    │  ✅ NWS Weather API  │
                    │  ❌ Everything else  │
                    └──────────────────────┘
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The key insight: governance is a &lt;STRONG&gt;transparent layer&lt;/STRONG&gt; in the agent pipeline. It sits between the agent's decision and the action's execution. The agent code doesn't know or care about governance — it just builds the agent with &lt;CODE&gt;.UseGovernance()&lt;/CODE&gt; and the policy engine handles the rest.&lt;/P&gt;
&lt;H2&gt;Bring it to your own agents&lt;/H2&gt;
&lt;P&gt;We've shown governance with Microsoft Agent Framework on .NET, but the toolkit is &lt;STRONG&gt;framework-agnostic&lt;/STRONG&gt;. Here's how to add it to other popular frameworks:&lt;/P&gt;
&lt;H3&gt;LangChain (Python)&lt;/H3&gt;
&lt;PRE class="language-python" tabindex="0" contenteditable="false" data-lia-code-value="from agent_governance import PolicyEngine, GovernanceCallbackHandler

policy_engine = PolicyEngine.from_yaml(&amp;quot;governance-policies.yaml&amp;quot;)

# Add governance as a LangChain callback handler
agent = create_react_agent(
    llm=llm,
    tools=tools,
    callbacks=[GovernanceCallbackHandler(policy_engine)]
)
"&gt;&lt;CODE&gt;from agent_governance import PolicyEngine, GovernanceCallbackHandler

policy_engine = PolicyEngine.from_yaml("governance-policies.yaml")

# Add governance as a LangChain callback handler
agent = create_react_agent(
    llm=llm,
    tools=tools,
    callbacks=[GovernanceCallbackHandler(policy_engine)]
)
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H3&gt;CrewAI (Python)&lt;/H3&gt;
&lt;PRE class="language-python" tabindex="0" contenteditable="false" data-lia-code-value="from agent_governance import PolicyEngine
from agent_governance.integrations.crewai import GovernanceTaskDecorator

policy_engine = PolicyEngine.from_yaml(&amp;quot;governance-policies.yaml&amp;quot;)

# Add governance as a CrewAI task decorator
@GovernanceTaskDecorator(policy_engine)
def research_task(agent, context):
    return agent.execute(context)
"&gt;&lt;CODE&gt;from agent_governance import PolicyEngine
from agent_governance.integrations.crewai import GovernanceTaskDecorator

policy_engine = PolicyEngine.from_yaml("governance-policies.yaml")

# Add governance as a CrewAI task decorator
@GovernanceTaskDecorator(policy_engine)
def research_task(agent, context):
    return agent.execute(context)
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H3&gt;Google ADK (Python)&lt;/H3&gt;
&lt;PRE class="language-python" tabindex="0" contenteditable="false" data-lia-code-value="from agent_governance import PolicyEngine
from agent_governance.integrations.google_adk import GovernancePlugin

policy_engine = PolicyEngine.from_yaml(&amp;quot;governance-policies.yaml&amp;quot;)

# Add governance as a Google ADK plugin
agent = Agent(
    model=&amp;quot;gemini-2.0-flash&amp;quot;,
    tools=[...],
    plugins=[GovernancePlugin(policy_engine)]
)
"&gt;&lt;CODE&gt;from agent_governance import PolicyEngine
from agent_governance.integrations.google_adk import GovernancePlugin

policy_engine = PolicyEngine.from_yaml("governance-policies.yaml")

# Add governance as a Google ADK plugin
agent = Agent(
    model="gemini-2.0-flash",
    tools=[...],
    plugins=[GovernancePlugin(policy_engine)]
)
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H3&gt;TypeScript / Node.js&lt;/H3&gt;
&lt;PRE class="language-typescript" tabindex="0" contenteditable="false" data-lia-code-value="import { PolicyEngine } from '@microsoft/agentmesh-sdk';

const policyEngine = PolicyEngine.fromYaml('governance-policies.yaml');

// Use as middleware in your agent pipeline
agent.use(policyEngine.middleware());
"&gt;&lt;CODE&gt;import { PolicyEngine } from '@microsoft/agentmesh-sdk';

const policyEngine = PolicyEngine.fromYaml('governance-policies.yaml');

// Use as middleware in your agent pipeline
agent.use(policyEngine.middleware());
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Every integration hooks into the framework's native extension points — callbacks, decorators, plugins, middleware — so adding governance doesn't require rewriting your agent code. Install the package, point it at your policy file, and you're governed.&lt;/P&gt;
&lt;H2&gt;What's next&lt;/H2&gt;
&lt;P&gt;This wraps up our three-part series on building production-ready multi-agent AI applications on Azure App Service:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/build-multi-agent-ai-apps-on-azure-app-service-with-microsoft-agent-framework-1-/4510017" data-lia-auto-title="Blog 1: Build" data-lia-auto-title-active="0" target="_blank"&gt;Blog 1: Build&lt;/A&gt;&lt;/STRONG&gt; — Deploy a multi-agent travel planner with Microsoft Agent Framework 1.0&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/monitor-ai-agents-on-app-service-with-opentelemetry-and-the-new-application-insi/4510023" data-lia-auto-title="Blog 2: Monitor" data-lia-auto-title-active="0" target="_blank"&gt;Blog 2: Monitor&lt;/A&gt;&lt;/STRONG&gt; — Add observability with OpenTelemetry and the Application Insights Agents view&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Blog 3: Govern&lt;/STRONG&gt; — Secure agents for production with the Agent Governance Toolkit (you are here)&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;The progression is intentional: first make it work, then make it visible, then make it safe. And the consistent theme across all three parts is that &lt;STRONG&gt;App Service makes each step easier&lt;/STRONG&gt; — managed hosting for Blog 1, integrated monitoring for Blog 2, and platform-level security features for Blog 3.&lt;/P&gt;
&lt;H3&gt;Next steps for your agents&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;A class="lia-external-url" href="https://github.com/microsoft/agent-governance-toolkit" target="_blank"&gt;Explore the Agent Governance Toolkit&lt;/A&gt;&lt;/STRONG&gt; — star the repo, browse the 20 tutorials, try the demo&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Customize policies for your compliance needs&lt;/STRONG&gt; — start with our YAML template and adapt it to your domain. Healthcare teams: enable HIPAA mappings. Finance teams: add SOC2 controls.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Explore Agent Mesh for multi-agent trust&lt;/STRONG&gt; — if you have agents communicating across services or trust boundaries, Agent Mesh's cryptographic identity and trust scoring add another layer of defense&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Deploy the sample&lt;/STRONG&gt; — clone our &lt;A class="lia-external-url" href="https://github.com/Azure-Samples/app-service-agent-otel" target="_blank"&gt;travel planner repo&lt;/A&gt;, run &lt;CODE&gt;azd up&lt;/CODE&gt;, and see governed agents in action&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;AI agents are becoming autonomous decision-makers in high-stakes domains. The question isn't &lt;EM&gt;whether&lt;/EM&gt; we need governance — it's whether we build it proactively, before incidents force our hand. With the Agent Governance Toolkit and Azure App Service, you can add production governance to your agents today. In about 30 minutes.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Apr 2026 21:15:44 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/govern-ai-agents-on-app-service-with-the-microsoft-agent/ba-p/4510962</guid>
      <dc:creator>jordanselig</dc:creator>
      <dc:date>2026-04-13T21:15:44Z</dc:date>
    </item>
    <item>
      <title>Introducing Wildcard Roles in Azure Web PubSub: simpler, smarter permissions for real-time apps</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/introducing-wildcard-roles-in-azure-web-pubsub-simpler-smarter/ba-p/4509524</link>
      <description>&lt;P&gt;Real-time interactivity is now a baseline expectation across industries, from collaborative dashboards to trading platforms, IoT monitoring, and live data visualizations. Developers need a way to broadcast data instantly to connected clients without worrying about connection management, scaling, or infrastructure.&lt;/P&gt;
&lt;P&gt;That’s where &lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/azure-web-pubsub/overview" target="_blank" rel="noopener"&gt;Azure Web PubSub&lt;/A&gt; comes in. It provides a fully managed service that enables real-time messaging over WebSocket. Your applications can send and receive live updates instantly, without managing servers or message fan-out manually.&lt;/P&gt;
&lt;P&gt;Now Azure Web PubSub is introducing a new capability that makes permission management simpler and more scalable: &lt;STRONG&gt;using wildcard patterns to define client permissions in groups&lt;/STRONG&gt;.&lt;/P&gt;
&lt;H2&gt;Understanding Azure Web PubSub&lt;/H2&gt;
&lt;img alt="Azure Web PubSub architecture showing client connections and message flow between backend and clients through the service. The dashed line indicates stateless API calls, while the solid line indicates persistent, long-lived WebSocket connections." /&gt;
&lt;P&gt;Azure Web PubSub allows you to add real-time capabilities to your app. At a high level:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Your backend generates a client access token&amp;nbsp;and hands it to a connecting client.&lt;/LI&gt;
&lt;LI&gt;The client connects to Azure Web PubSub over WebSocket using that token.&lt;/LI&gt;
&lt;LI&gt;Once connected, both the backend and client can send and receive messages through the service.&lt;/LI&gt;
&lt;LI&gt;Clients can be organized into&amp;nbsp;groups (for example, all clients in the same trading room or dashboard).&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Groups allow you to target specific audiences efficiently: sending messages to all users in `dashboard.operations` or receiving updates from `market.NASDAQ.MSFT`.&lt;/P&gt;
&lt;P&gt;To maintain security, every token defines a set of &lt;STRONG&gt;roles&lt;/STRONG&gt; that specify what the client can do. The code below illustrates what your backend specifies when generating a client access token for the scenarios mentioned above.&lt;/P&gt;
&lt;LI-CODE lang="javascript"&gt;// Arguments omitted for simplicity
const WebPubSubServiceClient = new WebPubSubServiceClient();
WebPubSubServiceClient.getClientAccessToken({ 
roles: [ 
"webpubsub.joinLeaveGroup.dashboard.operations",
"webpubsub.sendToGroups.dashboard.operations",
"webpubsub.sendToGroups.market.NASDAQ.MSFT",
],
});&lt;/LI-CODE&gt;
&lt;H2&gt;The Current Permission Model: literal roles&lt;/H2&gt;
&lt;P&gt;Until now, Azure Web PubSub used &lt;STRONG&gt;literal group roles&lt;/STRONG&gt; to define client permissions precisely.&lt;/P&gt;
&lt;P&gt;For example:&lt;/P&gt;
&lt;LI-CODE lang="javascript"&gt;roles: ["webpubsub.joinLeaveGroup.room123", "webpubsub.sendToGroup.room123"];&lt;/LI-CODE&gt;
&lt;P&gt;These roles are clear and secure: the client can only join and send to a specific group, `room123`.&lt;/P&gt;
&lt;P&gt;However, as your application scales and dynamically creates many groups — for example, hundreds of trading accounts, projects, or classrooms — issuing one role per group becomes cumbersome. Your backend needs to:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Track all the groups a user is authorized for&lt;/LI&gt;
&lt;LI&gt;Generate large tokens containing many role strings&lt;/LI&gt;
&lt;LI&gt;Refresh those tokens every time group access changes&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;The New Capability: wildcard patterns for group roles&lt;/H2&gt;
&lt;P&gt;Wildcard roles let you express permissions using patterns instead of individual hardcoded names. With a single role, you can authorize access to many related groups.&lt;/P&gt;
&lt;P&gt;For example:&lt;/P&gt;
&lt;LI-CODE lang="javascript"&gt;roles: ["webpubsub.joinLeaveGroups.room.*", "webpubsub.sendToGroups.room.*"];&lt;/LI-CODE&gt;
&lt;P&gt;This allows the client to join or send messages to any group whose name starts with `room.`.&lt;/P&gt;
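&lt;P&gt;To make the matching concrete, here is an illustrative sketch (our own simplification for intuition, not the service's implementation; see the documentation linked below for the authoritative pattern semantics) of how a trailing &lt;CODE&gt;.*&lt;/CODE&gt; behaves as a prefix match:&lt;/P&gt;

```javascript
// Illustrative only: a simplified model of wildcard-role matching.
// A trailing ".*" authorizes any group whose name starts with the prefix;
// a pattern without "*" must match the group name exactly.
function roleAuthorizes(pattern, groupName) {
  if (pattern.endsWith(".*")) {
    // Keep the trailing dot so "room.*" matches "room.123" but not "roomette"
    return groupName.startsWith(pattern.slice(0, -1));
  }
  return pattern === groupName;
}

console.log(roleAuthorizes("room.*", "room.123")); // true
console.log(roleAuthorizes("room.*", "roomette")); // false
console.log(roleAuthorizes("room123", "room123")); // true
```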
&lt;H2&gt;Real-World Examples&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="width: 100%; border-width: 1px;"&gt;&lt;colgroup&gt;&lt;col style="width: 33.3333%" /&gt;&lt;col style="width: 33.3333%" /&gt;&lt;col style="width: 33.3333%" /&gt;&lt;/colgroup&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Industry&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Example&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Benefit of wildcard roles&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Finance&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Risk monitoring bots subscribing to all trading accounts&lt;/td&gt;&lt;td&gt;One role covers `account:*` groups&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Gaming&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Matchmaking service observing all `lobby:*` rooms&lt;/td&gt;&lt;td&gt;Simplifies admin tools&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Education&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Teacher dashboard viewing all `class:*` groups&lt;/td&gt;&lt;td&gt;Fewer roles, easier permission management&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Collaboration&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Logging all messages across `project:*` for auditing purpose&lt;/td&gt;&lt;td&gt;Centralized monitoring without large tokens&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/azure-web-pubsub/concept-wildcard-group-roles" target="_blank" rel="noopener"&gt;Read the documentation for all supported wildcard patterns&lt;/A&gt;.&lt;/P&gt;
&lt;H2&gt;Why This Matters for Developers&lt;/H2&gt;
&lt;P&gt;Wildcard roles simplify the permission model for dynamic or large-scale systems:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Simpler token management&lt;/STRONG&gt;&amp;nbsp;– You no longer need to issue or refresh tokens every time a client’s group list changes.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Smaller tokens&lt;/STRONG&gt; – One pattern replaces many literal roles, reducing token size.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Dynamic authorization&lt;/STRONG&gt;&amp;nbsp;– When permissions for a client change (for example, they’re assigned to new groups that match existing patterns), there’s no need to regenerate tokens.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Deep Dive: financial trading platform&lt;/H2&gt;
&lt;P&gt;Let’s look at how wildcard roles can simplify real-time event management in a trading platform where financial assets, like stocks, are traded. &lt;A class="lia-external-url" href="https://github.com/Azure/azure-webpubsub/tree/main/samples/javascript/wildcard-trading" target="_blank" rel="noopener"&gt;See the complete code here.&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;The Setup&lt;/H3&gt;
&lt;P&gt;The platform includes:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;A &lt;STRONG&gt;trading dashboard&lt;/STRONG&gt;&amp;nbsp;for managers of a trading team&lt;/LI&gt;
&lt;LI&gt;A &lt;STRONG&gt;trading dashboard&lt;/STRONG&gt;&amp;nbsp;for human traders on a trading team&lt;/LI&gt;
&lt;LI&gt;Two risk analysis bots:
&lt;UL&gt;
&lt;LI&gt;A &lt;STRONG&gt;hardcoded risk bot&lt;/STRONG&gt;&amp;nbsp;that applies strict predefined rules&lt;/LI&gt;
&lt;LI&gt;An&amp;nbsp;&lt;STRONG&gt;LLM-based risk bot&lt;/STRONG&gt; that uses AI to detect unusual behavior&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;On the platform, each trading account, managed by one or more human traders, is given its own Web PubSub groups, alongside shared market-data groups:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;`account.1234.trades` – Trade updates&lt;/LI&gt;
&lt;LI&gt;`account.1234.orders` – Order events&lt;/LI&gt;
&lt;LI&gt;`market.NYSE` – Market data&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The backend publishes events to these groups whenever a new order or trade occurs, and clients subscribed to these groups receive real-time data.&lt;/P&gt;
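&lt;P&gt;As a sketch, the backend's publish path might look like the following. The group-naming helper mirrors the convention above; &lt;CODE&gt;group().sendToAll()&lt;/CODE&gt; is from the &lt;CODE&gt;@azure/web-pubsub&lt;/CODE&gt; server SDK, and the hub name is an assumption for illustration:&lt;/P&gt;

```javascript
// Sketch of the backend publisher. The helper mirrors the per-account
// group-naming convention used throughout this post.
function tradeGroup(accountId) {
  return `account.${accountId}.trades`;
}

async function publishTrade(serviceClient, accountId, trade) {
  // serviceClient: a WebPubSubServiceClient (from @azure/web-pubsub) for a
  // hub such as "trading". sendToAll fans the event out to every client
  // that has joined the group.
  await serviceClient.group(tradeGroup(accountId)).sendToAll(trade);
}

console.log(tradeGroup("1234")); // "account.1234.trades"
```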
&lt;H3&gt;Before: literal roles&lt;/H3&gt;
&lt;P&gt;Previously, each risk bot would need literal roles for every account:&lt;/P&gt;
&lt;LI-CODE lang="javascript"&gt;roles: [
"webpubsub.joinLeaveGroup.account.1234.trades",
"webpubsub.joinLeaveGroup.account.5678.trades",
"webpubsub.joinLeaveGroup.account.9012.orders",
];&lt;/LI-CODE&gt;
&lt;P&gt;If new accounts were created, new tokens had to be issued to include their roles.&lt;/P&gt;
&lt;H3&gt;Now: wildcard roles&lt;/H3&gt;
&lt;P&gt;With the new feature, each risk bot can receive a single, compact token:&lt;/P&gt;
&lt;LI-CODE lang="javascript"&gt;roles: [
"webpubsub.joinLeaveGroups.account.*",
"webpubsub.joinLeaveGroups.market.*",
];&lt;/LI-CODE&gt;
&lt;P&gt;Now, the bot automatically gains access to all existing and &lt;STRONG&gt;future&lt;/STRONG&gt;&amp;nbsp;account and market groups matching those patterns, &lt;STRONG&gt;without any token regeneration&lt;/STRONG&gt;.&lt;/P&gt;
&lt;H2&gt;How the Risk Bots Work&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="width: 100%; height: 140px; border-width: 1px;"&gt;&lt;colgroup&gt;&lt;col style="width: 50%" /&gt;&lt;col style="width: 50%" /&gt;&lt;/colgroup&gt;&lt;tbody&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;&lt;STRONG&gt;Component&lt;/STRONG&gt;&lt;/td&gt;&lt;td style="height: 35px;"&gt;&lt;STRONG&gt;Behavior&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;&lt;STRONG&gt;Hardcoded risk bot&lt;/STRONG&gt;&lt;/td&gt;&lt;td style="height: 35px;"&gt;Implements deterministic rules: e.g., if position size &amp;gt; 100 of a company's stock, trigger alert.&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;&lt;STRONG&gt;LLM risk bot&lt;/STRONG&gt;&lt;/td&gt;&lt;td style="height: 35px;"&gt;Uses AI models to identify anomalies, fraudulent behavior, or market stress.&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;&lt;STRONG&gt;Backend publisher&lt;/STRONG&gt;&lt;/td&gt;&lt;td style="height: 35px;"&gt;Emits order and trade events to `account:*` and `market:*` groups.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;When a trade event is published:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Both bots receive it in real time through wildcard subscriptions.&lt;/LI&gt;
&lt;LI&gt;Each evaluates the event differently.&lt;/LI&gt;
&lt;LI&gt;If a risk is detected, they publish an alert to `alerts.risk.*`.&lt;/LI&gt;
&lt;/OL&gt;
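&lt;P&gt;Step 1 of this flow can be sketched as follows. The &lt;CODE&gt;joinGroup&lt;/CODE&gt; and &lt;CODE&gt;on("group-message", ...)&lt;/CODE&gt; calls are modeled on the &lt;CODE&gt;@azure/web-pubsub-client&lt;/CODE&gt; SDK; the in-memory fake client and the 100-share rule are purely illustrative:&lt;/P&gt;

```javascript
// Sketch of a risk bot's subscription loop. Its wildcard role
// ("webpubsub.joinLeaveGroups.account.*") authorizes the joins; the bot
// still joins each concrete group it wants to observe.
async function watchAccounts(client, accountIds, onRisk) {
  for (const id of accountIds) {
    await client.joinGroup(`account.${id}.trades`);
    await client.joinGroup(`account.${id}.orders`);
  }
  client.on("group-message", (e) => {
    // Evaluate each event; hardcoded rules or an LLM call would go here.
    if (e.message.data.quantity > 100) onRisk(e.message.data);
  });
}

// Minimal in-memory stand-in for a WebPubSubClient, for illustration:
const joined = [];
const handlers = {};
const fakeClient = {
  joinGroup: async (g) => joined.push(g),
  on: (event, cb) => (handlers[event] = cb),
};

watchAccounts(fakeClient, ["1234"], (t) => console.log("ALERT", t)).then(() => {
  handlers["group-message"]({ message: { data: { quantity: 250 } } });
  console.log(joined); // [ 'account.1234.trades', 'account.1234.orders' ]
});
```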
&lt;P&gt;&lt;STRONG&gt;Traders&lt;/STRONG&gt; still receive messages for only their specific account group — using literal roles to ensure isolation:&lt;/P&gt;
&lt;LI-CODE lang="javascript"&gt;roles: ["webpubsub.joinLeaveGroup.account.1234.trades"];&lt;/LI-CODE&gt;
&lt;P&gt;This demonstrates a clean separation:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Automation and monitoring&lt;/STRONG&gt; use wildcard roles for flexibility.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;End users&lt;/STRONG&gt; use literal roles for strict access control.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Developer Experience: cleaner and more scalable&lt;/H2&gt;
&lt;P&gt;With wildcard roles, developers can design real-time architectures that are both expressive and efficient:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Simplified token issuance&lt;/LI&gt;
&lt;LI&gt;Reduced backend logic for permission changes&lt;/LI&gt;
&lt;LI&gt;Better scalability for dynamic environments&lt;/LI&gt;
&lt;LI&gt;Flexible system-level actors (bots, dashboards, monitors)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Together, these improvements reduce operational complexity while keeping access control transparent and secure. Whether you’re building a trading platform, a game server, or a collaborative dashboard, this new capability helps you scale real-time systems with less friction and more control.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Apr 2026 07:11:33 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/introducing-wildcard-roles-in-azure-web-pubsub-simpler-smarter/ba-p/4509524</guid>
      <dc:creator>kevinguo</dc:creator>
      <dc:date>2026-04-13T07:11:33Z</dc:date>
    </item>
    <item>
      <title>PHP 8.5 is now available on Azure App Service for Linux</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/php-8-5-is-now-available-on-azure-app-service-for-linux/ba-p/4510254</link>
      <description>&lt;P&gt;PHP 8.5 is now available on Azure App Service for Linux across all public regions. You can create a new PHP 8.5 app through the Azure portal, automate it with the Azure CLI, or deploy using ARM/Bicep templates.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PHP 8.5 brings several useful runtime improvements. It includes&amp;nbsp;&lt;STRONG&gt;better diagnostics&lt;/STRONG&gt;, with fatal errors now providing a backtrace, which can make troubleshooting easier. It also adds the&amp;nbsp;&lt;STRONG&gt;pipe operator (|&amp;gt;)&lt;/STRONG&gt;&amp;nbsp;for cleaner, more readable code, along with broader improvements in syntax, performance, and type safety. You can take advantage of these improvements while continuing to use the deployment and management experience you already know in App Service.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For the full list of features, deprecations, and migration notes, see the official PHP 8.5 release page:&amp;nbsp;&lt;A class="lia-external-url" href="https://www.php.net/releases/8.5/en.php" target="_blank"&gt;https://www.php.net/releases/8.5/en.php&lt;/A&gt;&lt;/P&gt;
&lt;H3&gt;Getting started&lt;/H3&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/app-service/quickstart-php?tabs=cli&amp;amp;pivots=platform-linux" target="_blank"&gt;Create a PHP web app in Azure App Service&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/app-service/configure-language-php?pivots=platform-linux" target="_blank"&gt;Configure a PHP app for Azure App Service&lt;/A&gt;&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Fri, 10 Apr 2026 10:11:11 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/php-8-5-is-now-available-on-azure-app-service-for-linux/ba-p/4510254</guid>
      <dc:creator>TulikaC</dc:creator>
      <dc:date>2026-04-10T10:11:11Z</dc:date>
    </item>
    <item>
      <title>A simpler way to deploy your code to Azure App Service for Linux</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/a-simpler-way-to-deploy-your-code-to-azure-app-service-for-linux/ba-p/4510240</link>
      <description>&lt;P&gt;We’ve added a new deployment experience for Azure App Service for Linux that makes it easier to get your code running on your web app.&lt;/P&gt;
&lt;P&gt;To get started, go to the Kudu/SCM site for your app:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;&amp;lt;sitename&amp;gt;.scm.azurewebsites.net&lt;/LI-CODE&gt;
&lt;P&gt;From there, open the new&amp;nbsp;&lt;STRONG&gt;Deployments&lt;/STRONG&gt; experience.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;You can now deploy your app by simply dragging and dropping a zip file containing your code. Once your file is uploaded, App Service shows you the contents of the zip so you can quickly verify what you’re about to deploy.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;If your application is already built and ready to run, you also have the option to&amp;nbsp;&lt;STRONG&gt;skip server-side build&lt;/STRONG&gt;. Otherwise, App Service can handle the build step for you.&lt;/P&gt;
&lt;P&gt;When you’re ready, select&amp;nbsp;&lt;STRONG&gt;Deploy&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;From there, the deployment starts right away, and you can follow each phase of the process as it happens. The experience shows clear progress through upload, build, and deployment, along with deployment logs to help you understand what’s happening behind the scenes.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;After the deployment succeeds, you can also view&amp;nbsp;&lt;STRONG&gt;runtime logs&lt;/STRONG&gt;, which makes it easier to confirm that your app has started successfully.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;This experience is ideal if you’re getting started with Azure App Service and want the quickest path from code to a running app. For production workloads and teams with established release processes, you’ll typically continue using an automated CI/CD pipeline (for example, GitHub Actions or Azure DevOps) for repeatable deployments.&lt;/P&gt;
&lt;P&gt;We’re continuing to improve the developer experience on App Service for Linux. Give it a try and let us know what you think.&lt;/P&gt;
</description>
      <pubDate>Fri, 10 Apr 2026 09:48:40 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/a-simpler-way-to-deploy-your-code-to-azure-app-service-for-linux/ba-p/4510240</guid>
      <dc:creator>TulikaC</dc:creator>
      <dc:date>2026-04-10T09:48:40Z</dc:date>
    </item>
    <item>
      <title>Monitor AI Agents on App Service with OpenTelemetry and the New Application Insights Agents View</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/monitor-ai-agents-on-app-service-with-opentelemetry-and-the-new/ba-p/4510023</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;STRONG&gt;Part 2 of 3:&lt;/STRONG&gt; In &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/build-multi-agent-ai-apps-on-azure-app-service-with-microsoft-agent-framework-1-/4510017" target="_blank" rel="noopener" data-lia-auto-title="Blog 1" data-lia-auto-title-active="0"&gt;Blog 1&lt;/A&gt;, we deployed a multi-agent travel planner on Azure App Service using the Microsoft Agent Framework (MAF) 1.0 GA. This post dives deep into how we instrumented those agents with OpenTelemetry and lit up the brand-new &lt;STRONG&gt;Agents (Preview)&lt;/STRONG&gt; view in Application Insights.&lt;/BLOCKQUOTE&gt;
&lt;BLOCKQUOTE&gt;&lt;STRONG&gt;📋 Prerequisite:&lt;/STRONG&gt; This post assumes you've followed the guidance in &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/build-multi-agent-ai-apps-on-azure-app-service-with-microsoft-agent-framework-1-/4510017" target="_blank" rel="noopener" data-lia-auto-title="Blog 1" data-lia-auto-title-active="0"&gt;Blog 1&lt;/A&gt; to deploy the multi-agent travel planner to Azure App Service. If you haven't deployed the app yet, start there first — you'll need a running App Service with the agents, Service Bus, Cosmos DB, and Azure OpenAI provisioned before the monitoring steps in this post will work.&lt;/BLOCKQUOTE&gt;
&lt;!-- SCREENSHOT: Banner image of the Agents (Preview) view in Application Insights showing the travel planner agents --&gt;
&lt;H2&gt;Deploying Agents Is Only Half the Battle&lt;/H2&gt;
&lt;P&gt;In Blog 1, we walked through deploying a multi-agent travel planning application on Azure App Service. Six specialized agents — a Coordinator, Currency Converter, Weather Advisor, Local Knowledge Expert, Itinerary Planner, and Budget Optimizer — work together to generate comprehensive travel plans. The architecture uses an ASP.NET Core API backed by a WebJob for async processing, Azure Service Bus for messaging, and Azure OpenAI for the brains.&lt;/P&gt;
&lt;P&gt;But here's the thing: deploying agents to production is only half the battle. Once they're running, you need answers to questions like:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Which agent is consuming the most tokens?&lt;/LI&gt;
&lt;LI&gt;How long does the Itinerary Planner take compared to the Weather Advisor?&lt;/LI&gt;
&lt;LI&gt;Is the Coordinator making too many LLM calls per workflow?&lt;/LI&gt;
&lt;LI&gt;When something goes wrong, which agent in the pipeline failed?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Traditional APM gives you HTTP latencies and exception rates. That's table stakes. For AI agents, you need to see &lt;EM&gt;inside the agent&lt;/EM&gt; — the model calls, the tool invocations, the token spend. And that's exactly what Application Insights' new &lt;STRONG&gt;Agents (Preview)&lt;/STRONG&gt; view delivers, powered by OpenTelemetry and the GenAI semantic conventions.&lt;/P&gt;
&lt;P&gt;Let's break down how it all works.&lt;/P&gt;
&lt;H2&gt;The Agents (Preview) View in Application Insights&lt;/H2&gt;
&lt;P&gt;Azure Application Insights now includes a dedicated&amp;nbsp;&lt;STRONG&gt;Agents (Preview)&lt;/STRONG&gt; blade that provides unified monitoring purpose-built for AI agents. It's not just a generic dashboard — it understands agent concepts natively. Whether your agents are built with Microsoft Agent Framework, Azure AI Foundry, Copilot Studio, or a third-party framework, this view lights up as long as your telemetry follows the &lt;A class="lia-external-url" href="https://opentelemetry.io/docs/specs/semconv/registry/attributes/gen-ai/" target="_blank" rel="noopener"&gt;GenAI semantic conventions&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;Here's what you get out of the box:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Agent dropdown filter&lt;/STRONG&gt; — A dropdown populated by &lt;CODE&gt;gen_ai.agent.name&lt;/CODE&gt; values from your telemetry. In our travel planner, this shows all six agents: "Travel Planning Coordinator", "Currency Conversion Specialist", "Weather &amp;amp; Packing Advisor", "Local Expert &amp;amp; Cultural Guide", "Itinerary Planning Expert", and "Budget Optimization Specialist". You can filter the entire dashboard to one agent or view them all.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Token usage metrics&lt;/STRONG&gt; — Visualizations of input and output token consumption, broken down by agent. Instantly see which agents are the most expensive to run.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Operational metrics&lt;/STRONG&gt; — Latency distributions, error rates, and throughput for each agent. Spot performance regressions before users notice.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;End-to-end transaction details&lt;/STRONG&gt; — Click into any trace to see the full workflow: which agents were invoked, what tools they called, how long each step took. The "simple view" renders agent steps in a story-like format that's remarkably easy to follow.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Grafana integration&lt;/STRONG&gt; — One-click export to Azure Managed Grafana for custom dashboards and alerting.&lt;/LI&gt;
&lt;/UL&gt;
&lt;!-- SCREENSHOT: The Agents (Preview) view main dashboard showing token usage, operational metrics, and the agent dropdown --&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;!-- SCREENSHOT: Agent dropdown showing all 6 agents: Travel Planning Coordinator, Currency Conversion Specialist, Weather &amp; Packing Advisor, Local Expert &amp; Cultural Guide, Itinerary Planning Expert, Budget Optimization Specialist --&gt;&lt;img /&gt;
&lt;P&gt;The key insight: this view isn't magic. It works because the telemetry is structured using well-defined semantic conventions. Let's look at those next.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;STRONG&gt;📖 Docs:&lt;/STRONG&gt; &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/azure-monitor/app/agents-view" target="_blank" rel="noopener"&gt;Application Insights Agents (Preview) view documentation&lt;/A&gt;&lt;/BLOCKQUOTE&gt;
&lt;H2&gt;GenAI Semantic Conventions — The Foundation&lt;/H2&gt;
&lt;P&gt;The entire Agents view is powered by the &lt;A class="lia-external-url" href="https://opentelemetry.io/docs/specs/semconv/registry/attributes/gen-ai/" target="_blank" rel="noopener"&gt;OpenTelemetry GenAI semantic conventions&lt;/A&gt;. These are a standardized set of span attributes that describe AI agent behavior in a way that any observability backend can understand. Think of them as the "contract" between your instrumented code and Application Insights.&lt;/P&gt;
&lt;P&gt;Let's walk through the key attributes and why each one matters:&lt;/P&gt;
&lt;H3&gt;&lt;CODE&gt;gen_ai.agent.name&lt;/CODE&gt;&lt;/H3&gt;
&lt;P&gt;This is the human-readable name of the agent. In our travel planner, each agent sets this via the &lt;CODE&gt;name&lt;/CODE&gt; parameter when constructing the MAF &lt;CODE&gt;ChatClientAgent&lt;/CODE&gt; — for example, &lt;CODE&gt;"Weather &amp;amp; Packing Advisor"&lt;/CODE&gt; or &lt;CODE&gt;"Budget Optimization Specialist"&lt;/CODE&gt;. This is what populates the agent dropdown in the Agents view. Without this attribute, Application Insights would have no way to distinguish one agent from another in your telemetry. It's the single most important attribute for agent-level monitoring.&lt;/P&gt;
&lt;H3&gt;&lt;CODE&gt;gen_ai.agent.description&lt;/CODE&gt;&lt;/H3&gt;
&lt;P&gt;A brief description of what the agent does. Our Weather Advisor, for example, is described as &lt;EM&gt;"Provides weather forecasts, packing recommendations, and activity suggestions based on destination weather conditions."&lt;/EM&gt; This metadata helps operators and on-call engineers quickly understand an agent's role without diving into source code. It shows up in trace details and helps contextualize what you're looking at when debugging.&lt;/P&gt;
&lt;H3&gt;&lt;CODE&gt;gen_ai.agent.id&lt;/CODE&gt;&lt;/H3&gt;
&lt;P&gt;A unique identifier for the agent instance. In MAF, this is typically an auto-generated GUID. While &lt;CODE&gt;gen_ai.agent.name&lt;/CODE&gt; is the human-friendly label, &lt;CODE&gt;gen_ai.agent.id&lt;/CODE&gt; is the machine-stable identifier. If you rename an agent, the ID stays the same, which is important for tracking agent behavior across code deployments.&lt;/P&gt;
&lt;H3&gt;&lt;CODE&gt;gen_ai.operation.name&lt;/CODE&gt;&lt;/H3&gt;
&lt;P&gt;The type of operation being performed. Values include &lt;CODE&gt;"chat"&lt;/CODE&gt; for standard LLM calls and &lt;CODE&gt;"execute_tool"&lt;/CODE&gt; for tool/function invocations. In our travel planner, when the Weather Advisor calls the &lt;CODE&gt;GetWeatherForecast&lt;/CODE&gt; function via the National Weather Service (NWS) API, or when the Currency Converter calls &lt;CODE&gt;ConvertCurrency&lt;/CODE&gt; via the Frankfurter API, those tool calls get their own spans with &lt;CODE&gt;gen_ai.operation.name = "execute_tool"&lt;/CODE&gt;. This lets you measure LLM think-time separately from tool execution time — a critical distinction for performance optimization.&lt;/P&gt;
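&lt;P&gt;As a rough sketch of how a tool call ends up with its own span (the method body and names here are illustrative, not copied from the sample): a plain .NET method is registered as a tool, and the instrumentation wraps each invocation in an &lt;CODE&gt;execute_tool&lt;/CODE&gt; span.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;using System.ComponentModel;
using Microsoft.Extensions.AI;

[Description("Gets the weather forecast for a destination city.")]
static string GetWeatherForecast(string city) =&amp;gt;
    // Illustrative stub — the real tool calls the NWS API
    $"Forecast for {city}: sunny";

// Registering the method as a tool; each invocation becomes a span
// with gen_ai.operation.name = "execute_tool"
var chatOptions = new ChatOptions
{
    Tools = [AIFunctionFactory.Create(GetWeatherForecast)]
};&lt;/CODE&gt;&lt;/PRE&gt;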
&lt;H3&gt;&lt;CODE&gt;gen_ai.request.model&lt;/CODE&gt; / &lt;CODE&gt;gen_ai.response.model&lt;/CODE&gt;&lt;/H3&gt;
&lt;P&gt;The model used for the request and the model that actually served the response (these can differ when providers do model routing). In our case, both are &lt;CODE&gt;"gpt-4o"&lt;/CODE&gt; since that's what we deploy via Azure OpenAI. These attributes let you track model usage across agents, spot unexpected model assignments, and correlate performance changes with model updates.&lt;/P&gt;
&lt;H3&gt;&lt;CODE&gt;gen_ai.usage.input_tokens&lt;/CODE&gt; / &lt;CODE&gt;gen_ai.usage.output_tokens&lt;/CODE&gt;&lt;/H3&gt;
&lt;P&gt;Token consumption per LLM call. This is what powers the token usage visualizations in the Agents view. The Coordinator agent, which aggregates results from all five specialist agents, tends to have higher output token counts because it's synthesizing a full travel plan. The Currency Converter, which makes focused API calls, uses fewer tokens overall. These attributes let you answer the question "which agent is costing me the most?" — and more importantly, let you set alerts when token usage spikes unexpectedly.&lt;/P&gt;
&lt;H3&gt;&lt;CODE&gt;gen_ai.system&lt;/CODE&gt;&lt;/H3&gt;
&lt;P&gt;The AI system or provider. In our case, this is &lt;CODE&gt;"openai"&lt;/CODE&gt; (set by the Azure OpenAI client instrumentation). If you're using multiple AI providers — say, Azure OpenAI for planning and a local model for classification — this attribute lets you filter and compare.&lt;/P&gt;
&lt;P&gt;Together, these attributes create a rich, structured view of agent behavior that goes far beyond generic tracing. They're the reason Application Insights can render agent-specific dashboards with token breakdowns, latency distributions, and end-to-end workflow views. Without these conventions, all you'd see is opaque HTTP calls to an OpenAI endpoint.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;STRONG&gt;💡 Key takeaway:&lt;/STRONG&gt; The GenAI semantic conventions are what transform generic distributed traces into &lt;EM&gt;agent-aware&lt;/EM&gt; observability. They're the bridge between your code and the Agents view. Any framework that emits these attributes — MAF, Semantic Kernel, LangChain — can light up this dashboard.&lt;/BLOCKQUOTE&gt;
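&lt;P&gt;MAF and &lt;CODE&gt;Microsoft.Extensions.AI&lt;/CODE&gt; set these attributes for you, but if you ever need to emit them from a component that isn't wrapped by either — say, a custom retrieval step — you can tag a span by hand with &lt;CODE&gt;System.Diagnostics&lt;/CODE&gt;. A minimal sketch (the source name and values are illustrative):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;using System.Diagnostics;

static readonly ActivitySource Source = new("MyApp.CustomAgent");

using var activity = Source.StartActivity("invoke_agent MyCustomAgent");
// Same attribute names the Agents view looks for
activity?.SetTag("gen_ai.agent.name", "My Custom Agent");
activity?.SetTag("gen_ai.operation.name", "chat");
activity?.SetTag("gen_ai.request.model", "gpt-4o");
activity?.SetTag("gen_ai.usage.input_tokens", 450);
activity?.SetTag("gen_ai.usage.output_tokens", 120);&lt;/CODE&gt;&lt;/PRE&gt;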
&lt;H2&gt;Two Layers of OpenTelemetry Instrumentation&lt;/H2&gt;
&lt;P&gt;Our travel planner sample instruments at two distinct levels, each capturing different aspects of agent behavior. Let's look at both.&lt;/P&gt;
&lt;H3&gt;Layer 1: IChatClient-Level Instrumentation&lt;/H3&gt;
&lt;P&gt;The first layer instruments at the &lt;CODE&gt;IChatClient&lt;/CODE&gt; level using &lt;CODE&gt;Microsoft.Extensions.AI&lt;/CODE&gt;. This is where we wrap the Azure OpenAI chat client with OpenTelemetry:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;var client = new AzureOpenAIClient(azureOpenAIEndpoint, new DefaultAzureCredential());
// Wrap with OpenTelemetry to emit GenAI semantic convention spans
return client.GetChatClient(modelDeploymentName).AsIChatClient()
    .AsBuilder()
    .UseOpenTelemetry()
    .Build();&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This single &lt;CODE&gt;.UseOpenTelemetry()&lt;/CODE&gt; call intercepts every LLM call and emits spans with:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;CODE&gt;gen_ai.system&lt;/CODE&gt; — the AI provider (e.g., &lt;CODE&gt;"openai"&lt;/CODE&gt;)&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;gen_ai.request.model&lt;/CODE&gt; / &lt;CODE&gt;gen_ai.response.model&lt;/CODE&gt; — which model was used&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;gen_ai.usage.input_tokens&lt;/CODE&gt; / &lt;CODE&gt;gen_ai.usage.output_tokens&lt;/CODE&gt; — token consumption per call&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;gen_ai.operation.name&lt;/CODE&gt; — the operation type (&lt;CODE&gt;"chat"&lt;/CODE&gt;)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Think of this as the "LLM layer" — it captures &lt;EM&gt;what the model is doing&lt;/EM&gt; regardless of which agent called it. It's model-centric telemetry.&lt;/P&gt;
&lt;H3&gt;Layer 2: Agent-Level Instrumentation&lt;/H3&gt;
&lt;P&gt;The second layer instruments at the agent level using MAF 1.0 GA's built-in OpenTelemetry support. This happens in the &lt;CODE&gt;BaseAgent&lt;/CODE&gt; class that all our agents inherit from:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;Agent = new ChatClientAgent(
    chatClient,
    instructions: Instructions,
    name: AgentName,
    description: Description,
    tools: chatOptions.Tools?.ToList())
    .AsBuilder()
    .UseOpenTelemetry(sourceName: AgentName)
    .Build();&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The &lt;CODE&gt;.UseOpenTelemetry(sourceName: AgentName)&lt;/CODE&gt; call on the MAF agent builder emits a different set of spans:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;CODE&gt;gen_ai.agent.name&lt;/CODE&gt; — the human-readable agent name (e.g., &lt;CODE&gt;"Weather &amp;amp; Packing Advisor"&lt;/CODE&gt;)&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;gen_ai.agent.description&lt;/CODE&gt; — what the agent does&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;gen_ai.agent.id&lt;/CODE&gt; — the unique agent identifier&lt;/LI&gt;
&lt;LI&gt;Agent invocation traces — spans that represent the full lifecycle of an agent call&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This is the "agent layer" — it captures &lt;EM&gt;which agent is doing the work&lt;/EM&gt; and provides the identity information that powers the Agents view dropdown and per-agent filtering.&lt;/P&gt;
&lt;H3&gt;Why Both Layers?&lt;/H3&gt;
&lt;P&gt;When both layers are active, you get the richest possible telemetry. The agent-level spans nest around the LLM-level spans, creating a trace hierarchy that looks like:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;Agent: "Weather &amp;amp; Packing Advisor" (gen_ai.agent.name)
  └── chat (gen_ai.operation.name)
        ├── model: gpt-4o, input_tokens: 450, output_tokens: 120
        └── execute_tool: GetWeatherForecast
              └── chat (follow-up with tool results)
                    └── model: gpt-4o, input_tokens: 680, output_tokens: 350&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;There is a tradeoff: with both layers active, you may see some span duplication since both the &lt;CODE&gt;IChatClient&lt;/CODE&gt; wrapper and the MAF agent wrapper emit spans for the same underlying LLM call. If you find the telemetry too noisy, you can disable one layer:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Agent layer only&lt;/STRONG&gt; (remove &lt;CODE&gt;.UseOpenTelemetry()&lt;/CODE&gt; from the &lt;CODE&gt;IChatClient&lt;/CODE&gt;) — You get agent identity but lose per-call token breakdowns.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;IChatClient layer only&lt;/STRONG&gt; (remove &lt;CODE&gt;.UseOpenTelemetry()&lt;/CODE&gt; from the agent builder) — You get detailed LLM metrics but lose agent identity in the Agents view.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;For the fullest experience with the Agents (Preview) view, we recommend keeping both layers active. The official sample uses both, and the Agents view is designed to handle the overlapping spans gracefully.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;STRONG&gt;📖 Docs:&lt;/STRONG&gt; &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/agent-framework/agents/observability" target="_blank" rel="noopener"&gt;MAF Observability Guide&lt;/A&gt;&lt;/BLOCKQUOTE&gt;
&lt;H2&gt;Exporting Telemetry to Application Insights&lt;/H2&gt;
&lt;P&gt;Emitting OpenTelemetry spans is only useful if they land somewhere you can query them. The good news is that &lt;STRONG&gt;Azure App Service and Application Insights have deep native integration&lt;/STRONG&gt; — App Service can auto-instrument your app, forward platform logs, and surface health metrics out of the box. For a full overview of monitoring capabilities, see &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/app-service/monitor-app-service?tabs=aspnetcore" target="_blank" rel="noopener"&gt;Monitor Azure App Service&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;For our AI agent scenario, we go beyond the built-in platform telemetry. We need the GenAI semantic convention spans that we configured in the previous sections to flow into App Insights so the Agents (Preview) view can render them. Our travel planner has two host processes — the ASP.NET Core API and a WebJob — and each requires a slightly different exporter setup.&lt;/P&gt;
&lt;H3&gt;ASP.NET Core API — Azure Monitor OpenTelemetry Distro&lt;/H3&gt;
&lt;P&gt;For the API, it's a single line. The &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-enable?tabs=aspnetcore" target="_blank" rel="noopener"&gt;Azure Monitor OpenTelemetry Distro&lt;/A&gt; handles everything:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;// Configure OpenTelemetry with Azure Monitor for traces, metrics, and logs.
// The APPLICATIONINSIGHTS_CONNECTION_STRING env var is auto-discovered.
builder.Services.AddOpenTelemetry().UseAzureMonitor();&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;That's it. The distro automatically:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Discovers the &lt;CODE&gt;APPLICATIONINSIGHTS_CONNECTION_STRING&lt;/CODE&gt; environment variable&lt;/LI&gt;
&lt;LI&gt;Configures trace, metric, and log exporters to Application Insights&lt;/LI&gt;
&lt;LI&gt;Sets up appropriate sampling and batching&lt;/LI&gt;
&lt;LI&gt;Registers standard ASP.NET Core HTTP instrumentation&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This is the recommended approach for any ASP.NET Core application. One NuGet package (&lt;CODE&gt;Azure.Monitor.OpenTelemetry.AspNetCore&lt;/CODE&gt;), one line of code, zero configuration files.&lt;/P&gt;
&lt;H3&gt;WebJob — Manual Exporter Setup&lt;/H3&gt;
&lt;P&gt;The WebJob is a non-ASP.NET Core host (it uses &lt;CODE&gt;Host.CreateApplicationBuilder&lt;/CODE&gt;), so the distro's convenience method isn't available. Instead, we configure the exporters explicitly:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;// Configure OpenTelemetry with Azure Monitor for the WebJob (non-ASP.NET Core host).
// The APPLICATIONINSIGHTS_CONNECTION_STRING env var is auto-discovered.
builder.Services.AddOpenTelemetry()
    .ConfigureResource(r =&amp;gt; r.AddService("TravelPlanner.WebJob"))
    .WithTracing(t =&amp;gt; t
        .AddSource("*")
        .AddAzureMonitorTraceExporter())
    .WithMetrics(m =&amp;gt; m
        .AddMeter("*")
        .AddAzureMonitorMetricExporter());

builder.Logging.AddOpenTelemetry(o =&amp;gt; o.AddAzureMonitorLogExporter());&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;A few things to note:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;CODE&gt;.AddSource("*")&lt;/CODE&gt; — Subscribes to &lt;EM&gt;all&lt;/EM&gt; trace sources, including the ones emitted by MAF's &lt;CODE&gt;.UseOpenTelemetry(sourceName: AgentName)&lt;/CODE&gt;. In production, you might narrow this to specific source names for performance.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;.AddMeter("*")&lt;/CODE&gt; — Similarly captures all metrics, including the GenAI metrics emitted by the instrumentation layers.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;.ConfigureResource(r =&amp;gt; r.AddService("TravelPlanner.WebJob"))&lt;/CODE&gt; — Tags all telemetry with the service name so you can distinguish API vs. WebJob telemetry in Application Insights.&lt;/LI&gt;
&lt;LI&gt;The connection string is still auto-discovered from the &lt;CODE&gt;APPLICATIONINSIGHTS_CONNECTION_STRING&lt;/CODE&gt; environment variable — no need to pass it explicitly.&lt;/LI&gt;
&lt;/UL&gt;
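&lt;P&gt;If you do narrow the wildcards for production, the swap is mechanical — the source names below are illustrative, so check the actual names your instrumentation registers (for MAF, the &lt;CODE&gt;sourceName&lt;/CODE&gt; you passed to &lt;CODE&gt;.UseOpenTelemetry()&lt;/CODE&gt;):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;.WithTracing(t =&amp;gt; t
    // Subscribe to specific sources instead of "*"
    .AddSource("Microsoft.Extensions.AI")      // LLM-layer spans (name may differ by version)
    .AddSource("Travel Planning Coordinator")  // one per agent, matching the
    .AddSource("Weather &amp;amp; Packing Advisor")     // .UseOpenTelemetry(sourceName: AgentName) calls
    .AddAzureMonitorTraceExporter())&lt;/CODE&gt;&lt;/PRE&gt;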
&lt;P&gt;The key difference between these two approaches is ceremony, not capability. Both send the same GenAI spans to Application Insights; the Agents view works identically regardless of which exporter setup you use.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;STRONG&gt;📖 Docs:&lt;/STRONG&gt; &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-enable?tabs=aspnetcore" target="_blank" rel="noopener"&gt;Azure Monitor OpenTelemetry Distro&lt;/A&gt;&lt;/BLOCKQUOTE&gt;
&lt;H2&gt;Infrastructure as Code — Provisioning the Monitoring Stack&lt;/H2&gt;
&lt;P&gt;The monitoring infrastructure is provisioned via Bicep modules alongside the rest of the application's Azure resources. Here's how it fits together.&lt;/P&gt;
&lt;H3&gt;Log Analytics Workspace&lt;/H3&gt;
&lt;P&gt;&lt;CODE&gt;infra/core/monitor/loganalytics.bicep&lt;/CODE&gt; creates the Log Analytics workspace that backs Application Insights:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;resource logAnalyticsWorkspace 'Microsoft.OperationalInsights/workspaces@2023-09-01' = {
  name: name
  location: location
  tags: tags
  properties: {
    sku: {
      name: 'PerGB2018'
    }
    retentionInDays: 30
  }
}&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H3&gt;Application Insights&lt;/H3&gt;
&lt;P&gt;&lt;CODE&gt;infra/core/monitor/appinsights.bicep&lt;/CODE&gt; creates a workspace-based Application Insights resource connected to Log Analytics:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;resource appInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: name
  location: location
  tags: tags
  kind: 'web'
  properties: {
    Application_Type: 'web'
    WorkspaceResourceId: logAnalyticsWorkspaceId
  }
}

output connectionString string = appInsights.properties.ConnectionString&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H3&gt;Wiring It All Together&lt;/H3&gt;
&lt;P&gt;In &lt;CODE&gt;infra/main.bicep&lt;/CODE&gt;, the Application Insights connection string is passed as an app setting to the App Service:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;appSettings: {
  APPLICATIONINSIGHTS_CONNECTION_STRING: appInsights.outputs.connectionString
  // ... other app settings
}&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This is the critical glue: when the app starts, the OpenTelemetry distro (or manual exporters) auto-discover this environment variable and start sending telemetry to your Application Insights resource. No connection strings in code, no configuration files — it's all infrastructure-driven.&lt;/P&gt;
&lt;P&gt;The same connection string is available to both the API and the WebJob since they run on the same App Service. All agent telemetry from both host processes flows into a single Application Insights resource, giving you a unified view across the entire application.&lt;/P&gt;
&lt;H2&gt;See It in Action&lt;/H2&gt;
&lt;P&gt;Once the application is deployed and processing travel plan requests, here's how to explore the agent telemetry in Application Insights.&lt;/P&gt;
&lt;H3&gt;Step 1: Open the Agents (Preview) View&lt;/H3&gt;
&lt;P&gt;In the Azure portal, navigate to your Application Insights resource. In the left nav, look for &lt;STRONG&gt;Agents (Preview)&lt;/STRONG&gt; under the Investigations section. This opens the unified agent monitoring dashboard.&lt;/P&gt;
&lt;!-- SCREENSHOT: Agents (Preview) view main dashboard with token usage tiles and operational metrics for the travel planner --&gt;
&lt;H3&gt;Step 2: Filter by Agent&lt;/H3&gt;
&lt;P&gt;The agent dropdown at the top of the page is populated by the &lt;CODE&gt;gen_ai.agent.name&lt;/CODE&gt; values in your telemetry. You'll see all six agents listed:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Travel Planning Coordinator&lt;/LI&gt;
&lt;LI&gt;Currency Conversion Specialist&lt;/LI&gt;
&lt;LI&gt;Weather &amp;amp; Packing Advisor&lt;/LI&gt;
&lt;LI&gt;Local Expert &amp;amp; Cultural Guide&lt;/LI&gt;
&lt;LI&gt;Itinerary Planning Expert&lt;/LI&gt;
&lt;LI&gt;Budget Optimization Specialist&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Select a specific agent to filter the entire dashboard — token usage, latency, error rate — down to that one agent.&lt;/P&gt;
&lt;!-- SCREENSHOT: Agent dropdown expanded showing all 6 agents listed by their gen_ai.agent.name values --&gt;
&lt;H3&gt;Step 3: Review Token Usage&lt;/H3&gt;
&lt;P&gt;The token usage tile shows total input and output token consumption over your selected time range. Compare agents to find your biggest spenders. In our testing, the Coordinator agent consistently uses the most output tokens because it aggregates and synthesizes results from all five specialists.&lt;/P&gt;
&lt;H3&gt;Step 4: Drill into Traces&lt;/H3&gt;
&lt;P&gt;Click &lt;STRONG&gt;"View Traces with Agent Runs"&lt;/STRONG&gt; to see all agent executions. Each row represents a workflow run. You can filter by time range, status (success/failure), and specific agent.&lt;/P&gt;
&lt;!-- SCREENSHOT: Search overlay showing agent traces filtered by a specific agent, with columns for timestamp, agent name, duration, and status --&gt;
&lt;H3&gt;Step 5: End-to-End Transaction Details&lt;/H3&gt;
&lt;P&gt;Click any trace to open the end-to-end transaction details. The &lt;STRONG&gt;"simple view"&lt;/STRONG&gt; renders the agent workflow as a story — showing each step, which agent handled it, how long it took, and what tools were called. For a full travel plan, you'll see the Coordinator dispatch work to each specialist, tool calls to the NWS weather API and Frankfurter currency API, and the final aggregation step.&lt;/P&gt;
&lt;!-- SCREENSHOT: End-to-end transaction details in "simple view" showing the complete agent workflow: Coordinator → Weather Advisor (with GetWeatherForecast tool call) → Currency Converter (with ConvertCurrency tool call) → Local Knowledge → Itinerary Planner → Budget Optimizer → Coordinator aggregation --&gt;
&lt;H2&gt;Grafana Dashboards&lt;/H2&gt;
&lt;P&gt;The Agents (Preview) view in Application Insights is great for ad-hoc investigation. For ongoing monitoring and alerting, Azure Managed Grafana provides prebuilt dashboards specifically designed for agent workloads.&lt;/P&gt;
&lt;P&gt;From the Agents view, click &lt;STRONG&gt;"Explore in Grafana"&lt;/STRONG&gt; to jump directly into these dashboards:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;A class="lia-external-url" href="https://aka.ms/amg/dash/af-agent" target="_blank" rel="noopener"&gt;Agent Framework Dashboard&lt;/A&gt;&lt;/STRONG&gt; — Per-agent metrics including token usage trends, latency percentiles, error rates, and throughput over time. Pin this to your operations wall.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;A class="lia-external-url" href="https://aka.ms/amg/dash/af-workflow" target="_blank" rel="noopener"&gt;Agent Framework Workflow Dashboard&lt;/A&gt;&lt;/STRONG&gt; — Workflow-level metrics showing how multi-agent orchestrations perform end-to-end. See how long complete travel plans take, identify bottleneck agents, and track success rates.&lt;/LI&gt;
&lt;/UL&gt;
&lt;!-- SCREENSHOT: Grafana Agent Framework dashboard showing token usage trends, latency distributions, and throughput charts for the travel planner agents --&gt;
&lt;P&gt;These dashboards query the same underlying data in Log Analytics, so there's zero additional instrumentation needed. If your telemetry lights up the Agents view, it lights up Grafana too.&lt;/P&gt;
&lt;H2&gt;Key Packages Summary&lt;/H2&gt;
&lt;P&gt;Here are the NuGet packages that make this work, pulled from the actual project files:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Package&lt;/th&gt;&lt;th&gt;Version&lt;/th&gt;&lt;th&gt;Purpose&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;Azure.Monitor.OpenTelemetry.AspNetCore&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;1.3.0&lt;/td&gt;&lt;td&gt;Azure Monitor OTEL Distro for ASP.NET Core (API). One-line setup for traces, metrics, and logs.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;Azure.Monitor.OpenTelemetry.Exporter&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;1.3.0&lt;/td&gt;&lt;td&gt;Azure Monitor OTEL exporter for non-ASP.NET Core hosts (WebJob). Trace, metric, and log exporters.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;Microsoft.Agents.AI&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;1.0.0&lt;/td&gt;&lt;td&gt;MAF 1.0 GA — &lt;CODE&gt;ChatClientAgent&lt;/CODE&gt;, &lt;CODE&gt;.UseOpenTelemetry()&lt;/CODE&gt; for agent-level instrumentation.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;Microsoft.Extensions.AI&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;10.4.1&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;IChatClient&lt;/CODE&gt; abstraction with &lt;CODE&gt;.UseOpenTelemetry()&lt;/CODE&gt; for LLM-level instrumentation.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;OpenTelemetry.Extensions.Hosting&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;1.11.2&lt;/td&gt;&lt;td&gt;OTEL dependency injection integration for &lt;CODE&gt;Host.CreateApplicationBuilder&lt;/CODE&gt; (WebJob).&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;CODE&gt;Microsoft.Extensions.AI.OpenAI&lt;/CODE&gt;&lt;/td&gt;&lt;td&gt;10.4.1&lt;/td&gt;&lt;td&gt;OpenAI/Azure OpenAI adapter for &lt;CODE&gt;IChatClient&lt;/CODE&gt;. 
Bridges the Azure OpenAI SDK to the M.E.AI abstraction.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H2&gt;Wrapping Up&lt;/H2&gt;
&lt;P&gt;Let's zoom out. So far in this three-part series, we've gone from zero to a fully observable, production-grade multi-agent AI application on Azure App Service:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Blog 1&lt;/STRONG&gt; covered deploying the multi-agent travel planner with MAF 1.0 GA — the agents, the architecture, the infrastructure.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Blog 2&lt;/STRONG&gt; (this post) showed how to instrument those agents with OpenTelemetry, explained the GenAI semantic conventions that make agent-aware monitoring possible, and walked through the new Agents (Preview) view in Application Insights.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Blog 3&lt;/STRONG&gt;&amp;nbsp;will show you how to secure those agents for production with the Microsoft Agent Governance Toolkit.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The pattern is straightforward:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Add &lt;CODE&gt;.UseOpenTelemetry()&lt;/CODE&gt; at the &lt;CODE&gt;IChatClient&lt;/CODE&gt; level for LLM metrics.&lt;/LI&gt;
&lt;LI&gt;Add &lt;CODE&gt;.UseOpenTelemetry(sourceName: AgentName)&lt;/CODE&gt; at the MAF agent level for agent identity.&lt;/LI&gt;
&lt;LI&gt;Export to Application Insights via the Azure Monitor distro (one line) or manual exporters.&lt;/LI&gt;
&lt;LI&gt;Wire the connection string through Bicep and environment variables.&lt;/LI&gt;
&lt;LI&gt;Open the Agents (Preview) view and start monitoring.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;With MAF 1.0 GA's built-in OpenTelemetry support and Application Insights' new Agents view, you get production-grade observability for AI agents with minimal code. The GenAI semantic conventions ensure your telemetry is structured, portable, and understood by any compliant backend. And because it's all standard OpenTelemetry, you're not locked into any single vendor — swap the exporter and your telemetry goes to Jaeger, Grafana, Datadog, or wherever you need it.&lt;/P&gt;
&lt;P&gt;Now go see what your agents are up to and check out &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/govern-ai-agents-on-app-service-with-the-microsoft-agent-governance-toolkit/4510962" data-lia-auto-title="Blog 3" data-lia-auto-title-active="0" target="_blank"&gt;Blog 3&lt;/A&gt;.&lt;/P&gt;
&lt;H2&gt;Resources&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Sample repository:&lt;/STRONG&gt; &lt;A class="lia-external-url" href="https://github.com/seligj95/app-service-multi-agent-maf-otel" target="_blank" rel="noopener"&gt;seligj95/app-service-multi-agent-maf-otel&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;App Insights Agents (Preview) view:&lt;/STRONG&gt; &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/azure-monitor/app/agents-view" target="_blank" rel="noopener"&gt;Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;GenAI Semantic Conventions:&lt;/STRONG&gt; &lt;A class="lia-external-url" href="https://opentelemetry.io/docs/specs/semconv/registry/attributes/gen-ai/" target="_blank" rel="noopener"&gt;OpenTelemetry GenAI Registry&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;MAF Observability Guide:&lt;/STRONG&gt; &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/agent-framework/agents/observability" target="_blank" rel="noopener"&gt;Microsoft Agent Framework Observability&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Azure Monitor OpenTelemetry Distro:&lt;/STRONG&gt; &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-enable?tabs=aspnetcore" target="_blank" rel="noopener"&gt;Enable OpenTelemetry for .NET&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Grafana Agent Framework Dashboard:&lt;/STRONG&gt; &lt;A class="lia-external-url" href="https://aka.ms/amg/dash/af-agent" target="_blank" rel="noopener"&gt;aka.ms/amg/dash/af-agent&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Grafana Workflow Dashboard:&lt;/STRONG&gt; &lt;A class="lia-external-url" href="https://aka.ms/amg/dash/af-workflow" target="_blank" rel="noopener"&gt;aka.ms/amg/dash/af-workflow&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Blog 1:&lt;/STRONG&gt; &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/build-multi-agent-ai-apps-on-azure-app-service-with-microsoft-agent-framework-1-/4510017" target="_blank" rel="noopener" data-lia-auto-title="Deploy Multi-Agent AI Apps on Azure App Service with MAF 1.0 GA" data-lia-auto-title-active="0"&gt;Deploy Multi-Agent AI Apps on Azure App Service with MAF 1.0 GA&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Blog 3: &lt;/STRONG&gt;&lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/govern-ai-agents-on-app-service-with-the-microsoft-agent-governance-toolkit/4510962" data-lia-auto-title="Govern AI Agents on App Service with the Microsoft Agent Governance Toolkit | Microsoft Community Hub" data-lia-auto-title-active="0" target="_blank"&gt;Govern AI Agents on App Service with the Microsoft Agent Governance Toolkit&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Tue, 14 Apr 2026 16:28:22 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/monitor-ai-agents-on-app-service-with-opentelemetry-and-the-new/ba-p/4510023</guid>
      <dc:creator>jordanselig</dc:creator>
      <dc:date>2026-04-14T16:28:22Z</dc:date>
    </item>
    <item>
      <title>Build Multi-Agent AI Apps on Azure App Service with Microsoft Agent Framework 1.0</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/build-multi-agent-ai-apps-on-azure-app-service-with-microsoft/ba-p/4510017</link>
      <description>&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;STRONG&gt;Part 1 of 3 — Multi-Agent AI on Azure App Service&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This is part 1 of a three-part series on deploying and working with multi-agent AI on Azure App Service. Follow along to learn how to deploy, manage, observe, and secure your agents on Azure App Service.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;A couple of months ago, we published a &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/part-3-client-side-multi-agent-orchestration-on-azure-app-service-with-microsoft/4466728" target="_blank" rel="noopener" data-lia-auto-title="three-part series" data-lia-auto-title-active="0"&gt;three-part series&lt;/A&gt; showing how to build multi-agent AI systems on Azure App Service using preview packages from the Microsoft Agent Framework (MAF) (formerly AutoGen / Semantic Kernel Agents). The series walked through async processing, the request-reply pattern, and client-side multi-agent orchestration — all running on App Service.&lt;/P&gt;
&lt;P&gt;Since then, &lt;STRONG&gt;Microsoft Agent Framework has reached 1.0 GA&lt;/STRONG&gt; — unifying AutoGen and Semantic Kernel into a single, production-ready agent platform. This post is a fresh start with the GA bits. We'll rebuild our travel-planner sample on the stable API surface, call out the breaking changes from preview, and get you up and running fast.&lt;/P&gt;
&lt;P&gt;All of the code is in the companion repo: &lt;A class="lia-external-url" href="https://github.com/seligj95/app-service-multi-agent-maf-otel" target="_blank" rel="noopener"&gt;seligj95/app-service-multi-agent-maf-otel&lt;/A&gt;.&lt;/P&gt;
&lt;H2&gt;What Changed in MAF 1.0 GA&lt;/H2&gt;
&lt;P&gt;The 1.0 release is more than a version bump. Here's what moved:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Unified platform.&lt;/STRONG&gt; AutoGen and Semantic Kernel agent capabilities have converged into &lt;CODE&gt;Microsoft.Agents.AI&lt;/CODE&gt;. One package, one API surface.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Stable APIs with long-term support.&lt;/STRONG&gt; The 1.0 contract is now locked for servicing. No more preview churn.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Breaking change — &lt;CODE&gt;Instructions&lt;/CODE&gt; on options removed.&lt;/STRONG&gt; In preview, you set instructions through &lt;CODE&gt;ChatClientAgentOptions.Instructions&lt;/CODE&gt;. In GA, pass them directly to the &lt;CODE&gt;ChatClientAgent&lt;/CODE&gt; constructor.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Breaking change — &lt;CODE&gt;RunAsync&lt;/CODE&gt; parameter rename.&lt;/STRONG&gt; The &lt;CODE&gt;thread&lt;/CODE&gt; parameter is now &lt;CODE&gt;session&lt;/CODE&gt; (type &lt;CODE&gt;AgentSession&lt;/CODE&gt;). If you were using named arguments, this is a compile error.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;CODE&gt;Microsoft.Extensions.AI&lt;/CODE&gt; upgraded.&lt;/STRONG&gt; The framework moved from the 9.x preview of &lt;CODE&gt;Microsoft.Extensions.AI&lt;/CODE&gt; to the stable &lt;STRONG&gt;10.4.1&lt;/STRONG&gt; release.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;OpenTelemetry integration built in.&lt;/STRONG&gt; The builder pipeline now includes &lt;CODE&gt;UseOpenTelemetry()&lt;/CODE&gt; out of the box — more on that in Blog 2.&lt;/LI&gt;
&lt;/UL&gt;
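&lt;P&gt;To make the two breaking changes concrete, here's a minimal before-and-after sketch (the preview shape is shown from memory of the preview packages, so treat it as illustrative rather than exact):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;// Preview (no longer compiles in 1.0 GA): instructions lived on the options object
// var agent = new ChatClientAgent(chatClient,
//     new ChatClientAgentOptions { Instructions = "You plan travel itineraries." });

// GA: pass instructions directly to the constructor
var agent = new ChatClientAgent(
    chatClient,
    instructions: "You plan travel itineraries.",
    name: "ItineraryPlanner");

// GA: the RunAsync parameter formerly named 'thread' is now 'session'
var response = await agent.RunAsync(messages, session: null);&lt;/CODE&gt;&lt;/PRE&gt;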
&lt;P&gt;Our project references reflect the GA stack:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;&amp;lt;PackageReference Include="Microsoft.Agents.AI" Version="1.0.0" /&amp;gt;
&amp;lt;PackageReference Include="Microsoft.Extensions.AI" Version="10.4.1" /&amp;gt;
&amp;lt;PackageReference Include="Azure.AI.OpenAI" Version="2.1.0" /&amp;gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H2&gt;Why Azure App Service for AI Agents?&lt;/H2&gt;
&lt;P&gt;If you're building with Microsoft Agent Framework, you need somewhere to run your agents. You could reach for Kubernetes, containers, or serverless — but for most agent workloads, &lt;STRONG&gt;Azure App Service is the sweet spot&lt;/STRONG&gt;. Here's why:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;No infrastructure management&lt;/STRONG&gt; — App Service is fully managed. No clusters to configure, no container orchestration to learn. Deploy your .NET or Python agent code and it just runs.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Always On&lt;/STRONG&gt; — Agent workflows can take minutes. App Service's Always On feature (on Premium tiers) ensures your background workers never go cold, so agents are ready to process requests instantly.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;WebJobs for background processing&lt;/STRONG&gt; — Long-running agent workflows don't belong in HTTP request handlers. App Service's built-in WebJob support gives you a dedicated background worker that shares the same deployment, configuration, and managed identity — no separate compute resource needed.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Managed Identity everywhere&lt;/STRONG&gt; — Zero secrets in your code. App Service's system-assigned managed identity authenticates to Azure OpenAI, Service Bus, Cosmos DB, and Application Insights automatically. No connection strings, no API keys, no rotation headaches.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Built-in observability&lt;/STRONG&gt; — Native integration with Application Insights and OpenTelemetry means you can see exactly what your agents are doing in production (more on this in Part 2).&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Enterprise-ready&lt;/STRONG&gt; — VNet integration, deployment slots for safe rollouts, custom domains, auto-scaling rules, and built-in authentication. All the things you'll need when your agent POC becomes a production service.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Cost-effective&lt;/STRONG&gt; — A single P0v4 instance (~$75/month) hosts both your API and WebJob worker. Compare that to running separate container apps or a Kubernetes cluster for the same workload.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The bottom line: App Service lets you focus on building your agents, not managing infrastructure. And since MAF supports both .NET and Python — both first-class citizens on App Service — you're covered regardless of your language preference.&lt;/P&gt;
&lt;H2&gt;Architecture Overview&lt;/H2&gt;
&lt;P&gt;The sample is a &lt;STRONG&gt;travel planner&lt;/STRONG&gt; that coordinates six specialized agents to build a personalized trip itinerary. Users fill out a form (destination, dates, budget, interests), and the system returns a comprehensive travel plan complete with weather forecasts, currency advice, a day-by-day itinerary, and a budget breakdown.&lt;/P&gt;
&lt;img /&gt;&lt;!-- SCREENSHOT: Architecture diagram showing the full system — User → Web UI → App Service API → Service Bus → WebJob → Multi-Agent Workflow → Azure OpenAI, with Cosmos DB for state. Use the Mermaid diagram from architecture.md or a polished version of it. --&gt;
&lt;H3&gt;The Six Agents&lt;/H3&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Currency Converter&lt;/STRONG&gt; — calls the &lt;A class="lia-external-url" href="https://www.frankfurter.dev/" target="_blank" rel="noopener"&gt;Frankfurter API&lt;/A&gt; for real-time exchange rates&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Weather Advisor&lt;/STRONG&gt; — calls the &lt;A class="lia-external-url" href="https://www.weather.gov/documentation/services-web-api" target="_blank" rel="noopener"&gt;National Weather Service API&lt;/A&gt; for forecasts and packing tips&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Local Knowledge Expert&lt;/STRONG&gt; — cultural insights, customs, and hidden gems&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Itinerary Planner&lt;/STRONG&gt; — day-by-day scheduling with timing and costs&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Budget Optimizer&lt;/STRONG&gt; — allocates spend across categories and suggests savings&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Coordinator&lt;/STRONG&gt; — assembles everything into a polished final plan&lt;/LI&gt;
&lt;/OL&gt;
&lt;H3&gt;Four-Phase Workflow&lt;/H3&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th&gt;Phase&lt;/th&gt;&lt;th&gt;Agents&lt;/th&gt;&lt;th&gt;Execution&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;1 — Parallel Gathering&lt;/td&gt;&lt;td&gt;Currency, Weather, Local Knowledge&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;Task.WhenAll&lt;/CODE&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;2 — Itinerary&lt;/td&gt;&lt;td&gt;Itinerary Planner&lt;/td&gt;&lt;td&gt;Sequential (uses Phase 1 context)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;3 — Budget&lt;/td&gt;&lt;td&gt;Budget Optimizer&lt;/td&gt;&lt;td&gt;Sequential (uses Phase 2 output)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;4 — Assembly&lt;/td&gt;&lt;td&gt;Coordinator&lt;/td&gt;&lt;td&gt;Final synthesis&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H3&gt;Infrastructure&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Azure App Service (P0v4)&lt;/STRONG&gt; — hosts the API and a continuous WebJob for background processing&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Azure Service Bus&lt;/STRONG&gt; — decouples the API from heavy AI work (async request-reply)&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Azure Cosmos DB&lt;/STRONG&gt; — stores task state, results, and per-agent chat histories (24-hour TTL)&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Azure OpenAI (GPT-4o)&lt;/STRONG&gt; — powers all agent LLM calls&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Application Insights + Log Analytics&lt;/STRONG&gt; — monitoring and diagnostics&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;ChatClientAgent Deep Dive&lt;/H2&gt;
&lt;P&gt;At the core of every agent is &lt;CODE&gt;ChatClientAgent&lt;/CODE&gt; from &lt;CODE&gt;Microsoft.Agents.AI&lt;/CODE&gt;. It wraps an &lt;CODE&gt;IChatClient&lt;/CODE&gt; (from &lt;CODE&gt;Microsoft.Extensions.AI&lt;/CODE&gt;) with instructions, a name, a description, and optionally a set of tools. This is &lt;STRONG&gt;client-side&lt;/STRONG&gt; orchestration — you control the chat history, lifecycle, and execution order. No server-side Foundry agent resources are created.&lt;/P&gt;
&lt;P&gt;Here's the &lt;CODE&gt;BaseAgent&lt;/CODE&gt; pattern used by all six agents in the sample:&lt;/P&gt;
&lt;!-- SCREENSHOT: The BaseAgent.cs file open in VS Code or Visual Studio, showing the full class with both constructors and the InvokeAsync method. --&gt;
&lt;PRE&gt;&lt;CODE&gt;// BaseAgent.cs — constructor for agents with tools
Agent = new ChatClientAgent(
    chatClient,
    instructions: Instructions,
    name: AgentName,
    description: Description,
    tools: chatOptions.Tools?.ToList())
    .AsBuilder()
    .UseOpenTelemetry(sourceName: AgentName)
    .Build();&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Notice the builder pipeline: &lt;CODE&gt;.AsBuilder().UseOpenTelemetry(...).Build()&lt;/CODE&gt;. This opts every agent into the framework's built-in OpenTelemetry instrumentation with a single line. We'll explore what that telemetry looks like in Blog 2.&lt;/P&gt;
&lt;P&gt;Invoking an agent is equally straightforward:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;// BaseAgent.cs — InvokeAsync
public async Task&amp;lt;ChatMessage&amp;gt; InvokeAsync(
    IList&amp;lt;ChatMessage&amp;gt; chatHistory,
    CancellationToken cancellationToken = default)
{
    var response = await Agent.RunAsync(
        chatHistory, session: null, options: null, cancellationToken);

    return response.Messages.LastOrDefault()
        ?? new ChatMessage(ChatRole.Assistant, "No response generated.");
}&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Key things to note:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;CODE&gt;session: null&lt;/CODE&gt; — this is the renamed parameter (was &lt;CODE&gt;thread&lt;/CODE&gt; in preview). We pass &lt;CODE&gt;null&lt;/CODE&gt; because we manage chat history ourselves.&lt;/LI&gt;
&lt;LI&gt;The agent receives the full &lt;CODE&gt;chatHistory&lt;/CODE&gt; list, so context accumulates across turns.&lt;/LI&gt;
&lt;LI&gt;Simple agents (Local Knowledge, Itinerary Planner, Budget Optimizer, Coordinator) use the tool-less constructor; agents that call external APIs (Currency, Weather) use the constructor that accepts &lt;CODE&gt;ChatOptions&lt;/CODE&gt; with tools.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Tool Integration&lt;/H2&gt;
&lt;P&gt;Two of our agents — &lt;STRONG&gt;Weather Advisor&lt;/STRONG&gt; and &lt;STRONG&gt;Currency Converter&lt;/STRONG&gt; — call real external APIs through the MAF tool-calling pipeline. Tools are registered using &lt;CODE&gt;AIFunctionFactory.Create()&lt;/CODE&gt; from &lt;CODE&gt;Microsoft.Extensions.AI&lt;/CODE&gt;.&lt;/P&gt;
&lt;P&gt;Here's how the &lt;CODE&gt;WeatherAdvisorAgent&lt;/CODE&gt; wires up its tool:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;// WeatherAdvisorAgent.cs
private static ChatOptions CreateChatOptions(
    IWeatherService weatherService, ILogger logger)
{
    var chatOptions = new ChatOptions
    {
        Tools = new List&amp;lt;AITool&amp;gt;
        {
            AIFunctionFactory.Create(
                GetWeatherForecastFunction(weatherService, logger))
        }
    };
    return chatOptions;
}&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;CODE&gt;GetWeatherForecastFunction&lt;/CODE&gt; returns a &lt;CODE&gt;Func&amp;lt;double, double, int, Task&amp;lt;string&amp;gt;&amp;gt;&lt;/CODE&gt; that the model can call with latitude, longitude, and number of days. Under the hood, it hits the National Weather Service API and returns a formatted forecast string. The Currency Converter follows the same pattern with the Frankfurter API.&lt;/P&gt;
&lt;P&gt;This is one of the nicest parts of the GA API: you write a plain C# method, wrap it with &lt;CODE&gt;AIFunctionFactory.Create()&lt;/CODE&gt;, and the framework handles the JSON schema generation, function-call parsing, and response routing automatically.&lt;/P&gt;
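&lt;P&gt;For illustration, a function of that shape can be written like this. This is a sketch, not the repo's exact code: the &lt;CODE&gt;IWeatherService.GetForecastAsync&lt;/CODE&gt; call is an assumed name. The &lt;CODE&gt;[Description]&lt;/CODE&gt; attributes (from &lt;CODE&gt;System.ComponentModel&lt;/CODE&gt;) feed the JSON schema the model sees:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;// Sketch only: GetForecastAsync is an assumed service method for illustration
private static Func&amp;lt;double, double, int, Task&amp;lt;string&amp;gt;&amp;gt; GetWeatherForecastFunction(
    IWeatherService weatherService, ILogger logger) =&amp;gt;
    async ([Description("Latitude of the location")] double latitude,
           [Description("Longitude of the location")] double longitude,
           [Description("Number of forecast days")] int days) =&amp;gt;
    {
        logger.LogInformation("Fetching {Days}-day forecast for {Lat},{Lon}",
            days, latitude, longitude);
        return await weatherService.GetForecastAsync(latitude, longitude, days);
    };&lt;/CODE&gt;&lt;/PRE&gt;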
&lt;H2&gt;Multi-Phase Workflow Orchestration&lt;/H2&gt;
&lt;P&gt;The &lt;CODE&gt;TravelPlanningWorkflow&lt;/CODE&gt; class coordinates all six agents. The key insight is that the orchestration is &lt;EM&gt;just C# code&lt;/EM&gt; — no YAML, no graph DSL, no special runtime. You decide when agents run, what context they receive, and how results flow between phases.&lt;/P&gt;
&lt;!-- SCREENSHOT: The TravelPlanningWorkflow.cs file showing Phase 1 (Task.WhenAll) and the beginning of Phase 2, highlighting the parallel-then-sequential pattern. --&gt;
&lt;PRE&gt;&lt;CODE&gt;// Phase 1: Parallel Information Gathering
var gatheringTasks = new[]
{
    GatherCurrencyInfoAsync(request, state, progress, cancellationToken),
    GatherWeatherInfoAsync(request, state, progress, cancellationToken),
    GatherLocalKnowledgeAsync(request, state, progress, cancellationToken)
};
await Task.WhenAll(gatheringTasks);&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;After Phase 1 completes, results are stored in a &lt;CODE&gt;WorkflowState&lt;/CODE&gt; object — a simple dictionary-backed container that holds per-agent chat histories and contextual data:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;// WorkflowState.cs
public Dictionary&amp;lt;string, object&amp;gt; Context { get; set; } = new();
public Dictionary&amp;lt;string, List&amp;lt;ChatMessage&amp;gt;&amp;gt; AgentChatHistories { get; set; } = new();&lt;/CODE&gt;&lt;/PRE&gt;
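&lt;P&gt;The &lt;CODE&gt;GetFromContext&lt;/CODE&gt; and &lt;CODE&gt;GetChatHistory&lt;/CODE&gt; accessors used throughout the workflow are thin wrappers over those two dictionaries. A minimal sketch of what they might look like (the repo's actual implementation may differ in details):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;public T? GetFromContext&amp;lt;T&amp;gt;(string key) =&amp;gt;
    Context.TryGetValue(key, out var value) &amp;amp;&amp;amp; value is T typed ? typed : default;

public List&amp;lt;ChatMessage&amp;gt; GetChatHistory(string agentName)
{
    // Lazily create a per-agent history so each agent accumulates its own context
    if (!AgentChatHistories.TryGetValue(agentName, out var history))
    {
        history = new List&amp;lt;ChatMessage&amp;gt;();
        AgentChatHistories[agentName] = history;
    }
    return history;
}&lt;/CODE&gt;&lt;/PRE&gt;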
&lt;P&gt;Phases 2–4 run sequentially, each pulling context from the previous phase. For example, the Itinerary Planner receives weather and local knowledge gathered in Phase 1:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;var localKnowledge = state.GetFromContext&amp;lt;string&amp;gt;("LocalKnowledge") ?? "";
var weatherAdvice = state.GetFromContext&amp;lt;string&amp;gt;("WeatherAdvice") ?? "";

var itineraryChatHistory = state.GetChatHistory("ItineraryPlanner");
itineraryChatHistory.Add(new ChatMessage(ChatRole.User,
    $"Create a detailed {days}-day itinerary for {request.Destination}..."
    + $"\n\nWEATHER INFORMATION:\n{weatherAdvice}"
    + $"\n\nLOCAL KNOWLEDGE &amp;amp; TIPS:\n{localKnowledge}"));

var itineraryResponse = await _itineraryAgent.InvokeAsync(
    itineraryChatHistory, cancellationToken);&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This pattern — parallel fan-out followed by sequential context enrichment — is simple, testable, and easy to extend. Need a seventh agent? Add it to the appropriate phase and wire it into &lt;CODE&gt;WorkflowState&lt;/CODE&gt;.&lt;/P&gt;
&lt;H2&gt;Async Request-Reply Pattern&lt;/H2&gt;
&lt;P&gt;A multi-agent workflow with six LLM calls (some with tool invocations) can easily run 30–60 seconds. That's well beyond typical HTTP timeout expectations and not a great user experience for a synchronous request. We use the &lt;STRONG&gt;Async Request-Reply pattern&lt;/STRONG&gt; to handle this:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;The API receives the travel plan request and immediately queues a message to &lt;STRONG&gt;Service Bus&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;LI&gt;It stores an initial task record in &lt;STRONG&gt;Cosmos DB&lt;/STRONG&gt; with status &lt;CODE&gt;queued&lt;/CODE&gt; and returns a &lt;CODE&gt;taskId&lt;/CODE&gt; to the client.&lt;/LI&gt;
&lt;LI&gt;A &lt;STRONG&gt;continuous WebJob&lt;/STRONG&gt; (running as a separate process on the same App Service plan) picks up the message, executes the full multi-agent workflow, and writes the result back to Cosmos DB.&lt;/LI&gt;
&lt;LI&gt;The client polls the API for status updates until the task reaches &lt;CODE&gt;completed&lt;/CODE&gt;.&lt;/LI&gt;
&lt;/OL&gt;
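&lt;P&gt;Steps 1 and 2 boil down to an API endpoint of roughly this shape. This is an illustrative sketch: the route, container names, and message shape are assumptions, not the repo's exact code:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;app.MapPost("/api/travel-plans", async (TravelPlanRequest request,
    ServiceBusSender sender, CosmosClient cosmos) =&amp;gt;
{
    var taskId = Guid.NewGuid().ToString();

    // Store the initial task record so the client can start polling immediately
    var tasks = cosmos.GetContainer("travel", "tasks");
    await tasks.CreateItemAsync(new { id = taskId, status = "queued", request });

    // Hand the heavy multi-agent work to the WebJob via Service Bus
    await sender.SendMessageAsync(new ServiceBusMessage(
        BinaryData.FromObjectAsJson(new { taskId, request })));

    // 202 Accepted plus a status URL the client can poll
    return Results.Accepted($"/api/travel-plans/{taskId}", new { taskId });
});&lt;/CODE&gt;&lt;/PRE&gt;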
&lt;P&gt;This pattern keeps the API responsive, makes the heavy work retriable (Service Bus handles retries and dead-lettering), and lets the WebJob run independently — you can restart it without affecting the API. We covered this pattern in detail in &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/part-3-client-side-multi-agent-orchestration-on-azure-app-service-with-microsoft/4466728" target="_blank" rel="noopener" data-lia-auto-title="the previous series" data-lia-auto-title-active="0"&gt;the previous series&lt;/A&gt;, so we won't repeat the plumbing here.&lt;/P&gt;
&lt;H2&gt;Deploy with &lt;CODE&gt;azd&lt;/CODE&gt;&lt;/H2&gt;
&lt;P&gt;The repo is wired up with the Azure Developer CLI for one-command provisioning and deployment:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;git clone https://github.com/seligj95/app-service-multi-agent-maf-otel.git
cd app-service-multi-agent-maf-otel
azd auth login
azd up&lt;/CODE&gt;&lt;/PRE&gt;
&lt;!-- SCREENSHOT: Terminal output of a successful `azd up` showing the provisioned resources and the deployed endpoint URL. --&gt;
&lt;P&gt;&lt;CODE&gt;azd up&lt;/CODE&gt; provisions the following resources via Bicep:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Azure App Service (P0v4 Windows) with a continuous WebJob&lt;/LI&gt;
&lt;LI&gt;Azure Service Bus namespace and queue&lt;/LI&gt;
&lt;LI&gt;Azure Cosmos DB account, database, and containers&lt;/LI&gt;
&lt;LI&gt;Azure AI Services (Azure OpenAI with GPT-4o deployment)&lt;/LI&gt;
&lt;LI&gt;Application Insights and Log Analytics workspace&lt;/LI&gt;
&lt;LI&gt;Managed Identity with all necessary role assignments&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;After deployment completes, &lt;CODE&gt;azd&lt;/CODE&gt; outputs the App Service URL. Open it in your browser, fill in the travel form, and watch six agents collaborate on your trip plan in real time.&lt;/P&gt;
&lt;img /&gt;&lt;!-- SCREENSHOT: The travel planner web UI showing a completed travel plan with the progress bar at 100% and the formatted itinerary displayed below. --&gt;
&lt;H2&gt;What's Next&lt;/H2&gt;
&lt;P&gt;We now have a production-ready multi-agent app running on App Service with the GA Microsoft Agent Framework. But how do you actually &lt;EM&gt;observe&lt;/EM&gt; what these agents are doing? When six agents are making LLM calls, invoking tools, and passing context between phases — you need visibility into every step.&lt;/P&gt;
&lt;P&gt;In the &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/monitor-ai-agents-on-app-service-with-opentelemetry-and-the-new-application-insi/4510023" data-lia-auto-title="next post" data-lia-auto-title-active="0" target="_blank"&gt;&lt;STRONG&gt;next post&lt;/STRONG&gt;&lt;/A&gt;, we'll dive deep into how we instrumented these agents with &lt;STRONG&gt;OpenTelemetry&lt;/STRONG&gt; and the new &lt;STRONG&gt;Agents (Preview)&lt;/STRONG&gt; view in &lt;STRONG&gt;Application Insights&lt;/STRONG&gt; — giving you full visibility into agent runs, token usage, tool calls, and model performance. You already saw the &lt;CODE&gt;.UseOpenTelemetry()&lt;/CODE&gt; call in the builder pipeline; Blog 2 shows what that telemetry looks like end to end and how to light up the new Agents experience in the Azure portal.&lt;/P&gt;
&lt;P&gt;Stay tuned!&lt;/P&gt;
&lt;H2&gt;Resources&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://github.com/seligj95/app-service-multi-agent-maf-otel" target="_blank" rel="noopener"&gt;Sample repo — app-service-multi-agent-maf-otel&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://devblogs.microsoft.com/semantic-kernel/microsoft-agent-framework-1-0-is-now-generally-available/" target="_blank" rel="noopener"&gt;Microsoft Agent Framework 1.0 GA Announcement&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/semantic-kernel/frameworks/agent/" target="_blank" rel="noopener"&gt;Microsoft Agent Framework Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/part-3-client-side-multi-agent-orchestration-on-azure-app-service-with-microsoft/4466728" target="_blank" rel="noopener" data-lia-auto-title="Previous Series — Part 3: Client-Side Multi-Agent Orchestration on App Service" data-lia-auto-title-active="0"&gt;Previous Series — Part 3: Client-Side Multi-Agent Orchestration on App Service&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.ai" target="_blank" rel="noopener"&gt;Microsoft.Extensions.AI Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/app-service/" target="_blank" rel="noopener"&gt;Azure App Service Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;Blog 2: &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/monitor-ai-agents-on-app-service-with-opentelemetry-and-the-new-application-insi/4510023" data-lia-auto-title="Monitor AI Agents on App Service with OpenTelemetry and the New Application Insights Agents View | Microsoft Community Hub" data-lia-auto-title-active="0" target="_blank"&gt;Monitor AI Agents on App Service with OpenTelemetry and the New Application Insights Agents View&lt;/A&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P&gt;Blog 3: &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/govern-ai-agents-on-app-service-with-the-microsoft-agent-governance-toolkit/4510962" data-lia-auto-title="Govern AI Agents on App Service with the Microsoft Agent Governance Toolkit | Microsoft Community Hub" data-lia-auto-title-active="0" target="_blank"&gt;Govern AI Agents on App Service with the Microsoft Agent Governance Toolkit&lt;/A&gt;&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Tue, 14 Apr 2026 16:33:13 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/build-multi-agent-ai-apps-on-azure-app-service-with-microsoft/ba-p/4510017</guid>
      <dc:creator>jordanselig</dc:creator>
      <dc:date>2026-04-14T16:33:13Z</dc:date>
    </item>
    <item>
      <title>Deploying to Azure Web App from Azure DevOps Using UAMI</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/deploying-to-azure-web-app-from-azure-devops-using-uami/ba-p/4509800</link>
      <description>&lt;P&gt;TOC&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;UAMI Configuration&lt;/LI&gt;
&lt;LI&gt;App Configuration&lt;/LI&gt;
&lt;LI&gt;Azure DevOps Configuration&lt;/LI&gt;
&lt;LI&gt;Logs&lt;/LI&gt;
&lt;/OL&gt;
&lt;H2&gt;UAMI Configuration&lt;/H2&gt;
&lt;P&gt;Create a&amp;nbsp;&lt;STRONG&gt;User Assigned Managed Identity&lt;/STRONG&gt; with no additional configuration.&lt;BR /&gt;This identity will be referenced in later steps, in particular its &lt;STRONG&gt;Object ID&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;App Configuration&lt;/H2&gt;
&lt;P&gt;On an existing&amp;nbsp;&lt;STRONG&gt;Azure Web App&lt;/STRONG&gt;, enable &lt;STRONG&gt;Diagnostic Settings&lt;/STRONG&gt; and configure it to retain certain types of logs, such as &lt;STRONG&gt;Access Audit Logs&lt;/STRONG&gt;.&lt;BR /&gt;These logs will be discussed in the final section of this article.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Next, navigate to &lt;STRONG&gt;Access Control (IAM)&lt;/STRONG&gt; and assign the previously created &lt;STRONG&gt;User Assigned Managed Identity&lt;/STRONG&gt; the &lt;STRONG&gt;Website Contributor&lt;/STRONG&gt; role.&lt;/P&gt;
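&lt;P&gt;If you prefer the CLI, the same two steps look roughly like this (resource names and IDs are placeholders):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;# Create the User Assigned Managed Identity
az identity create --name uami-devops-deploy --resource-group my-rg

# Grant it Website Contributor on the target Web App
az role assignment create \
  --assignee-object-id &amp;lt;uami-object-id&amp;gt; \
  --assignee-principal-type ServicePrincipal \
  --role "Website Contributor" \
  --scope /subscriptions/&amp;lt;subscription-id&amp;gt;/resourceGroups/my-rg/providers/Microsoft.Web/sites/&amp;lt;app-name&amp;gt;&lt;/CODE&gt;&lt;/PRE&gt;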
&lt;img /&gt;
&lt;P class="lia-clear-both"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;Azure DevOps Configuration&lt;/H2&gt;
&lt;P&gt;Go to&amp;nbsp;&lt;STRONG&gt;Azure DevOps → Project Settings → Service Connections&lt;/STRONG&gt;, and create a new &lt;STRONG&gt;ARM (Azure Resource Manager) connection&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;While creating the connection:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Select the corresponding &lt;STRONG&gt;User Assigned Managed Identity&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;Grant it appropriate permissions at the &lt;STRONG&gt;Resource Group&lt;/STRONG&gt; level&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;During this process, you will be prompted to sign in again using your own account.&lt;BR /&gt;This authentication will later be reflected in the deployment logs discussed below.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Assuming the following deployment template is used in the pipeline, you will notice that &lt;STRONG&gt;additional steps appear in the deployment process&lt;/STRONG&gt; compared to traditional service principal–based authentication.&lt;/P&gt;
&lt;img /&gt;
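&lt;P&gt;A minimal template of the kind shown above looks roughly like this (illustrative; the service connection and app names are placeholders):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;steps:
- task: AzureWebApp@1
  inputs:
    # The ARM service connection backed by the UAMI created earlier
    azureSubscription: 'uami-arm-connection'
    appName: 'my-web-app'
    package: '$(System.DefaultWorkingDirectory)/**/*.zip'&lt;/CODE&gt;&lt;/PRE&gt;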
&lt;P class="lia-clear-both"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;Logs&lt;/H2&gt;
&lt;P&gt;A few minutes after deployment, related log records will appear.&lt;/P&gt;
&lt;P&gt;In the &lt;STRONG&gt;AppServiceAuditLogs&lt;/STRONG&gt; table, you can observe that the &lt;STRONG&gt;deployment initiator&lt;/STRONG&gt; is shown as &lt;STRONG&gt;the Object ID from UAMI&lt;/STRONG&gt;, and the &lt;STRONG&gt;Source&lt;/STRONG&gt; is listed as &lt;STRONG&gt;Azure (DevOps)&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;This indicates that the &lt;STRONG&gt;User Assigned Managed Identity is authorized under your user context&lt;/STRONG&gt;, while the deployment action itself is initiated by Azure DevOps.&lt;/P&gt;
</description>
      <pubDate>Thu, 09 Apr 2026 05:53:29 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/deploying-to-azure-web-app-from-azure-devops-using-uami/ba-p/4509800</guid>
      <dc:creator>theringe</dc:creator>
      <dc:date>2026-04-09T05:53:29Z</dc:date>
    </item>
    <item>
      <title>Build and Host MCP Apps on Azure App Service</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/build-and-host-mcp-apps-on-azure-app-service/ba-p/4509705</link>
      <description>&lt;P&gt;MCP Apps are here, and they're a game-changer for building AI tools with interactive UIs. If you've been following the Model Context Protocol (MCP) ecosystem, you've probably heard about the &lt;A class="lia-external-url" href="https://modelcontextprotocol.io/extensions/apps/overview" target="_blank"&gt;MCP Apps spec&lt;/A&gt; — the first official MCP extension that lets your tools return rich, interactive UIs that render directly inside AI chat clients like Claude Desktop, ChatGPT, VS Code Copilot, Goose, and Postman.&lt;/P&gt;
&lt;P&gt;And here's the best part: you can host them on Azure App Service. In this post, I'll walk you through building a weather widget MCP App and deploying it to App Service. You'll have a production-ready MCP server serving interactive UIs in under 10 minutes.&lt;/P&gt;
&lt;H3&gt;What Are MCP Apps?&lt;/H3&gt;
&lt;P&gt;MCP Apps extend the Model Context Protocol by combining &lt;STRONG&gt;tools&lt;/STRONG&gt; (the functions your AI client can call) with &lt;STRONG&gt;UI resources&lt;/STRONG&gt; (the interactive interfaces that display the results). The pattern is simple:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;A tool declares a &lt;CODE&gt;_meta.ui.resourceUri&lt;/CODE&gt; in its metadata&lt;/LI&gt;
&lt;LI&gt;When the tool is invoked, the MCP host fetches that UI resource&lt;/LI&gt;
&lt;LI&gt;The UI renders in a sandboxed iframe inside the chat client&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;The key insight? &lt;STRONG&gt;MCP Apps are just web apps&lt;/STRONG&gt; — HTML, JavaScript, and CSS served through MCP. And that's exactly what App Service does best.&lt;/P&gt;
&lt;P&gt;The MCP Apps spec supports cross-client rendering, so the same UI works in Claude Desktop, VS Code Copilot, ChatGPT, and other MCP-enabled clients. Your weather widget, map viewer, or data dashboard becomes a universal component in the AI ecosystem.&lt;/P&gt;
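&lt;P&gt;In a &lt;CODE&gt;tools/list&lt;/CODE&gt; response, the tool-to-UI linkage looks roughly like this (an illustrative shape; check the MCP Apps spec for the exact schema and URI conventions):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;{
  "name": "get_weather",
  "description": "Get the forecast for a location",
  "_meta": {
    "ui": { "resourceUri": "ui://weather/widget.html" }
  }
}&lt;/CODE&gt;&lt;/PRE&gt;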
&lt;H3&gt;Why App Service for MCP Apps?&lt;/H3&gt;
&lt;P&gt;Azure App Service is a natural fit for hosting MCP Apps. Here's why:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Always On&lt;/STRONG&gt; — No cold starts. Your UI resources are served instantly, every time.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Easy Auth&lt;/STRONG&gt; — Secure your MCP endpoint with Entra ID authentication out of the box, no code required.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Custom domains + TLS&lt;/STRONG&gt; — Professional MCP server endpoints with your own domain and managed certificates.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Deployment slots&lt;/STRONG&gt; — Canary and staged rollouts for MCP App updates without downtime.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Sidecars&lt;/STRONG&gt; — Run backend services (Redis, message queues, monitoring agents) alongside your MCP server.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;App Insights&lt;/STRONG&gt; — Built-in telemetry to see which tools and UIs are being invoked, response times, and error rates.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Now, these are all capabilities you &lt;EM&gt;can&lt;/EM&gt; add to a production MCP App, but the sample we're building today keeps things simple. We're focusing on the core pattern: serving MCP tools with interactive UIs from App Service. The production features are there when you need them.&lt;/P&gt;
&lt;H3&gt;When to Use Functions vs App Service for MCP Apps&lt;/H3&gt;
&lt;P&gt;Before we dive into the code, let's talk about &lt;STRONG&gt;Azure Functions&lt;/STRONG&gt;. The Functions team has done great work with their &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/azure-functions/scenario-mcp-apps" target="_blank"&gt;MCP Apps quickstart&lt;/A&gt;, and if serverless is your preferred model, that's a fantastic option. Functions and App Service both host MCP Apps beautifully — they just serve different needs.&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="width: 100%; border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;&lt;th&gt;Azure Functions&lt;/th&gt;&lt;th&gt;Azure App Service&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Best for&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;New, purpose-built MCP Apps that benefit from serverless scaling&lt;/td&gt;&lt;td&gt;MCP Apps that need always-on hosting, persistent state, or are part of larger web apps&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Scaling&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Scale to zero, pay per invocation&lt;/td&gt;&lt;td&gt;Dedicated plans, always running&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Cold start&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Possible (mitigated by premium plan)&lt;/td&gt;&lt;td&gt;None (Always On)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Deployment&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;azd up&lt;/CODE&gt; with Functions template&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;azd up&lt;/CODE&gt; with App Service template&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;MCP Apps quickstart&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/azure-functions/scenario-mcp-apps" target="_blank"&gt;Available&lt;/A&gt;&lt;/td&gt;&lt;td&gt;This blog post!&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Additional capabilities&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Event-driven triggers, durable functions&lt;/td&gt;&lt;td&gt;Easy Auth, custom domains, deployment slots, sidecars&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;Think of it this way: if you're building a new MCP App from scratch and want serverless economics, go with Functions. If you're adding MCP capabilities to an existing web app, need zero cold starts, or want production features like Easy Auth and deployment slots, App Service is your friend.&lt;/P&gt;
&lt;H3&gt;Build the Weather Widget MCP App&lt;/H3&gt;
&lt;P&gt;Let's build a simple MCP App that fetches weather data from the Open-Meteo API and displays it in an interactive widget. The sample uses ASP.NET Core for the MCP server and Vite for the frontend UI.&lt;/P&gt;
&lt;P&gt;Here's the structure:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;app-service-mcp-app-sample/
├── src/
│   ├── Program.cs              # MCP server setup
│   ├── WeatherTool.cs          # Weather tool with UI metadata
│   ├── WeatherUIResource.cs    # MCP resource serving the UI
│   ├── WeatherService.cs       # Open-Meteo API integration
│   └── app/                    # Vite frontend (weather widget)
│       └── src/
│           └── weather-app.ts  # MCP Apps SDK integration
├── .vscode/
│   └── mcp.json                # VS Code MCP server config
├── azure.yaml                  # Azure Developer CLI config
└── infra/                      # Bicep infrastructure
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H4&gt;Program.cs — MCP Server Setup&lt;/H4&gt;
&lt;P&gt;The MCP server is an ASP.NET Core app that registers tools and UI resources:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;using ModelContextProtocol;

var builder = WebApplication.CreateBuilder(args);

// Register WeatherService
builder.Services.AddSingleton&amp;lt;WeatherService&amp;gt;(sp =&amp;gt;
    new WeatherService(WeatherService.CreateDefaultClient()));

// Add MCP Server with HTTP transport, tools, and resources
builder.Services.AddMcpServer()
    .WithHttpTransport(t =&amp;gt; t.Stateless = true)
    .WithTools&amp;lt;WeatherTool&amp;gt;()
    .WithResources&amp;lt;WeatherUIResource&amp;gt;();

var app = builder.Build();

// Map MCP endpoints (no auth required for this sample)
app.MapMcp("/mcp").AllowAnonymous();

app.Run();
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;CODE&gt;AddMcpServer()&lt;/CODE&gt; configures the MCP protocol handler. &lt;CODE&gt;WithHttpTransport()&lt;/CODE&gt; enables Streamable HTTP with stateless mode (no session management needed). &lt;CODE&gt;WithTools&amp;lt;WeatherTool&amp;gt;()&lt;/CODE&gt; registers our weather tool, and &lt;CODE&gt;WithResources&amp;lt;WeatherUIResource&amp;gt;()&lt;/CODE&gt; registers the UI resource that the MCP host will fetch and render. &lt;CODE&gt;MapMcp("/mcp")&lt;/CODE&gt; maps the MCP endpoint at &lt;CODE&gt;/mcp&lt;/CODE&gt;.&lt;/P&gt;
&lt;H4&gt;WeatherTool.cs — Tool with UI Metadata&lt;/H4&gt;
&lt;P&gt;The &lt;CODE&gt;WeatherTool&lt;/CODE&gt; class defines the tool and uses the &lt;CODE&gt;[McpMeta]&lt;/CODE&gt; attribute to declare a &lt;CODE&gt;ui&lt;/CODE&gt; metadata block containing the &lt;CODE&gt;resourceUri&lt;/CODE&gt;. This tells the MCP host where to fetch the interactive UI:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;using System.ComponentModel;
using ModelContextProtocol.Server;

[McpServerToolType]
public class WeatherTool
{
    private readonly WeatherService _weatherService;

    public WeatherTool(WeatherService weatherService)
    {
        _weatherService = weatherService;
    }

    [McpServerTool]
    [Description("Get current weather for a location via Open-Meteo. Returns weather data that displays in an interactive widget.")]
    [McpMeta("ui", JsonValue = """{"resourceUri": "ui://weather/index.html"}""")]
    public async Task&amp;lt;object&amp;gt; GetWeather(
        [Description("City name to check weather for (e.g., Seattle, New York, Miami)")]
        string location)
    {
        var result = await _weatherService.GetCurrentWeatherAsync(location);
        return result;
    }
}
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The key line is the &lt;CODE&gt;[McpMeta("ui", ...)]&lt;/CODE&gt; attribute. This adds &lt;CODE&gt;_meta.ui.resourceUri&lt;/CODE&gt; to the tool definition, pointing to the &lt;CODE&gt;ui://weather/index.html&lt;/CODE&gt; resource. When the AI client calls this tool, the host fetches that resource and renders it in a sandboxed iframe alongside the tool result.&lt;/P&gt;
&lt;H4&gt;WeatherUIResource.cs — UI Resource&lt;/H4&gt;
&lt;P&gt;The UI resource class serves the bundled HTML as an MCP resource with the &lt;CODE&gt;ui://&lt;/CODE&gt; scheme and &lt;CODE&gt;text/html;profile=mcp-app&lt;/CODE&gt; MIME type required by the MCP Apps spec:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;using ModelContextProtocol.Protocol;
using ModelContextProtocol.Server;

[McpServerResourceType]
public class WeatherUIResource
{
    [McpServerResource(
        UriTemplate = "ui://weather/index.html",
        Name = "weather_ui",
        MimeType = "text/html;profile=mcp-app")]
    public static ResourceContents GetWeatherUI()
    {
        var filePath = Path.Combine(
            AppContext.BaseDirectory, "app", "dist", "index.html");
        var html = File.ReadAllText(filePath);

        return new TextResourceContents
        {
            Uri = "ui://weather/index.html",
            MimeType = "text/html;profile=mcp-app",
            Text = html
        };
    }
}
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The &lt;CODE&gt;[McpServerResource]&lt;/CODE&gt; attribute registers this method as the handler for the &lt;CODE&gt;ui://weather/index.html&lt;/CODE&gt; resource. When the host fetches it, the bundled single-file HTML (built by Vite) is returned with the correct MIME type.&lt;/P&gt;
&lt;H4&gt;WeatherService.cs — Open-Meteo API Integration&lt;/H4&gt;
&lt;P&gt;The &lt;CODE&gt;WeatherService&lt;/CODE&gt; class handles geocoding and weather data from the &lt;A class="lia-external-url" href="https://open-meteo.com/" target="_blank"&gt;Open-Meteo API&lt;/A&gt;. Nothing MCP-specific here — it's just a standard HTTP client that geocodes a city name and fetches current weather observations.&lt;/P&gt;
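The post doesn't show that class, so here's a minimal TypeScript sketch of the same two Open-Meteo calls (geocode the city, then fetch current conditions). The helper names and the exact query parameters are illustrative assumptions; the sample's C# version in WeatherService.cs is the source of truth:

```typescript
// Hypothetical sketch of the flow WeatherService performs:
// 1) geocode a city name, 2) fetch current weather for the coordinates.
const GEOCODE_BASE = "https://geocoding-api.open-meteo.com/v1/search";
const FORECAST_BASE = "https://api.open-meteo.com/v1/forecast";

function geocodeUrl(city: string): string {
  // Resolve a city name to latitude/longitude (first match only).
  return `${GEOCODE_BASE}?name=${encodeURIComponent(city)}&count=1`;
}

function currentWeatherUrl(lat: number, lon: number): string {
  // current_weather=true asks for temperature, wind, and a weather code.
  return `${FORECAST_BASE}?latitude=${lat}&longitude=${lon}&current_weather=true`;
}

async function getCurrentWeather(city: string) {
  const geo = await (await fetch(geocodeUrl(city))).json();
  const { latitude, longitude, name } = geo.results[0];
  const wx = await (await fetch(currentWeatherUrl(latitude, longitude))).json();
  return { location: name, ...wx.current_weather };
}
```

Because the service is plain HTTP, you can swap in any weather backend without touching the MCP-specific code.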
&lt;H4&gt;The UI Resource (Vite Frontend)&lt;/H4&gt;
&lt;P&gt;The &lt;CODE&gt;app/&lt;/CODE&gt; directory contains a TypeScript app built with Vite that renders the weather widget. It uses the &lt;CODE&gt;@modelcontextprotocol/ext-apps&lt;/CODE&gt; SDK to communicate with the host:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;import { App } from "@modelcontextprotocol/ext-apps";

const app = new App({ name: "Weather Widget", version: "1.0.0" });

// Handle tool results from the server
app.ontoolresult = (params) =&amp;gt; {
  const data = parseToolResultContent(params.content);
  if (data) render(data);
};

// Adapt to host theme (light/dark)
app.onhostcontextchanged = (ctx) =&amp;gt; {
  if (ctx.theme) applyTheme(ctx.theme);
};

await app.connect();
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The SDK's &lt;CODE&gt;App&lt;/CODE&gt; class handles the postMessage communication with the host. When the tool returns weather data, &lt;CODE&gt;ontoolresult&lt;/CODE&gt; fires and the widget renders the temperature, conditions, humidity, and wind. The app also adapts to the host's theme so it looks native in both light and dark mode.&lt;/P&gt;
&lt;P&gt;The frontend is bundled into a single &lt;CODE&gt;index.html&lt;/CODE&gt; file using Vite and the &lt;CODE&gt;vite-plugin-singlefile&lt;/CODE&gt; plugin, which inlines all JavaScript and CSS. This makes it easy to serve as a single MCP resource.&lt;/P&gt;
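For reference, the single-file build boils down to one plugin entry in the Vite config. This is a sketch under the assumption of a default setup; the sample repo's actual vite.config.ts may add more options:

```typescript
// vite.config.ts — minimal sketch, not the sample's exact config.
import { defineConfig } from "vite";
import { viteSingleFile } from "vite-plugin-singlefile";

export default defineConfig({
  // Inline all JavaScript and CSS into dist/index.html so the server
  // can return the whole widget as a single MCP resource.
  plugins: [viteSingleFile()],
});
```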
&lt;H3&gt;Run Locally&lt;/H3&gt;
&lt;P&gt;To run the sample locally, you'll need the &lt;A class="lia-external-url" href="https://dotnet.microsoft.com/download/dotnet/9.0" target="_blank"&gt;.NET 9 SDK&lt;/A&gt; and &lt;A class="lia-external-url" href="https://nodejs.org/" target="_blank"&gt;Node.js 18+&lt;/A&gt; installed. Clone the repo and run:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;# Clone the repo
git clone https://github.com/seligj95/app-service-mcp-app-sample.git
cd app-service-mcp-app-sample

# Build the frontend
cd src/app
npm install
npm run build

# Run the MCP server
cd ..
dotnet run
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The server starts on &lt;CODE&gt;http://localhost:5000&lt;/CODE&gt;. Now connect from VS Code Copilot:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Open your workspace in VS Code&lt;/LI&gt;
&lt;LI&gt;The sample includes a &lt;CODE&gt;.vscode/mcp.json&lt;/CODE&gt; that configures the local MCP server:
&lt;PRE&gt;&lt;CODE&gt;{
  "servers": {
    "local-mcp-appservice": {
      "type": "http",
      "url": "http://localhost:5000/mcp"
    }
  }
}&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/LI&gt;
&lt;LI&gt;Open the GitHub Copilot Chat panel&lt;/LI&gt;
&lt;LI&gt;Ask: &lt;STRONG&gt;"What's the weather in Seattle?"&lt;/STRONG&gt;&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Copilot will invoke the &lt;CODE&gt;GetWeather&lt;/CODE&gt; tool, and the interactive weather widget will render inline in the chat:&lt;/P&gt;
&lt;img&gt;
&lt;P&gt;Weather widget MCP App rendering inline in VS Code Copilot Chat&lt;/P&gt;
&lt;/img&gt;
&lt;H3&gt;Deploy to Azure&lt;/H3&gt;
&lt;P&gt;Deploying to Azure is even easier. The sample includes an &lt;CODE&gt;azure.yaml&lt;/CODE&gt; file and Bicep templates for App Service, so you can deploy with a single command:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;cd app-service-mcp-app-sample
azd auth login
azd up
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;CODE&gt;azd up&lt;/CODE&gt; will:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Provision an App Service plan and web app in your subscription&lt;/LI&gt;
&lt;LI&gt;Build the .NET app and Vite frontend&lt;/LI&gt;
&lt;LI&gt;Deploy the app to App Service&lt;/LI&gt;
&lt;LI&gt;Output the public MCP endpoint URL&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;After deployment, &lt;CODE&gt;azd&lt;/CODE&gt; will output a URL like &lt;CODE&gt;https://app-abc123.azurewebsites.net&lt;/CODE&gt;. Update your &lt;CODE&gt;.vscode/mcp.json&lt;/CODE&gt; to point to the remote server:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;{
  "servers": {
    "remote-weather-app": {
      "type": "http",
      "url": "https://app-abc123.azurewebsites.net/mcp"
    }
  }
}&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;From that point forward, your MCP App is live. Any AI client that supports MCP Apps can invoke your weather tool and render the interactive widget — no local server required.&lt;/P&gt;
&lt;H3&gt;What's Next?&lt;/H3&gt;
&lt;P&gt;You've now built and deployed an MCP App to Azure App Service. Here's what you can explore next:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Read the &lt;A class="lia-external-url" href="https://modelcontextprotocol.io/extensions/apps/overview" target="_blank"&gt;MCP Apps spec&lt;/A&gt;&lt;/STRONG&gt; to understand the full capabilities of the extension, including input forms, persistent state, and multi-step workflows.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Check out the &lt;A class="lia-external-url" href="https://github.com/modelcontextprotocol/ext-apps/tree/main/examples" target="_blank"&gt;ext-apps examples&lt;/A&gt;&lt;/STRONG&gt; on GitHub — there are samples for map viewers, PDF renderers, system monitors, and more.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Try the &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/azure-functions/scenario-mcp-apps" target="_blank"&gt;Azure Functions MCP Apps quickstart&lt;/A&gt;&lt;/STRONG&gt; if you want to build a serverless MCP App.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Learn about &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/azure/app-service/scenario-ai-model-context-protocol-server?tabs=dotnet" data-lia-auto-title="hosting remote MCP servers in App Service" data-lia-auto-title-active="0" target="_blank"&gt;hosting remote MCP servers in App Service&lt;/A&gt;&lt;/STRONG&gt; for more patterns and best practices.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Clone the &lt;A class="lia-external-url" href="https://github.com/seligj95/app-service-mcp-app-sample" target="_blank"&gt;sample repo&lt;/A&gt;&lt;/STRONG&gt; and customize it for your own use cases.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;And remember: App Service gives you a full production hosting platform for your MCP Apps. You can add Easy Auth to secure your endpoints with Entra ID, wire up App Insights for telemetry, configure custom domains and TLS certificates, and set up deployment slots for blue/green rollouts. These features make App Service a great choice when you're ready to take your MCP App to production.&lt;/P&gt;
&lt;P&gt;If you build something cool with MCP Apps and App Service, let me know — I'd love to see what you create!&lt;/P&gt;</description>
      <pubDate>Wed, 08 Apr 2026 18:10:03 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/build-and-host-mcp-apps-on-azure-app-service/ba-p/4509705</guid>
      <dc:creator>jordanselig</dc:creator>
      <dc:date>2026-04-08T18:10:03Z</dc:date>
    </item>
    <item>
      <title>3 Ways to Get More from Azure SRE Agent</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/3-ways-to-get-more-from-azure-sre-agent/ba-p/4508993</link>
      <description>&lt;P&gt;When you first set up Azure SRE Agent, it’s tempting to give it everything. Connect all your alert sources, route every severity, set up scheduled tasks to poll your channels every 30 seconds. The agent can handle all of it.&lt;/P&gt;
&lt;P&gt;But a few simple configuration choices can help you get more value from every token the agent uses. Each investigation creates a conversation thread, and each thread consumes tokens. With the right setup, you can make sure the agent is spending those tokens on the work that has the highest impact.&lt;/P&gt;
&lt;P&gt;The pattern that works best: start focused, see results, and expand from there. Here are three ways to do that.&lt;/P&gt;
&lt;H2&gt;1. Start with the incidents that matter most&lt;/H2&gt;
&lt;P&gt;It's natural to want full coverage from day one. But in practice, starting narrow and expanding works better. When you route only high-severity or high-impact incidents to the agent first, you get to see the quality of its investigations on the work that matters most. Once you trust the output, expanding to broader coverage is a confident decision, not a leap of faith.&lt;/P&gt;
&lt;P&gt;The mechanism for this is your &lt;STRONG&gt;incident response plan&lt;/STRONG&gt;. Instead of relying on a default handler that routes everything, create a targeted response plan with filters that match the incidents you want the agent to investigate.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Incident response plan filters: severity, title keywords, and exclusions.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Getting started:&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Go to &lt;STRONG&gt;Response plan configuration&lt;/STRONG&gt; and create a new incident response plan.&lt;/LI&gt;
&lt;LI&gt;Set the &lt;STRONG&gt;Severity &lt;/STRONG&gt;filter. A good starting point is Sev0 through Sev2. These are the incidents where deep investigation has the highest impact.&lt;/LI&gt;
&lt;LI&gt;Use &lt;STRONG&gt;Title contains&lt;/STRONG&gt; to focus on specific incident patterns, or &lt;STRONG&gt;Title does not contain&lt;/STRONG&gt; to exclude known noisy alerts.&lt;/LI&gt;
&lt;LI&gt;Preview the filter results to see which past incidents would have matched.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;As you see results and get comfortable, widen the filters. Add Sev3. Remove title exclusions. Bring in more incident sources. The agent will handle the volume, and you'll know what the cost looks like because you've been watching it grow incrementally.&lt;/P&gt;
&lt;P&gt;If you already have an agent running with broad filters, it's worth reviewing your response plan. A quick check on your severity and title filters can make sure the agent is spending its time on the incidents you care about.&lt;/P&gt;
&lt;H2&gt;2. Replace high-frequency polling with smarter patterns&lt;/H2&gt;
&lt;P&gt;Scheduled tasks are one of the most powerful features of the agent, but they're also where cost can quietly balloon. The reason is simple: a scheduled task runs on a timer whether or not there's anything to find. An incident investigation fires once per incident. A task polling every 2 minutes fires 720 times a day, and most of those runs may find nothing new.&lt;/P&gt;
&lt;P&gt;High-frequency polling is generally a weak engineering pattern regardless of cost. It wastes compute, creates unnecessary load, and in the case of an AI agent, burns tokens checking for changes that haven't happened. Better patterns exist.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Prefer push over poll.&lt;/STRONG&gt; If the source system can send a signal (an alert, a webhook, a ticket), use that to trigger the agent. Push-based workflows fire only when something happens. This is cheaper and faster than polling.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;When polling is the right fit, batch it.&lt;/STRONG&gt; Instead of checking every 2 minutes, run a thorough check every hour. Twenty-four consolidated hourly reports are more useful than 720 micro-checks that mostly say "nothing changed." The hourly report shows trends. The 2-minute poll shows snapshots.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/appsonazureblog/http-triggers-in-azure-sre-agent-from-jira-ticket-to-automated-investigation/4504960" target="_blank" rel="noopener" data-lia-auto-title="Consider HTTP triggers" data-lia-auto-title-active="0"&gt;Consider HTTP triggers&lt;/A&gt;.&lt;/STRONG&gt; If you have an external system that knows when work is needed (a deployment pipeline, a CI/CD tool, a monitoring platform), use an HTTP trigger to invoke the agent on demand. The agent only runs when there's actually something to do.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Match frequency to the operational cadence.&lt;/STRONG&gt; A Teams channel monitor works fine at 5-minute intervals. Humans don't type that fast. A health summary runs once a day. A shift-handoff report runs once per shift. Ask: how quickly do I actually need to detect this change? The answer is almost always slower than the timer you first set.&lt;/P&gt;
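The arithmetic behind these cadences is worth making explicit. A tiny hypothetical helper (not part of the agent) shows what each interval actually costs per day:

```typescript
// Hypothetical helper: how many scheduled-task runs a polling interval
// triggers in a 24-hour day.
function runsPerDay(intervalMinutes: number): number {
  return Math.floor((24 * 60) / intervalMinutes);
}

// The cadences discussed in the post:
runsPerDay(2);   // 2-minute poll: 720 runs a day, most finding nothing new
runsPerDay(60);  // hourly batch: 24 consolidated checks
runsPerDay(5);   // Teams channel monitor at 5-minute intervals
```

Every one of those runs is a thread turn that consumes tokens, so dropping from a 2-minute poll to an hourly batch is a 30x reduction before you've changed anything else.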
&lt;H2&gt;3. Keep threads fresh&lt;/H2&gt;
&lt;P&gt;Here's a detail that's easy to miss: every time a scheduled task runs, it adds to the same conversation thread. The agent reads the full thread history before responding. So a task that runs hourly accumulates 24 conversations a day in the same thread. After a week, the agent is reading through hundreds of prior exchanges before it even starts on the new work.&lt;/P&gt;
&lt;P&gt;The work stays the same. The cost per run keeps climbing. It's the equivalent of reopening a document and reading the entire thing from page one every time you want to add a sentence at the end.&lt;/P&gt;
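To see why the cost climbs, model it with a simple (hypothetical) assumption: each run appends one exchange to the thread and re-reads everything already there before starting:

```typescript
// Hypothetical model of a single accumulating thread: run i re-reads the
// i-1 exchanges already in the thread, so the total read across N runs is
// 0 + 1 + ... + (N-1).
function totalExchangesRead(runs: number): number {
  return (runs * (runs - 1)) / 2;
}

totalExchangesRead(24);     // one day of hourly runs
totalExchangesRead(7 * 24); // one week: the total grows quadratically
```

The per-run work is constant, but the re-read cost grows with the square of the run count, which is exactly the "reading the document from page one" effect described above.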
&lt;P&gt;The fix is one setting. When creating or editing a scheduled task, set &lt;STRONG&gt;"Message grouping for updates"&lt;/STRONG&gt; to &lt;STRONG&gt;"New chat thread for each run."&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;That gives the agent a clean context on every execution. No accumulated history, no growing cost. One dropdown, predictable token usage on every run.&lt;/P&gt;
&lt;H2&gt;The pattern&lt;/H2&gt;
&lt;P&gt;Start small with incident routing, expand as you see results. Replace high-frequency polling with push signals, batching, and HTTP triggers. Keep scheduled task threads fresh with "New chat thread for each run."&lt;/P&gt;
&lt;P&gt;The agent is built to handle whatever you throw at it. These patterns just make sure you're getting the most value for what you spend.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Apr 2026 23:33:10 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/3-ways-to-get-more-from-azure-sre-agent/ba-p/4508993</guid>
      <dc:creator>dchelupati</dc:creator>
      <dc:date>2026-04-07T23:33:10Z</dc:date>
    </item>
    <item>
      <title>Azure Monitor in Azure SRE Agent: Autonomous Alert Investigation and Intelligent Merging</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/azure-monitor-in-azure-sre-agent-autonomous-alert-investigation/ba-p/4509069</link>
      <description>&lt;P data-line="2"&gt;Azure Monitor is great at telling you something is wrong. But once the alert fires, the real work begins — someone has to open the portal, triage it, dig into logs, and figure out what happened. That takes time. And while they're investigating, the same alert keeps firing every few minutes, stacking up duplicates of a problem that's already being looked at.&lt;/P&gt;
&lt;P data-line="4"&gt;This is exactly what Azure SRE Agent's Azure Monitor integration addresses. The agent picks up alerts as they fire, investigates autonomously, and remediates when it can — all without waiting for a human to get involved. And when that same alert fires again while the investigation is still underway, the agent merges it into the existing thread rather than creating a new one.&lt;/P&gt;
&lt;P data-line="6"&gt;In this blog, we'll walk through the full Azure Monitor experience in SRE Agent with a live AKS + Redis scenario — how alerts get picked up, what the agent does with them, how merging handles the noise, and why one often-overlooked setting (auto-resolve) makes a bigger difference than you'd expect.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;H3 data-line="8"&gt;Key Takeaways&lt;/H3&gt;
&lt;OL data-line="10"&gt;
&lt;LI data-line="10"&gt;&lt;STRONG&gt;Set up Incident Response Plans to scope which alerts the agent handles&lt;/STRONG&gt; — filter by severity, title patterns, and resource type. Start with review mode, then promote to autonomous once you trust the agent's behavior for that failure pattern.&lt;/LI&gt;
&lt;LI data-line="11"&gt;&lt;STRONG&gt;Recurring alerts merge into one thread automatically&lt;/STRONG&gt;&amp;nbsp;— when the same alert rule fires repeatedly, the agent merges subsequent firings into the existing investigation instead of creating duplicates.&lt;/LI&gt;
&lt;LI data-line="12"&gt;&lt;STRONG&gt;Turn auto-resolve OFF for persistent failures&lt;/STRONG&gt;&amp;nbsp;(bad credentials, misconfigurations, resource exhaustion) so all firings merge into one thread.&amp;nbsp;&lt;STRONG&gt;Turn it ON for transient issues&lt;/STRONG&gt;&amp;nbsp;(traffic spikes, brief timeouts) so each gets a fresh investigation.&lt;/LI&gt;
&lt;LI data-line="13"&gt;&lt;STRONG&gt;Design alert rules around failure categories, not components&lt;/STRONG&gt;&amp;nbsp;— one alert rule = one investigation thread. Structure rules by symptom (Redis errors, HTTP errors, pod health) to give the agent focused, non-overlapping threads.&lt;/LI&gt;
&lt;LI data-line="14"&gt;&lt;STRONG&gt;Attach Custom Response Plans for specialized handling&lt;/STRONG&gt; — route specific alert patterns to custom-agents with custom instructions, tools, and runbooks.&lt;/LI&gt;
&lt;/OL&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2 data-line="10"&gt;It Starts with Any Azure Monitor Alert&lt;/H2&gt;
&lt;P data-line="12"&gt;Before we get to the demo, a quick note on what SRE Agent actually watches. The agent queries the&amp;nbsp;&lt;SPAN class="lia-text-color-21"&gt;&lt;A href="https://learn.microsoft.com/rest/api/monitor/alertsmanagement/alerts/get-all" target="_blank" rel="noopener" data-href="https://learn.microsoft.com/rest/api/monitor/alertsmanagement/alerts/get-all"&gt;Azure Alerts Management REST API&lt;/A&gt;&lt;/SPAN&gt;, which returns every fired alert regardless of signal type. Log search alerts, metric alerts, activity log alerts, smart detection, service health, Prometheus — all of them come through the same API, and the agent processes them all the same way. You don't need to configure connectors or webhooks per alert type. If it fires in Azure Monitor, the agent can see it.&lt;/P&gt;
&lt;P data-line="14"&gt;What you&amp;nbsp;&lt;EM&gt;do&lt;/EM&gt; need to configure is which alerts the agent should care about. That's where Incident Response Plans come in.&lt;/P&gt;
&lt;H2 data-line="16"&gt;Setting Up: Incident Response Plans and Alert Rules&lt;/H2&gt;
&lt;P data-line="18"&gt;We start by heading to &lt;STRONG&gt;Settings &amp;gt; Incident Platform &amp;gt; Azure Monitor&lt;/STRONG&gt; and creating an Incident Response Plan. Response Plans et you scope the agent's attention by severity, alert name patterns, target resource types, and — importantly — whether the agent should act autonomously or wait for human approval.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P data-line="18"&gt;&lt;STRONG&gt;Action: Match the agent mode to your confidence in the remediation, not just the severity.&lt;/STRONG&gt;&amp;nbsp;Use&amp;nbsp;&lt;STRONG&gt;autonomous&lt;/STRONG&gt;&amp;nbsp;mode for well-understood failure patterns where the fix is predictable and safe (e.g., rolling back a bad config, restarting a pod). Use&amp;nbsp;&lt;STRONG&gt;review&lt;/STRONG&gt; mode for anything where you want a human to validate before the agent acts — especially Sev0/Sev1 alerts that touch critical systems. You can always start in review mode and promote to autonomous once you've validated the agent's behavior.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;img&gt;&lt;STRONG&gt;Incident Response Plan configuration for Azure Monitor, showing severity, agent mode, and title pattern fields&lt;/STRONG&gt;&lt;/img&gt;
&lt;P&gt;For our demo, we created a Sev1 response plan in&amp;nbsp;&lt;STRONG&gt;autonomous mode&lt;/STRONG&gt; — meaning the agent would pick up any Sev1 alert and immediately start investigating and remediating, no approval needed.&lt;/P&gt;
&lt;P&gt;On the Azure Monitor side, we set up three log-based alert rules against our AKS cluster's Log Analytics workspace. The star of the show was a Redis connection error alert — a custom log search query looking for WRONGPASS, ECONNREFUSED, and other Redis failure signatures in ContainerLog:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Each rule evaluates every 5 minutes with a 15-minute aggregation window. If the query returns any results, the alert fires. Simple enough.&lt;/P&gt;
&lt;H2 data-line="28"&gt;Breaking Redis (On Purpose)&lt;/H2&gt;
&lt;P data-line="30"&gt;Our test app is a Node.js journal app on AKS, backed by Azure Cache for Redis. To create a realistic failure scenario, we updated the Redis password in the Kubernetes secret to a wrong value. The app pods picked up the bad credential, Redis connections started failing, and error logs started flowing.&lt;/P&gt;
&lt;P data-line="32"&gt;Within minutes, the Redis connection error alert fired.&lt;/P&gt;
&lt;H2 data-line="34"&gt;What Happened Next&lt;/H2&gt;
&lt;P data-line="36"&gt;Here's where it gets interesting. We didn't touch anything — we just watched.&lt;/P&gt;
&lt;P data-line="38"&gt;The agent's scanner polls the Azure Monitor Alerts API every 60 seconds. It spotted the new alert (state: "New", condition: "Fired"), matched it against our Sev1 Incident Response Plan, and immediately acknowledged it in Azure Monitor — flipping the state to "Acknowledged" so other systems and humans know someone's on it.&lt;/P&gt;
&lt;P data-line="40"&gt;Then it created a new investigation thread. The thread included everything the agent needed to get started: the alert ID, rule name, severity, description, affected resource, subscription, resource group, and a deep-link back to the Azure Portal alert.&lt;/P&gt;
&lt;P data-line="42"&gt;From there, the agent went to work autonomously. It queried container logs, identified the Redis&amp;nbsp;WRONGPASS&amp;nbsp;errors, traced them to the bad secret, retrieved the correct access key from Azure Cache for Redis, updated the Kubernetes secret, and triggered a pod rollout. By the time we checked the thread, it was already marked "Completed."&lt;/P&gt;
&lt;img /&gt;
&lt;P data-line="44"&gt;No pages. No human investigation. No context-switching.&lt;/P&gt;
&lt;H2 data-line="46"&gt;But the Alert Kept Firing...&lt;/H2&gt;
&lt;P data-line="48"&gt;Here's the thing — our alert rule evaluates every 5 minutes. Between the first firing and the agent completing the fix, the alert fired again. And again. Seven times total over 35 minutes.&lt;/P&gt;
&lt;P data-line="50"&gt;Without intelligent handling, that would mean seven separate investigation threads. Seven notifications. Seven disruptions.&lt;/P&gt;
&lt;P data-line="52"&gt;SRE Agent handles this with&amp;nbsp;&lt;STRONG&gt;alert merging&lt;/STRONG&gt;. When a subsequent firing comes in for the same alert rule, the agent checks: is there already an active thread for this rule, created within the last 7 days, that hasn't been resolved or closed? If yes, the new firing gets&amp;nbsp;&lt;STRONG&gt;silently merged&lt;/STRONG&gt;&amp;nbsp;into the existing thread — the total alert count goes up, the "Last fired" timestamp updates, and that's it. No new thread, no new notification, no interruption to the ongoing investigation.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P data-line="64"&gt;&lt;STRONG&gt;How merging decides: new thread or merge?&lt;/STRONG&gt;&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table class="lia-border-style-solid" border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th class="lia-align-center"&gt;Condition&lt;/th&gt;&lt;th class="lia-align-center"&gt;Result&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Same alert rule, existing thread still active&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Merged&lt;/STRONG&gt;&amp;nbsp;— alert count increments, no new thread&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Same alert rule, existing thread resolved/closed&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;New thread&lt;/STRONG&gt;&amp;nbsp;— fresh investigation starts&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Different alert rule&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;New thread&lt;/STRONG&gt; — always separate&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;/BLOCKQUOTE&gt;
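The decision table above can be sketched as a small function. This is an illustration of the documented behavior (same rule, active thread, created within the last 7 days), not the agent's actual implementation:

```typescript
// Sketch of the merge decision: merge a new firing into an existing
// thread only if the same alert rule has an active thread created
// within the last 7 days; otherwise start a fresh investigation.
interface Thread {
  alertRule: string;
  status: "active" | "resolved" | "closed";
  createdDaysAgo: number;
}

function decide(firedRule: string, threads: Thread[]): "merge" | "new-thread" {
  const match = threads.find(
    (t) =>
      t.alertRule === firedRule &&
      t.status === "active" &&
      t.createdDaysAgo <= 7
  );
  return match ? "merge" : "new-thread";
}
```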
&lt;P data-line="54"&gt;Five minutes after the first alert, the second firing came in and that continued. The agent finished the fix and closed the thread, and the final tally was&amp;nbsp;&lt;STRONG&gt;one thread, seven merged alerts&lt;/STRONG&gt; — spanning 35 minutes of continuous firings.&lt;/P&gt;
&lt;P data-line="54"&gt;On the Azure Portal side, you can see all seven individual alert instances. Each one was acknowledged by the agent.&lt;/P&gt;
&lt;img&gt;
&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;7 Redis Connection Error Alert entries, all Sev1, Fired condition, Closed by user, spanning 8:50 PM to 9:21 PM&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;
&lt;/img&gt;
&lt;P&gt;Seven firings. One investigation. One fix. That's the merge in action.&lt;/P&gt;
&lt;H2 data-line="74"&gt;The Auto-Resolve Twist&lt;/H2&gt;
&lt;P data-line="76"&gt;Now here's the part we didn't expect to matter as much as it did.&lt;/P&gt;
&lt;P data-line="78"&gt;Azure Monitor has a setting called&amp;nbsp;&lt;STRONG&gt;"Automatically resolve alerts"&lt;/STRONG&gt;. When enabled, Azure Monitor automatically transitions an alert to "Resolved" once the underlying condition clears — for example, when the Redis errors stop because the pod restarted.&lt;/P&gt;
&lt;P data-line="80"&gt;For our first scenario above, we had auto-resolve&amp;nbsp;&lt;STRONG&gt;turned off&lt;/STRONG&gt;. That's why the alert stayed in "Fired" state across all seven evaluation cycles, and all seven firings merged cleanly into one thread.&lt;/P&gt;
&lt;P data-line="84"&gt;But what happens if auto-resolve is on? We turned it on and ran the same scenario again:&lt;/P&gt;
&lt;img /&gt;
&lt;P data-line="90"&gt;Here's what happened:&lt;/P&gt;
&lt;OL data-line="92"&gt;
&lt;LI data-line="92"&gt;Redis broke. Alert fired. Agent picked it up and created a thread.&lt;/LI&gt;
&lt;LI data-line="93"&gt;The agent investigated, found the bad Redis password, fixed it.&lt;/LI&gt;
&lt;LI data-line="94"&gt;With Redis working again, error logs stopped. We noticed that the condition cleared and &lt;STRONG&gt;closed all the 7 alerts manually&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;LI data-line="95"&gt;We broke Redis a second time (simulating a recurrence). The alert fired again — but the previous alert was already closed/resolved. The merge check found no active thread. &lt;STRONG&gt;A brand-new thread was created, reinvestigated and mitigated.&amp;nbsp;&lt;/STRONG&gt;&lt;/LI&gt;
&lt;/OL&gt;
&lt;P data-line="97"&gt;Two threads for the same alert rule, right there on the Incidents page:&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;And on the Azure Monitor side, the newest alert shows "Resolved" condition — that's the auto-resolve doing its thing:&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;For a persistent failure like a Redis misconfiguration, this is clearly worse. You get a new investigation thread every break-fix cycle instead of one continuous investigation.&lt;/P&gt;
&lt;H2 data-line="107"&gt;So, Should You Just Turn Auto-Resolve Off?&lt;/H2&gt;
&lt;P data-line="109"&gt;No. It depends on what kind of failure the alert is watching for.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;H5 data-line="129"&gt;&lt;STRONG&gt;Quick Reference: Auto-Resolve Decision Guide&lt;/STRONG&gt;&lt;/H5&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;&lt;th class="lia-align-center"&gt;&lt;STRONG&gt;Auto-Resolve OFF&lt;/STRONG&gt;&lt;/th&gt;&lt;th class="lia-align-center"&gt;&lt;STRONG&gt;Auto-Resolve ON&lt;/STRONG&gt;&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Use when&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Problem persists until fixed&lt;/td&gt;&lt;td&gt;Problem is transient and self-correcting&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Examples&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Bad credentials, misconfigurations, CrashLoopBackOff, connection pool exhaustion, IOPS limits&lt;/td&gt;&lt;td&gt;OOM kills during traffic spikes, brief latency from neighboring deployments, one-off job timeouts&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Merge behavior&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;All repeat firings merge into one thread&lt;/td&gt;&lt;td&gt;Each break-fix cycle creates a new thread&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Best for&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Agent is actively managing the alert lifecycle&lt;/td&gt;&lt;td&gt;Each occurrence may have a different root cause&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Tradeoff&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Alerts stay in "Fired/Acknowledged" state in Azure Monitor until the agent closes them&lt;/td&gt;&lt;td&gt;More threads, but each gets a clean investigation&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P data-line="111"&gt;&lt;STRONG&gt;Turn auto-resolve OFF&lt;/STRONG&gt; when you want repeated firings from the same alert rule to stay in a single investigation thread until the alert is explicitly resolved or closed in Azure Monitor. This works best for persistent issues such as a Kubernetes deployment stuck in CrashLoopBackOff because of a bad image tag, a database connection pool exhausted due to a leaked connection, or a storage account hitting its IOPS limit under sustained load.&lt;/P&gt;
&lt;P data-line="113"&gt;&lt;STRONG&gt;Turn auto-resolve ON&lt;/STRONG&gt; when you want a new investigation thread after the previous occurrence has been resolved or closed in Azure Monitor. This works best for episodic or self-clearing issues such as a pod getting OOM-killed during a temporary traffic spike, a brief latency increases during a neighboring service’s deployment, or a scheduled job that times out once due to short-lived resource contention.&lt;/P&gt;
&lt;P data-line="115"&gt;The key question is:&amp;nbsp;&lt;STRONG&gt;when this alert fires again, is it the same ongoing problem or a new one?&lt;/STRONG&gt;&amp;nbsp;If it's the same problem, turn auto-resolve off and let the merges do their job. If it's a new problem, leave auto-resolve on and let the agent investigate fresh.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P data-line="115"&gt;&lt;STRONG data-start="2443" data-end="2452"&gt;Note:&lt;/STRONG&gt; These behaviors describe how SRE Agent groups alert investigations and may differ from how Azure Monitor documents native alert state behavior.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;H2 data-line="117"&gt;A Few Things We Learned Along the Way&lt;/H2&gt;
&lt;P data-line="121"&gt;&lt;STRONG&gt;Design alert rules around symptoms, not components.&lt;/STRONG&gt;&amp;nbsp;Each alert rule maps to one investigation thread. We structured ours around failure categories — root cause signal (Redis errors, Sev1), blast radius signal (HTTP errors, Sev2), infrastructure signal (unhealthy pods, Sev2). This gave the agent focused threads without overlap.&lt;/P&gt;
&lt;P data-line="123"&gt;&lt;STRONG&gt;Incident Response Plans let you tier your response.&lt;/STRONG&gt;&amp;nbsp;Not every alert needs the agent to go fix things immediately. We used a Sev1 filter in autonomous mode for the Redis alert, but you could set up a Sev2 filter in&amp;nbsp;&lt;STRONG&gt;review mode&lt;/STRONG&gt; — the agent investigates and provides analysis but waits for human approval before taking action.&lt;/P&gt;
&lt;P data-line="125"&gt;&lt;STRONG&gt;Response Plans specialize the agent.&lt;/STRONG&gt; For specific alert patterns, you can give the agent custom instructions, specialized tools, and a tailored system prompt. A Redis alert can route to a custom-agent loaded with Redis-specific runbooks; a Kubernetes alert can route to one with deep kubectl expertise.&lt;/P&gt;
&lt;H2 data-line="143"&gt;Best Practices Checklist&lt;/H2&gt;
&lt;P data-line="145"&gt;Here's what we learned distilled into concrete actions:&lt;/P&gt;
&lt;H5 data-line="147"&gt;Alert Rule Design&lt;/H5&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Do&lt;/th&gt;&lt;th&gt;Don't&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Design rules around&amp;nbsp;&lt;STRONG&gt;failure categories&lt;/STRONG&gt;&amp;nbsp;(root cause, blast radius, infra health)&lt;/td&gt;&lt;td&gt;Create one alert per component — you'll get overlapping threads&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Set&amp;nbsp;&lt;STRONG&gt;evaluation frequency&lt;/STRONG&gt;&amp;nbsp;and&amp;nbsp;&lt;STRONG&gt;aggregation window&lt;/STRONG&gt;&amp;nbsp;to match the failure pattern&lt;/td&gt;&lt;td&gt;Use the same frequency for everything — transient vs. persistent issues need different cadences&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P data-line="155"&gt;&lt;STRONG&gt;Example rule structure from our test:&lt;/STRONG&gt;&lt;/P&gt;
&lt;UL data-line="156"&gt;
&lt;LI data-line="156"&gt;&lt;EM&gt;Root cause signal&lt;/EM&gt;&amp;nbsp;— Redis&amp;nbsp;WRONGPASS/ECONNREFUSED&amp;nbsp;errors →&amp;nbsp;&lt;STRONG&gt;Sev1&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI data-line="157"&gt;&lt;EM&gt;Blast radius signal&lt;/EM&gt;&amp;nbsp;— HTTP 5xx response codes →&amp;nbsp;&lt;STRONG&gt;Sev2&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI data-line="158"&gt;&lt;EM&gt;Infrastructure signal&lt;/EM&gt;&amp;nbsp;— KubeEvents&amp;nbsp;Reason="Unhealthy"&amp;nbsp;→&amp;nbsp;&lt;STRONG&gt;Sev2&lt;/STRONG&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H5 data-line="160"&gt;Incident Response Plan Setup&lt;/H5&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Do&lt;/th&gt;&lt;th&gt;Don't&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Create separate response plans per severity tier&lt;/td&gt;&lt;td&gt;Use one catch-all filter for everything&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Start with&amp;nbsp;&lt;STRONG&gt;review mode&lt;/STRONG&gt;&amp;nbsp;— especially for Sev0/Sev1 where wrong fixes are costly&lt;/td&gt;&lt;td&gt;Jump straight to autonomous mode on critical alerts without validating agent behavior first&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Promote to&amp;nbsp;&lt;STRONG&gt;autonomous mode&lt;/STRONG&gt;&amp;nbsp;once you've validated the agent handles a specific failure pattern correctly&lt;/td&gt;&lt;td&gt;Assume severity alone determines the right mode — it's about confidence in the remediation&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H5 data-line="168"&gt;Response Plans&lt;/H5&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Do&lt;/th&gt;&lt;th&gt;Don't&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Attach &lt;STRONG&gt;custom response plans&lt;/STRONG&gt;&amp;nbsp;to specific alert patterns for specialized handling&lt;/td&gt;&lt;td&gt;Leave every alert to the agent's general knowledge&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Include custom instructions, tools, and runbooks relevant to the failure type&lt;/td&gt;&lt;td&gt;Write generic instructions — the more specific, the better the investigation&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Route Redis alerts to a Redis-specialized custom-agent; K8s alerts to one with kubectl expertise&lt;/td&gt;&lt;td&gt;Assume one agent configuration fits all failure types&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H2 data-line="127"&gt;Getting Started&lt;/H2&gt;
&lt;OL data-line="129"&gt;
&lt;LI data-line="178"&gt;Head to&amp;nbsp;&lt;A href="https://sre.azure.com/" target="_blank" rel="noopener" data-href="https://sre.azure.com"&gt;sre.azure.com&lt;/A&gt;&amp;nbsp;and open your agent&lt;/LI&gt;
&lt;LI data-line="179"&gt;Make sure the agent's managed identity has&amp;nbsp;&lt;STRONG&gt;Monitoring Reader&lt;/STRONG&gt;&amp;nbsp;on your target subscriptions&lt;/LI&gt;
&lt;LI data-line="180"&gt;Go to&amp;nbsp;&lt;STRONG&gt;Settings &amp;gt; Incident Platform &amp;gt; Azure Monitor&lt;/STRONG&gt; and create your Incident Response Plans&lt;/LI&gt;
&lt;LI data-line="181"&gt;&lt;STRONG&gt;Review the auto-resolve setting on your alert rules&lt;/STRONG&gt;&amp;nbsp;— turn it off for persistent issues, leave it on for transient ones (see the&amp;nbsp;&lt;SPAN class="lia-text-color-21"&gt;&lt;A href="https://file+.vscode-resource.vscode-cdn.net/c%3A/Users/surivineela/sreagent-runtime-0406/src/Agent/Agent.Portal/Client/docs-website/blog/2026-04-07-azure-monitor-alert-merging-v2.md#so-should-you-just-turn-auto-resolve-off" target="_blank" rel="noopener" data-href="#so-should-you-just-turn-auto-resolve-off"&gt;decision guide above&lt;/A&gt;&lt;/SPAN&gt;)&lt;/LI&gt;
&lt;LI data-line="182"&gt;Start with a&amp;nbsp;&lt;STRONG&gt;test response plan &lt;/STRONG&gt;using&amp;nbsp;Title Contains&amp;nbsp;to target a specific alert rule — validate agent behavior before broadening&lt;/LI&gt;
&lt;LI data-line="183"&gt;Watch the&amp;nbsp;&lt;STRONG&gt;Incidents&lt;/STRONG&gt;&amp;nbsp;page and review the agent's investigation threads before expanding to more alert rules&lt;/LI&gt;
&lt;/OL&gt;
&lt;H2 data-line="135"&gt;Learn More&lt;/H2&gt;
&lt;UL data-line="137"&gt;
&lt;LI data-line="137"&gt;&lt;A href="https://sre.azure.com/docs" target="_blank" rel="noopener" data-href="https://sre.azure.com/docs"&gt;Azure SRE Agent Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI data-line="138"&gt;&lt;A href="https://sre.azure.com/docs/incident-response" target="_blank" rel="noopener" data-href="https://sre.azure.com/docs/incident-response"&gt;Incident Response Guide&lt;/A&gt;&lt;/LI&gt;
&lt;LI data-line="139"&gt;&lt;A href="https://learn.microsoft.com/azure/azure-monitor/alerts/alerts-overview" target="_blank" rel="noopener" data-href="https://learn.microsoft.com/azure/azure-monitor/alerts/alerts-overview"&gt;Azure Monitor Alert Rules&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Tue, 07 Apr 2026 22:20:53 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/azure-monitor-in-azure-sre-agent-autonomous-alert-investigation/ba-p/4509069</guid>
      <dc:creator>Vineela-Suri</dc:creator>
      <dc:date>2026-04-07T22:20:53Z</dc:date>
    </item>
    <item>
      <title>Agentic IIS Migration to Managed Instance on Azure App Service</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/agentic-iis-migration-to-managed-instance-on-azure-app-service/ba-p/4508969</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;Introduction&lt;/H2&gt;
&lt;P&gt;Enterprises running ASP.NET Framework workloads on Windows Server with IIS face a familiar dilemma: modernize or stay put. The applications work, the infrastructure is stable, and nobody wants to be the person who breaks production during a cloud migration. But the cost of maintaining aging on-premises servers, patching Windows, and managing IIS keeps climbing.&lt;/P&gt;
&lt;P&gt;Azure App Service has long been the lift-and-shift destination for these workloads. But what about applications that depend on&amp;nbsp;&lt;STRONG&gt;Windows registry keys&lt;/STRONG&gt;,&amp;nbsp;&lt;STRONG&gt;COM components&lt;/STRONG&gt;,&amp;nbsp;&lt;STRONG&gt;SMTP relay&lt;/STRONG&gt;,&amp;nbsp;&lt;STRONG&gt;MSMQ queues&lt;/STRONG&gt;,&amp;nbsp;&lt;STRONG&gt;local file system access&lt;/STRONG&gt;, or&amp;nbsp;&lt;STRONG&gt;custom fonts&lt;/STRONG&gt;? These OS-level dependencies have historically been migration blockers — forcing teams into expensive re-architecture or keeping them anchored to VMs.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Managed Instance on Azure App Service&lt;/STRONG&gt;&amp;nbsp;changes this equation entirely. And the&amp;nbsp;&lt;STRONG&gt;IIS Migration MCP Server&lt;/STRONG&gt;&amp;nbsp;makes migration guided, intelligent, and safe — with AI agents that know what to ask, what to check, and what to generate at every step.&lt;/P&gt;
&lt;H2&gt;What Is Managed Instance on Azure App Service?&lt;/H2&gt;
&lt;P&gt;Managed Instance on App Service is Azure's answer to applications that need&amp;nbsp;&lt;STRONG&gt;OS-level customization&lt;/STRONG&gt;&amp;nbsp;beyond what standard App Service provides. It runs on the&amp;nbsp;&lt;STRONG&gt;PremiumV4 (PV4)&lt;/STRONG&gt;&amp;nbsp;SKU with&amp;nbsp;IsCustomMode=true, giving your app access to:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Capability&lt;/th&gt;&lt;th&gt;What It Enables&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Registry Adapters&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Redirect Windows Registry reads to Azure Key Vault secrets — no code changes&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Storage Adapters&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Mount Azure Files, local SSD, or private VNET storage as drive letters (e.g.,&amp;nbsp;D:\,&amp;nbsp;E:\)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;install.ps1 Startup Script&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Run PowerShell at instance startup to install Windows features (SMTP, MSMQ), register COM components, install MSI packages, deploy custom fonts&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Custom Mode&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Full access to the Windows instance for configuration beyond standard PaaS guardrails&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;STRONG&gt;The key constraint&lt;/STRONG&gt;: Managed Instance on App Service&amp;nbsp;&lt;STRONG&gt;requires PV4 SKU&lt;/STRONG&gt;&amp;nbsp;with&amp;nbsp;&lt;STRONG&gt;IsCustomMode=true&lt;/STRONG&gt;. No other SKU combination supports it.&lt;/P&gt;
&lt;H3&gt;Why Managed Instance Matters for Legacy Apps&lt;/H3&gt;
&lt;P&gt;Consider a classic enterprise ASP.NET application that:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Reads license keys from&amp;nbsp;HKLM\SOFTWARE\MyApp&amp;nbsp;in the Windows Registry&lt;/LI&gt;
&lt;LI&gt;Uses a COM component for PDF generation registered via&amp;nbsp;regsvr32&lt;/LI&gt;
&lt;LI&gt;Sends email through a local SMTP relay&lt;/LI&gt;
&lt;LI&gt;Writes reports to&amp;nbsp;D:\Reports\&amp;nbsp;on a local drive&lt;/LI&gt;
&lt;LI&gt;Uses a custom corporate font for PDF rendering&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;With standard App Service, you'd need to rewrite every one of these dependencies. With Managed Instance on App Service, you can:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Map registry reads to Key Vault secrets via&amp;nbsp;&lt;STRONG&gt;Registry Adapters&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;Mount Azure Files as&amp;nbsp;D:\&amp;nbsp;via&amp;nbsp;&lt;STRONG&gt;Storage Adapters&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;Enable SMTP Server via&amp;nbsp;&lt;STRONG&gt;install.ps1&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;Register the COM DLL via&amp;nbsp;&lt;STRONG&gt;install.ps1&lt;/STRONG&gt;&amp;nbsp;(regsvr32)&lt;/LI&gt;
&lt;LI&gt;Install the custom font via&amp;nbsp;&lt;STRONG&gt;install.ps1&lt;/STRONG&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Please note that when you migrate web applications to Managed Instance on Azure App Service, in the majority of cases &lt;STRONG&gt;zero application code changes are required&lt;/STRONG&gt;; depending on your specific web app, however, some code changes may still be necessary.&lt;/P&gt;
&lt;H3&gt;Microsoft Learn Resources&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/app-service/overview-managed-instance" target="_blank" rel="noopener"&gt;Managed Instance on App Service Overview&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/app-service/" target="_blank" rel="noopener"&gt;Azure App Service Documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/app-service/app-service-migration-assistant" target="_blank" rel="noopener"&gt;App Service Migration Assistant Tool&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/app-service/manage-move-across-regions" target="_blank" rel="noopener"&gt;Migrate to Azure App Service&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/app-service/overview-hosting-plans" target="_blank" rel="noopener"&gt;Azure App Service Plans Overview&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/app-service/app-service-configure-premium-tier" target="_blank" rel="noopener"&gt;PremiumV4 Pricing Tier&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/key-vault/general/overview" target="_blank" rel="noopener"&gt;Azure Key Vault&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/storage/files/storage-files-introduction" target="_blank" rel="noopener"&gt;Azure Files&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/migrate/appcat/dotnet" target="_blank" rel="noopener"&gt;AppCat (.NET) — Azure Migrate Application and Code Assessment&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Why Agentic Migration? The Case for AI-Guided IIS Migration&lt;/H2&gt;
&lt;H3&gt;The Problem with Traditional Migration&lt;/H3&gt;
&lt;P&gt;Microsoft provides excellent PowerShell scripts for IIS migration —&amp;nbsp;Get-SiteReadiness.ps1,&amp;nbsp;Get-SitePackage.ps1,&amp;nbsp;Generate-MigrationSettings.ps1, and&amp;nbsp;Invoke-SiteMigration.ps1. They're free, well-tested, and reliable. So why wrap them in an AI-powered system?&lt;/P&gt;
&lt;P&gt;Because&amp;nbsp;&lt;STRONG&gt;the scripts are powerful but not intelligent.&lt;/STRONG&gt;&amp;nbsp;They execute what you tell them to. They don't tell you&amp;nbsp;&lt;EM&gt;what&lt;/EM&gt;&amp;nbsp;to do.&lt;/P&gt;
&lt;P&gt;Here's what a traditional migration looks like:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Run readiness checks — get a wall of JSON with cryptic check IDs like&amp;nbsp;ContentSizeCheck,&amp;nbsp;ConfigErrorCheck,&amp;nbsp;GACCheck&lt;/LI&gt;
&lt;LI&gt;Manually interpret 15+ readiness checks per site across dozens of sites&lt;/LI&gt;
&lt;LI&gt;Decide whether each site needs Managed Instance or standard App Service (how?)&lt;/LI&gt;
&lt;LI&gt;Figure out which dependencies need registry adapters vs. storage adapters vs. install.ps1 (the "Managed Instance provisioning split")&lt;/LI&gt;
&lt;LI&gt;Write the install.ps1 script by hand for each combination of OS features&lt;/LI&gt;
&lt;LI&gt;Author ARM templates for adapter configurations (Key Vault references, storage mount specs, RBAC assignments)&lt;/LI&gt;
&lt;LI&gt;Wire together&amp;nbsp;PackageResults.json&amp;nbsp;→&amp;nbsp;MigrationSettings.json&amp;nbsp;with correct Managed Instance fields (Tier=PremiumV4,&amp;nbsp;IsCustomMode=true)&lt;/LI&gt;
&lt;LI&gt;Hope you didn't misconfigure anything before deploying to Azure&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Even experienced Azure engineers find this time-consuming, error-prone, and tedious — especially across a fleet of 20, 50, or 100+ IIS sites.&lt;/P&gt;
&lt;H3&gt;What Agentic Migration Changes&lt;/H3&gt;
&lt;P&gt;The IIS Migration MCP Server introduces an&amp;nbsp;&lt;STRONG&gt;AI orchestration layer&lt;/STRONG&gt;&amp;nbsp;that transforms this manual grind into a guided conversation:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Traditional Approach&lt;/th&gt;&lt;th&gt;Agentic Approach&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Read raw JSON output from scripts&lt;/td&gt;&lt;td&gt;AI summarizes readiness as tables with plain-English descriptions&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Memorize 15 check types and their severity&lt;/td&gt;&lt;td&gt;AI enriches each check with title, description, recommendation, and documentation links&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Manually decide Managed Instance vs App Service&lt;/td&gt;&lt;td&gt;recommend_target&amp;nbsp;analyzes all signals and recommends with confidence + reasoning&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Write install.ps1 from scratch&lt;/td&gt;&lt;td&gt;generate_install_script&amp;nbsp;builds it from detected features&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Author ARM templates manually&lt;/td&gt;&lt;td&gt;generate_adapter_arm_template&amp;nbsp;generates full templates with RBAC guidance&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Wire JSON artifacts between phases by hand&lt;/td&gt;&lt;td&gt;Agents pass&amp;nbsp;readiness_results_path&amp;nbsp;→&amp;nbsp;package_results_path&amp;nbsp;→&amp;nbsp;migration_settings_path&amp;nbsp;automatically&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Pray you set PV4 + IsCustomMode correctly&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Enforced automatically&lt;/STRONG&gt;&amp;nbsp;— every tool validates Managed Instance constraints&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Deploy and find out what broke&lt;/td&gt;&lt;td&gt;confirm_migration&amp;nbsp;presents a full cost/resource summary before touching Azure&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;STRONG&gt;The core value proposition: the AI knows the Managed Instance provisioning split.&lt;/STRONG&gt;&amp;nbsp;It knows that registry access needs an ARM template with Key Vault-backed adapters, while SMTP needs an&amp;nbsp;install.ps1&amp;nbsp;section enabling the Windows SMTP Server feature. You don't need to know this. The system detects it from your IIS configuration and AppCat analysis, then generates exactly the right artifacts.&lt;/P&gt;
&lt;H3&gt;Human-in-the-Loop Safety&lt;/H3&gt;
&lt;P&gt;Agentic doesn't mean autonomous. The system has explicit gates:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Phase 1 → Phase 2&lt;/STRONG&gt;: "Do you want to assess these sites, or skip to packaging?"&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Phase 3&lt;/STRONG&gt;: "Here's my recommendation — Managed Instance for Site A (COM + Registry), standard for Site B. Agree?"&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Phase 4&lt;/STRONG&gt;: "Review MigrationSettings.json before proceeding"&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Phase 5&lt;/STRONG&gt;: "This will create billable Azure resources. Type 'yes' to confirm"&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The AI accelerates the workflow; the human retains control over every decision.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Quick Start&lt;/H3&gt;
&lt;P&gt;&lt;SPAN class="lia-text-color-11"&gt;&lt;STRONG&gt;Clone and set up the MCP server git clone &lt;A href="https://github.com/&amp;lt;your-org&amp;gt;/iis-migration-mcp.git" target="_blank" rel="noopener"&gt;https://github.com//iis-migration-mcp.git&lt;/A&gt; &lt;/STRONG&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="lia-text-color-11"&gt;&lt;STRONG&gt;cd iis-migration-mcp python -m venv .venv .venv\Scripts\activate pip&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="lia-text-color-11"&gt;&lt;STRONG&gt;&amp;nbsp;install -r requirements.txt &lt;/STRONG&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="lia-text-color-11"&gt;&lt;STRONG&gt;# Download Microsoft's migration scripts (NOT included in this repo) # From: &lt;A href="https://appmigration.microsoft.com/api/download/psscripts/AppServiceMigrationScripts.zip" target="_blank" rel="noopener"&gt;https://appmigration.microsoft.com/api/download/psscripts/AppServiceMigrationScripts.zip&lt;/A&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="lia-text-color-11"&gt;&lt;STRONG&gt;# Unzip to C:\MigrationScripts (or your preferred path) &lt;/STRONG&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="lia-text-color-11"&gt;&lt;STRONG&gt;# Start using in VS Code with Copilot &lt;/STRONG&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="lia-text-color-11"&gt;&lt;STRONG&gt;# 1. Copy .vscode/mcp.json.example → .vscode/mcp.json &lt;/STRONG&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="lia-text-color-11"&gt;&lt;STRONG&gt;# 2. Open folder in VS Code &lt;/STRONG&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="lia-text-color-11"&gt;&lt;STRONG&gt;# 3. In Copilot Chat: "Configure scripts path to C:\MigrationScripts" &lt;/STRONG&gt;&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="lia-text-color-11"&gt;&lt;STRONG&gt;# 4. Then: @iis-migrate "Discover my IIS sites"&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;The server also works with&amp;nbsp;&lt;STRONG&gt;any MCP-compatible client&lt;/STRONG&gt; — Claude Desktop, Cursor, Copilot CLI, or custom integrations — via stdio transport.&lt;/P&gt;
&lt;H2&gt;Architecture: How the MCP Server Works&lt;/H2&gt;
&lt;P&gt;The system is built on the&amp;nbsp;&lt;STRONG&gt;Model Context Protocol (MCP)&lt;/STRONG&gt;, an open protocol that lets AI assistants like GitHub Copilot, Claude, or Cursor call external tools through a standardized interface.&lt;/P&gt;
&lt;PRE&gt;┌──────────────────────────────────────────────────────────────────┐
│ VS Code + Copilot Chat                                           │
│   @iis-migrate orchestrator agent                                │
│     ├── iis-discover (Phase 1)                                   │
│     ├── iis-assess (Phase 2)                                     │
│     ├── iis-recommend (Phase 3)                                  │
│     ├── iis-deploy-plan (Phase 4)                                │
│     └── iis-execute (Phase 5)                                    │
└─────────────┬────────────────────────────────────────────────────┘
              │ stdio JSON-RPC (MCP Transport)
              ▼
┌──────────────────────────────────────────────────────────────────┐
│ FastMCP Server (server.py)                                       │
│   13 Python Tool Modules (tools/*.py)                            │
│     └── ps_runner.py (Python → PowerShell bridge)                │
│           └── Downloaded PowerShell Scripts (user-configured)    │
│                 ├── Local IIS (discovery, packaging)             │
│                 └── Azure ARM API (deployment)                   │
└──────────────────────────────────────────────────────────────────┘&lt;/PRE&gt;
&lt;P&gt;The server exposes&amp;nbsp;&lt;STRONG&gt;13 MCP tools&lt;/STRONG&gt;&amp;nbsp;organized across&amp;nbsp;&lt;STRONG&gt;5 phases&lt;/STRONG&gt;, orchestrated by&amp;nbsp;&lt;STRONG&gt;6 Copilot agents&lt;/STRONG&gt;&amp;nbsp;(1 orchestrator + 5 specialist subagents).&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Important:&lt;/STRONG&gt;&amp;nbsp;The PowerShell migration scripts are&amp;nbsp;&lt;STRONG&gt;not included&lt;/STRONG&gt;&amp;nbsp;in this repository. Users must download them from&amp;nbsp;&lt;A href="https://appmigration.microsoft.com/api/download/psscripts/AppServiceMigrationScripts.zip" target="_blank" rel="noopener"&gt;appmigration.microsoft.com&lt;/A&gt;&amp;nbsp;and configure the path using the&amp;nbsp;configure_scripts_path&amp;nbsp;tool. This ensures you always use the latest version of Microsoft's scripts and avoids version mismatch issues.&lt;/P&gt;
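&lt;P&gt;Conceptually, the ps_runner.py bridge shown in the diagram above just builds a PowerShell invocation for one of the downloaded scripts and parses its JSON output. A minimal sketch, assuming a hypothetical &lt;STRONG&gt;-OutputPath&lt;/STRONG&gt; parameter (the real parameter names belong to Microsoft's scripts, and this is not the repository's actual code):&lt;/P&gt;

```python
import json
import subprocess
from pathlib import Path

def build_readiness_command(scripts_path: str, output_path: str) -> list:
    """Assemble the argument vector for Get-SiteReadiness.ps1.
    The -OutputPath parameter here is hypothetical, for illustration."""
    script = Path(scripts_path) / "Get-SiteReadiness.ps1"
    return [
        "powershell.exe", "-NoProfile", "-ExecutionPolicy", "Bypass",
        "-File", str(script), "-OutputPath", output_path,
    ]

def run_readiness(scripts_path: str, output_path: str) -> dict:
    """Invoke the script and parse its JSON results.
    Requires Windows with IIS installed and admin privileges."""
    subprocess.run(build_readiness_command(scripts_path, output_path), check=True)
    return json.loads(Path(output_path).read_text())
```

Keeping the command construction separate from execution is what lets the MCP tools validate paths (and surface clear errors) before ever shelling out to PowerShell.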
&lt;H2&gt;The 13 MCP Tools: Complete Reference&lt;/H2&gt;
&lt;H3&gt;Phase 0 — Setup&lt;/H3&gt;
&lt;H4&gt;configure_scripts_path&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Point the server to Microsoft's downloaded migration PowerShell scripts.&lt;/P&gt;
&lt;P&gt;Before any migration work, you need to download the scripts from&amp;nbsp;&lt;A href="https://appmigration.microsoft.com/api/download/psscripts/AppServiceMigrationScripts.zip" target="_blank" rel="noopener"&gt;appmigration.microsoft.com&lt;/A&gt;, unzip them, and tell the server where they are.&lt;/P&gt;
&lt;P&gt;"Configure scripts path to C:\MigrationScripts"&lt;/P&gt;
&lt;H3&gt;Phase 1 — Discovery&lt;/H3&gt;
&lt;H4&gt;1.&amp;nbsp;discover_iis_sites&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Scan the local IIS server and run readiness checks on every web site.&lt;/P&gt;
&lt;P&gt;This is the entry point for every migration. It calls&amp;nbsp;Get-SiteReadiness.ps1&amp;nbsp;under the hood, which:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Enumerates all IIS web sites, application pools, bindings, and virtual directories&lt;/LI&gt;
&lt;LI&gt;Runs&amp;nbsp;&lt;STRONG&gt;15 readiness checks&lt;/STRONG&gt;&amp;nbsp;per site (config errors, HTTPS bindings, non-HTTP protocols, TCP ports, location tags, app pool settings, app pool identity, virtual directories, content size, global modules, ISAPI filters, authentication, framework version, connection strings, and more)&lt;/LI&gt;
&lt;LI&gt;Detects source code artifacts (.sln,&amp;nbsp;.csproj,&amp;nbsp;.cs,&amp;nbsp;.vb) near site physical paths&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Output&lt;/STRONG&gt;:&amp;nbsp;ReadinessResults.json&amp;nbsp;with per-site status:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Status&lt;/th&gt;&lt;th&gt;Meaning&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;READY&lt;/td&gt;&lt;td&gt;No issues detected — clear for migration&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;READY_WITH_WARNINGS&lt;/td&gt;&lt;td&gt;Minor issues that won't block migration&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;READY_WITH_ISSUES&lt;/td&gt;&lt;td&gt;Non-fatal issues that need attention&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;BLOCKED&lt;/td&gt;&lt;td&gt;Fatal issues (e.g., content &amp;gt; 2GB) — cannot migrate as-is&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;STRONG&gt;Requires&lt;/STRONG&gt;: Administrator privileges, IIS installed.&lt;/P&gt;
&lt;H4&gt;2.&amp;nbsp;choose_assessment_mode&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Route each discovered site into the appropriate next step.&lt;/P&gt;
&lt;P&gt;After discovery, you decide the path for each site:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;assess_all&lt;/STRONG&gt;: Run detailed assessment on all non-blocked sites&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;package_and_migrate&lt;/STRONG&gt;: Skip assessment, proceed directly to packaging (for sites you already know well)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The tool classifies each site into one of five actions:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;assess_config_only&amp;nbsp;— IIS/web.config analysis&lt;/LI&gt;
&lt;LI&gt;assess_config_and_source&amp;nbsp;— Config + AppCat source code analysis (when source is detected)&lt;/LI&gt;
&lt;LI&gt;package&amp;nbsp;— Skip to packaging&lt;/LI&gt;
&lt;LI&gt;blocked&amp;nbsp;— Fatal errors, cannot proceed&lt;/LI&gt;
&lt;LI&gt;skip&amp;nbsp;— User chose to exclude&lt;/LI&gt;
&lt;/UL&gt;
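&lt;P&gt;The routing above can be sketched as a small function. The status and action names follow the tables in this post; the decision rules themselves are an illustrative assumption, not the tool's actual implementation:&lt;/P&gt;

```python
# Illustrative sketch of the per-site routing logic. The statuses and
# action names come from this post's tables; the rule ordering is an
# assumption for illustration.

def route_site(status: str, has_source: bool, mode: str,
               excluded: bool = False) -> str:
    """Map a discovered site to one of the five actions."""
    if excluded:
        return "skip"            # user chose to exclude the site
    if status == "BLOCKED":
        return "blocked"         # fatal issues, cannot proceed
    if mode == "package_and_migrate":
        return "package"         # skip assessment entirely
    # mode == "assess_all": pick assessment depth based on source detection
    return "assess_config_and_source" if has_source else "assess_config_only"
```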
&lt;H3&gt;Phase 2 — Assessment&lt;/H3&gt;
&lt;H4&gt;3.&amp;nbsp;assess_site_readiness&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Get a detailed, human-readable readiness assessment for a specific site.&lt;/P&gt;
&lt;P&gt;Takes the raw readiness data from Phase 1 and enriches each check with:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Title&lt;/STRONG&gt;: Plain-English name (e.g., "Global Assembly Cache (GAC) Dependencies")&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Description&lt;/STRONG&gt;: What the check found and why it matters&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Recommendation&lt;/STRONG&gt;: Specific guidance on how to resolve the issue&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Category&lt;/STRONG&gt;: Grouping (Configuration, Security, Compatibility)&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Documentation Link&lt;/STRONG&gt;: Microsoft Learn URL for further reading&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This enrichment comes from&amp;nbsp;WebAppCheckResources.resx, an XML resource file that maps check IDs to detailed metadata. Without this tool, you'd see&amp;nbsp;GACCheck: FAIL&amp;nbsp;— with it, you see the full context.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Output&lt;/STRONG&gt;: Overall status, enriched failed/warning checks, framework version, pipeline mode, binding details.&lt;/P&gt;
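&lt;P&gt;The enrichment lookup can be sketched in a few lines: a .resx file is plain XML of &lt;EM&gt;data/value&lt;/EM&gt; pairs, so mapping a check ID to its metadata is a simple parse. The &lt;EM&gt;CheckId.Field&lt;/EM&gt; key scheme and sample values below are assumptions for illustration; the real resource file may use a different naming convention:&lt;/P&gt;

```python
# Minimal sketch of the enrichment step: look up human-readable metadata
# for a check ID in a .resx resource file. The "<CheckId>.<Field>" key
# scheme and sample content are illustrative assumptions.
import xml.etree.ElementTree as ET

SAMPLE_RESX = """<root>
  <data name="GACCheck.Title"><value>Global Assembly Cache (GAC) Dependencies</value></data>
  <data name="GACCheck.Recommendation"><value>Install GAC assemblies via install.ps1</value></data>
</root>"""

def enrich(check_id: str, resx_xml: str) -> dict:
    """Return {field: text} for every resource keyed '<check_id>.<field>'."""
    root = ET.fromstring(resx_xml)
    prefix = check_id + "."
    return {
        d.get("name")[len(prefix):]: d.findtext("value")
        for d in root.iter("data")
        if d.get("name", "").startswith(prefix)
    }
```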
&lt;H4&gt;4.&amp;nbsp;assess_source_code&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Analyze an&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/azure/migrate/appcat/dotnet" target="_blank" rel="noopener"&gt;Azure Migrate application and code assessment for .NET&lt;/A&gt;&amp;nbsp;JSON report to identify Managed Instance-relevant source code dependencies.&lt;/P&gt;
&lt;P&gt;If your application has source code and you've run the assessment tool against it, this tool parses the results and maps findings to migration actions:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Dependency Detected&lt;/th&gt;&lt;th&gt;Migration Action&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Windows Registry access&lt;/td&gt;&lt;td&gt;Registry Adapter (ARM template)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Local file system I/O / hardcoded paths&lt;/td&gt;&lt;td&gt;Storage Adapter (ARM template)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;SMTP usage&lt;/td&gt;&lt;td&gt;install.ps1 (SMTP Server feature)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;COM Interop&lt;/td&gt;&lt;td&gt;install.ps1 (regsvr32/RegAsm)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Global Assembly Cache (GAC)&lt;/td&gt;&lt;td&gt;install.ps1 (GAC install)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Message Queuing (MSMQ)&lt;/td&gt;&lt;td&gt;install.ps1 (MSMQ feature)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Certificate access&lt;/td&gt;&lt;td&gt;Key Vault integration&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;The tool matches rules from the assessment output against known Managed Instance-relevant patterns. For a complete list of rules and categories, see&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/dotnet/azure/migration/appcat/interpret-results" target="_blank" rel="noopener"&gt;Interpret the analysis results&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Output&lt;/STRONG&gt;: Issues categorized as mandatory/optional/potential, plus install_script_features and adapter_features lists that feed directly into Phase 3 tools.&lt;/P&gt;
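&lt;P&gt;Conceptually, producing those two lists is a bucketing step over the detected dependencies. The short labels below stand in for the real AppCat rule IDs, which the tool matches internally; the split itself follows the table above:&lt;/P&gt;

```python
# Sketch of splitting detected dependencies into the two provisioning
# buckets described above. The labels are illustrative stand-ins for
# real AppCat rule IDs.

INSTALL_SCRIPT = {"SMTP", "COM", "GAC", "MSMQ"}   # handled by install.ps1
ADAPTER = {"Registry", "FileSystem"}              # handled by the ARM template

def split_features(detected: list) -> dict:
    """Return the install_script_features / adapter_features lists."""
    return {
        "install_script_features": sorted(d for d in detected if d in INSTALL_SCRIPT),
        "adapter_features": sorted(d for d in detected if d in ADAPTER),
    }
```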
&lt;H3&gt;Phase 3 — Recommendation &amp;amp; Provisioning&lt;/H3&gt;
&lt;H4&gt;5.&amp;nbsp;suggest_migration_approach&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Recommend the right migration tool/approach for the scenario.&lt;/P&gt;
&lt;P&gt;This is a routing tool that considers:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Source code available?&lt;/STRONG&gt;&amp;nbsp;→ Recommend the App Modernization MCP server for code-level changes&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;No source code?&lt;/STRONG&gt;&amp;nbsp;→ Recommend this IIS Migration MCP (lift-and-shift)&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;OS customization needed?&lt;/STRONG&gt;&amp;nbsp;→ Highlight Managed Instance on App Service as the target&lt;/LI&gt;
&lt;/UL&gt;
&lt;H4&gt;6.&amp;nbsp;recommend_target&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Recommend the Azure deployment target for each site based on all assessment data.&lt;/P&gt;
&lt;P&gt;This is the intelligence center of the system. It analyzes config assessments and source code findings to recommend:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Target&lt;/th&gt;&lt;th&gt;When Recommended&lt;/th&gt;&lt;th&gt;SKU&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;MI_AppService&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Registry, COM, MSMQ, SMTP, local file I/O, GAC, or Windows Service dependencies detected&lt;/td&gt;&lt;td&gt;PremiumV4 (PV4)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;AppService&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Standard web app, no OS-level dependencies&lt;/td&gt;&lt;td&gt;PremiumV2 (PV2)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;ContainerApps&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Microservices architecture or container-first preference&lt;/td&gt;&lt;td&gt;N/A&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;Each recommendation comes with:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Confidence&lt;/STRONG&gt;:&amp;nbsp;high&amp;nbsp;or&amp;nbsp;medium&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Reasoning&lt;/STRONG&gt;: Full explanation of why this target was chosen&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Managed Instance reasons&lt;/STRONG&gt;: Specific dependencies that require Managed Instance&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Blockers&lt;/STRONG&gt;: Issues that prevent migration entirely&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;install_script_features&lt;/STRONG&gt;: What the install.ps1 needs to enable&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;adapter_features&lt;/STRONG&gt;: What the ARM template needs to configure&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Provisioning guidance&lt;/STRONG&gt;: Step-by-step instructions for what to do next&lt;/LI&gt;
&lt;/UL&gt;
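&lt;P&gt;The core of the target decision can be sketched as follows. This is a simplified model of the table above — the real tool weighs more signals (confidence scoring, blockers, provisioning guidance) than this illustration shows:&lt;/P&gt;

```python
# Illustrative decision sketch for recommend_target, following the table
# above. The trigger set and structure are assumptions, not the tool's
# actual rule engine.

MI_TRIGGERS = {"Registry", "COM", "MSMQ", "SMTP",
               "LocalFileIO", "GAC", "WindowsService"}

def recommend_target(dependencies: set, prefers_containers: bool = False) -> dict:
    if prefers_containers:
        return {"target": "ContainerApps", "sku": None, "mi_reasons": []}
    mi_reasons = sorted(dependencies & MI_TRIGGERS)
    if mi_reasons:
        # OS-level dependencies force Managed Instance on PremiumV4
        return {"target": "MI_AppService", "sku": "PremiumV4",
                "mi_reasons": mi_reasons}
    return {"target": "AppService", "sku": "PremiumV2", "mi_reasons": []}
```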
&lt;H4&gt;7.&amp;nbsp;generate_install_script&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Generate an&amp;nbsp;install.ps1&amp;nbsp;PowerShell script for OS-level feature enablement on Managed Instance.&lt;/P&gt;
&lt;P&gt;This handles the&amp;nbsp;&lt;STRONG&gt;OS-level&lt;/STRONG&gt;&amp;nbsp;side of the Managed Instance provisioning split. It generates a startup script that includes sections for:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Feature&lt;/th&gt;&lt;th&gt;What the Script Does&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;SMTP&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Install-WindowsFeature SMTP-Server, configure smart host relay&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;MSMQ&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Install MSMQ, create application queues&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;COM/MSI&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Run&amp;nbsp;msiexec&amp;nbsp;for MSI installers,&amp;nbsp;regsvr32/RegAsm&amp;nbsp;for COM registration&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Crystal Reports&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Install SAP Crystal Reports runtime MSI&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Custom Fonts&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Copy&amp;nbsp;.ttf/.otf&amp;nbsp;to&amp;nbsp;C:\Windows\Fonts, register in registry&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;The script can auto-detect needed features from config and source assessments, or you can specify them manually.&lt;/P&gt;
&lt;H4&gt;8.&amp;nbsp;generate_adapter_arm_template&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Generate an ARM template for Managed Instance registry and storage adapters.&lt;/P&gt;
&lt;P&gt;This handles the&amp;nbsp;&lt;STRONG&gt;platform-level&lt;/STRONG&gt;&amp;nbsp;side of the Managed Instance provisioning split. It generates a deployable ARM template that configures:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Registry Adapters&lt;/STRONG&gt;&amp;nbsp;(Key Vault-backed):&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Map Windows Registry paths (e.g.,&amp;nbsp;HKLM\SOFTWARE\MyApp\LicenseKey) to Key Vault secrets&lt;/LI&gt;
&lt;LI&gt;Your application reads the registry as before; Managed Instance redirects the read to Key Vault transparently&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Storage Adapters&lt;/STRONG&gt;&amp;nbsp;(three types):&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Type&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;th&gt;Credentials&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;AzureFiles&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Mount Azure Files SMB share as a drive letter&lt;/td&gt;&lt;td&gt;Storage account key in Key Vault&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Custom&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Mount storage over private endpoint via VNET&lt;/td&gt;&lt;td&gt;Requires VNET integration&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;LocalStorage&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Allocate local SSD on the Managed Instance as a drive letter&lt;/td&gt;&lt;td&gt;None needed&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;The template also includes:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Managed Identity configuration&lt;/LI&gt;
&lt;LI&gt;RBAC role assignments guidance (Key Vault Secrets User, Storage File Data SMB Share Contributor, etc.)&lt;/LI&gt;
&lt;LI&gt;Deployment CLI commands ready to copy-paste&lt;/LI&gt;
&lt;/UL&gt;
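&lt;P&gt;As a rough mental model, the registry-adapter portion of the template is a list of path-to-secret mappings. The property names in this sketch are hypothetical — consult the generated template for the actual schema:&lt;/P&gt;

```python
# Sketch of turning registry-path mappings into the adapter entries the
# ARM template carries. Property names here are hypothetical, chosen only
# to illustrate the shape of the mapping.

def registry_adapter_entries(mappings: dict) -> list:
    """mappings: Windows registry path -> Key Vault secret URI."""
    return [
        {"registryPath": path, "keyVaultSecretUri": uri}
        for path, uri in sorted(mappings.items())
    ]
```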
&lt;H3&gt;Phase 4 — Deployment Planning &amp;amp; Packaging&lt;/H3&gt;
&lt;H4&gt;9.&amp;nbsp;plan_deployment&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Plan the Azure App Service deployment — plans, SKUs, site assignments.&lt;/P&gt;
&lt;P&gt;Collects your Azure details (subscription, resource group, region) and creates a validated deployment plan:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Assigns sites to App Service Plans&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Enforces PV4 + IsCustomMode=true for Managed Instance&lt;/STRONG&gt;&amp;nbsp;— won't let you accidentally use the wrong SKU&lt;/LI&gt;
&lt;LI&gt;Supports&amp;nbsp;single_plan&amp;nbsp;(all sites on one plan) or&amp;nbsp;multi_plan&amp;nbsp;(separate plans)&lt;/LI&gt;
&lt;LI&gt;Optionally queries Azure for existing Managed Instance plans you can reuse&lt;/LI&gt;
&lt;/UL&gt;
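&lt;P&gt;The SKU guardrail amounts to a simple validation over the plan definition. Field names here mirror the MigrationSettings.json example later in this post; the validation logic itself is an illustrative sketch:&lt;/P&gt;

```python
# Sketch of the guardrail plan_deployment applies: any site targeted at
# Managed Instance must land on a PremiumV4 plan with IsCustomMode=true.
# Field names mirror the MigrationSettings example; logic is illustrative.

def validate_plan(plan: dict, has_mi_sites: bool) -> list:
    """Return a list of validation errors (empty if the plan is valid)."""
    errors = []
    if has_mi_sites:
        if plan.get("Tier") != "PremiumV4":
            errors.append("Managed Instance sites require the PremiumV4 tier")
        if plan.get("IsCustomMode") is not True:
            errors.append("Managed Instance sites require IsCustomMode=true")
    return errors
```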
&lt;H4&gt;10.&amp;nbsp;package_site&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Package IIS site content into ZIP files for deployment.&lt;/P&gt;
&lt;P&gt;Calls&amp;nbsp;Get-SitePackage.ps1&amp;nbsp;to:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Compress site binaries +&amp;nbsp;web.config&amp;nbsp;into deployment-ready ZIPs&lt;/LI&gt;
&lt;LI&gt;Optionally inject&amp;nbsp;install.ps1&amp;nbsp;into the package (so it deploys alongside the app)&lt;/LI&gt;
&lt;LI&gt;Handle sites with non-fatal issues (configurable)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Size limit&lt;/STRONG&gt;: 2 GB per site (enforced by System.IO.Compression).&lt;/P&gt;
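&lt;P&gt;A pre-flight size gate for that limit is easy to sketch: walk the site's physical path, sum the file sizes, and refuse anything over 2 GB before attempting to build the ZIP:&lt;/P&gt;

```python
# Sketch of a pre-packaging size gate for the 2 GB ZIP limit mentioned
# above. Illustrative only; the real tooling performs its own checks.
import os

LIMIT_BYTES = 2 * 1024 ** 3  # 2 GB

def site_content_size(root: str) -> int:
    """Total size in bytes of all files under the site's physical path."""
    return sum(
        os.path.getsize(os.path.join(dirpath, name))
        for dirpath, _, names in os.walk(root)
        for name in names
    )

def can_package(root: str) -> bool:
    return site_content_size(root) <= LIMIT_BYTES
```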
&lt;H4&gt;11.&amp;nbsp;generate_migration_settings&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Create the&amp;nbsp;MigrationSettings.json&amp;nbsp;deployment configuration.&lt;/P&gt;
&lt;P&gt;This is the final configuration artifact. It calls&amp;nbsp;Generate-MigrationSettings.ps1&amp;nbsp;and then post-processes the output to inject Managed Instance-specific fields:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Important&lt;/STRONG&gt;: The Managed Instance on App Service Plan is&amp;nbsp;&lt;STRONG&gt;not automatically created&lt;/STRONG&gt;&amp;nbsp;by the migration tools. You must&amp;nbsp;&lt;STRONG&gt;pre-create the Managed Instance on App Service Plan&lt;/STRONG&gt;&amp;nbsp;(PV4 SKU with&amp;nbsp;IsCustomMode=true) in the Azure portal or via CLI before generating migration settings. When running&amp;nbsp;generate_migration_settings, provide the&amp;nbsp;&lt;STRONG&gt;name of your existing Managed Instance plan&lt;/STRONG&gt; so the settings file references it correctly.&lt;/P&gt;
&lt;P&gt;{ "AppServicePlan": "mi-plan-eastus", "Tier": "PremiumV4", "IsCustomMode": true, "InstallScriptPath": "install.ps1", "Region": "eastus", "Sites": [ { "IISSiteName": "MyLegacyApp", "AzureSiteName": "mylegacyapp-azure", "SitePackagePath": "packagedsites/MyLegacyApp_Content.zip" } ] }&lt;/P&gt;
&lt;H3&gt;Phase 5 — Execution&lt;/H3&gt;
&lt;H4&gt;12.&amp;nbsp;confirm_migration&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Present a full migration summary and require explicit human confirmation.&lt;/P&gt;
&lt;P&gt;Before touching Azure, this tool displays:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Total plans and sites to be created&lt;/LI&gt;
&lt;LI&gt;SKU and pricing tier per plan&lt;/LI&gt;
&lt;LI&gt;Whether Managed Instance is configured&lt;/LI&gt;
&lt;LI&gt;Cost warning for PV4 pricing&lt;/LI&gt;
&lt;LI&gt;Resource group, region, and subscription details&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Nothing proceeds until the user explicitly confirms.&lt;/STRONG&gt;&lt;/P&gt;
&lt;H4&gt;13.&amp;nbsp;migrate_sites&lt;/H4&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;: Deploy everything to Azure App Service.&amp;nbsp;&lt;STRONG&gt;This creates billable resources.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Calls&amp;nbsp;Invoke-SiteMigration.ps1, which:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Sets Azure subscription context&lt;/LI&gt;
&lt;LI&gt;Creates/validates resource groups&lt;/LI&gt;
&lt;LI&gt;Creates App Service Plans (PV4 with IsCustomMode for Managed Instance)&lt;/LI&gt;
&lt;LI&gt;Creates Web Apps&lt;/LI&gt;
&lt;LI&gt;Configures .NET version, 32-bit mode, pipeline mode from the original IIS settings&lt;/LI&gt;
&lt;LI&gt;Sets up virtual directories and applications&lt;/LI&gt;
&lt;LI&gt;Disables basic authentication (FTP + SCM) for security&lt;/LI&gt;
&lt;LI&gt;Deploys ZIP packages via Azure REST API&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&lt;STRONG&gt;Output&lt;/STRONG&gt;:&amp;nbsp;MigrationResults.json&amp;nbsp;with per-site Azure URLs, Resource IDs, and deployment status.&lt;/P&gt;
&lt;H2&gt;The 6 Copilot Agents&lt;/H2&gt;
&lt;P&gt;The MCP tools are orchestrated by a team of specialized Copilot agents — each responsible for a specific phase of the migration lifecycle.&lt;/P&gt;
&lt;H3&gt;@iis-migrate&amp;nbsp;— The Orchestrator&lt;/H3&gt;
&lt;P&gt;The root agent that guides the entire migration. It:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Tracks progress across all 5 phases using a todo list&lt;/LI&gt;
&lt;LI&gt;Delegates work to specialist subagents&lt;/LI&gt;
&lt;LI&gt;Gates between phases — asks before transitioning&lt;/LI&gt;
&lt;LI&gt;Enforces the Managed Instance constraint (PV4 + IsCustomMode) at every decision point&lt;/LI&gt;
&lt;LI&gt;Never skips the Phase 5 confirmation gate&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Usage&lt;/STRONG&gt;: Open Copilot Chat and type&amp;nbsp;@iis-migrate I want to migrate my IIS applications to Azure&lt;/P&gt;
&lt;H3&gt;iis-discover&amp;nbsp;— Discovery Specialist&lt;/H3&gt;
&lt;P&gt;Handles Phase 1. Runs&amp;nbsp;discover_iis_sites, presents a summary table of all sites with their readiness status, and asks whether to assess or skip to packaging. Returns&amp;nbsp;readiness_results_path&amp;nbsp;and per-site routing plans.&lt;/P&gt;
&lt;H3&gt;iis-assess&amp;nbsp;— Assessment Specialist&lt;/H3&gt;
&lt;P&gt;Handles Phase 2. Runs&amp;nbsp;assess_site_readiness&amp;nbsp;for every site, and&amp;nbsp;assess_source_code&amp;nbsp;when AppCat results are available. Merges findings, highlights Managed Instance-relevant issues, and produces the adapter/install features lists that drive Phase 3.&lt;/P&gt;
&lt;H3&gt;iis-recommend&amp;nbsp;— Recommendation Specialist&lt;/H3&gt;
&lt;P&gt;Handles Phase 3. Runs&amp;nbsp;recommend_target&amp;nbsp;for each site, then conditionally generates&amp;nbsp;install.ps1&amp;nbsp;and ARM adapter templates. Presents all recommendations with confidence levels and reasoning, and allows you to edit generated artifacts.&lt;/P&gt;
&lt;H3&gt;iis-deploy-plan&amp;nbsp;— Deployment Planning Specialist&lt;/H3&gt;
&lt;P&gt;Handles Phase 4. Collects Azure details, runs&amp;nbsp;plan_deployment,&amp;nbsp;package_site, and&amp;nbsp;generate_migration_settings. Validates Managed Instance configuration, allows review and editing of MigrationSettings.json.&amp;nbsp;&lt;STRONG&gt;Does not execute migration.&lt;/STRONG&gt;&lt;/P&gt;
&lt;H3&gt;iis-execute&amp;nbsp;— Execution Specialist&lt;/H3&gt;
&lt;P&gt;Handles Phase 5 only. Runs&amp;nbsp;confirm_migration&amp;nbsp;to present the final summary, then&amp;nbsp;&lt;STRONG&gt;only proceeds with&amp;nbsp;migrate_sites&amp;nbsp;after receiving explicit "yes" confirmation.&lt;/STRONG&gt;&amp;nbsp;Reports results with Azure URLs and deployment status.&lt;/P&gt;
&lt;H2&gt;The Managed Instance Provisioning Split: A Critical Concept&lt;/H2&gt;
&lt;P&gt;One of the most important ideas Managed Instance introduces is the&amp;nbsp;&lt;STRONG&gt;provisioning split&lt;/STRONG&gt;&amp;nbsp;— the division of OS dependencies into two categories that are configured through different mechanisms:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;ARM Template (Platform-Level)&lt;/th&gt;&lt;th&gt;install.ps1 (OS-Level)&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Registry Adapters → Key Vault secrets&lt;/td&gt;&lt;td&gt;COM/MSI Registration → regsvr32, RegAsm, msiexec&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Storage Mounts → Azure Files, Local SSD, VNET private storage&lt;/td&gt;&lt;td&gt;SMTP Server Feature → Install-WindowsFeature&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;td&gt;MSMQ → Message queue setup&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;td&gt;Crystal Reports Runtime → SAP MSI installer&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;td&gt;Custom Fonts → Copy to C:\Windows\Fonts&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;The MCP server handles this split automatically:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;assess_source_code&amp;nbsp;detects which dependencies fall into which category&lt;/LI&gt;
&lt;LI&gt;recommend_target&amp;nbsp;reports both&amp;nbsp;adapter_features&amp;nbsp;and&amp;nbsp;install_script_features&lt;/LI&gt;
&lt;LI&gt;generate_adapter_arm_template&amp;nbsp;builds the ARM template for platform features&lt;/LI&gt;
&lt;LI&gt;generate_install_script&amp;nbsp;builds the PowerShell startup script for OS features&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;You don't need to remember which goes where — the system decides and generates the right artifacts.&lt;/P&gt;
&lt;H2&gt;End-to-End Walkthrough: From Discovery to Running on Managed Instance&lt;/H2&gt;
&lt;P&gt;Here's what a complete migration conversation looks like:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;You&lt;/STRONG&gt;:&amp;nbsp;&lt;EM&gt;"@iis-migrate I want to migrate my IIS applications to Azure"&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Phase 1 — Discovery&lt;/STRONG&gt;:&lt;/P&gt;
&lt;P&gt;Agent runs&amp;nbsp;discover_iis_sites, presents a table:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Site&lt;/th&gt;&lt;th&gt;Status&lt;/th&gt;&lt;th&gt;Framework&lt;/th&gt;&lt;th&gt;Source Code?&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;HRPortal&lt;/td&gt;&lt;td&gt;READY_WITH_ISSUES&lt;/td&gt;&lt;td&gt;v4.8&lt;/td&gt;&lt;td&gt;Yes (.sln found)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;PayrollAPI&lt;/td&gt;&lt;td&gt;READY&lt;/td&gt;&lt;td&gt;v4.8&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;IntranetCMS&lt;/td&gt;&lt;td&gt;BLOCKED (&amp;gt;2GB)&lt;/td&gt;&lt;td&gt;v4.7.2&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;STRONG&gt;Phase 2 — Assessment&lt;/STRONG&gt;:&lt;/P&gt;
&lt;P&gt;Agent runs&amp;nbsp;assess_site_readiness&amp;nbsp;for HRPortal — finds GACCheck and RegistryCheck failures. Runs&amp;nbsp;assess_source_code&amp;nbsp;using AppCat report — confirms COM interop, registry access, and SMTP usage.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Phase 3 — Recommendation&lt;/STRONG&gt;:&lt;/P&gt;
&lt;P&gt;Agent runs&amp;nbsp;recommend_target:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;HRPortal → MI_AppService&lt;/STRONG&gt;&amp;nbsp;(high confidence) — COM, registry, SMTP dependencies&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;PayrollAPI → AppService&lt;/STRONG&gt;&amp;nbsp;(high confidence) — no OS dependencies&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Generates&amp;nbsp;install.ps1&amp;nbsp;for HRPortal (SMTP + COM sections). Generates ARM template with registry adapter (Key Vault-backed) for HRPortal.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Phase 4 — Deployment Planning&lt;/STRONG&gt;:&lt;/P&gt;
&lt;P&gt;Agent collects subscription/RG/region, validates PV4 availability. Packages both sites. Generates MigrationSettings.json with two plans:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;mi-plan-hrportal&amp;nbsp;(PremiumV4, IsCustomMode=true) — HRPortal&lt;/LI&gt;
&lt;LI&gt;std-plan-payrollapi&amp;nbsp;(PremiumV2) — PayrollAPI&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&lt;STRONG&gt;Phase 5 — Execution&lt;/STRONG&gt;:&lt;/P&gt;
&lt;P&gt;Agent shows full summary with cost projection. You type "yes". Sites deploy. You get Azure URLs within minutes.&lt;/P&gt;
&lt;H2&gt;Prerequisites &amp;amp; Setup&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Requirement&lt;/th&gt;&lt;th&gt;Purpose&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Windows Server with IIS&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Source server for discovery and packaging&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;PowerShell 5.1&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Runs migration scripts (ships with Windows)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Python 3.10+&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;MCP server runtime&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Administrator privileges&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Required for IIS discovery, packaging, and migration&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Azure subscription&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Target for deployment (execution phase only)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Azure PowerShell (Az&amp;nbsp;module)&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Deploy to Azure (execution phase only)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;&lt;A href="https://appmigration.microsoft.com/api/download/psscripts/AppServiceMigrationScripts.zip" target="_blank" rel="noopener"&gt;Migration Scripts ZIP&lt;/A&gt;&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Microsoft's PowerShell migration scripts&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/migrate/appcat/dotnet" target="_blank" rel="noopener"&gt;AppCat CLI&lt;/A&gt;&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Source code analysis (optional)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;&lt;A href="https://pypi.org/project/mcp/" target="_blank" rel="noopener"&gt;FastMCP&lt;/A&gt;&lt;/STRONG&gt;&amp;nbsp;(mcp[cli]&amp;gt;=1.0.0)&lt;/td&gt;&lt;td&gt;MCP server framework&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H2&gt;Data Flow &amp;amp; Artifacts&lt;/H2&gt;
&lt;P&gt;Every phase produces JSON artifacts that chain into the next phase:&lt;/P&gt;
&lt;PRE&gt;Phase 1: discover_iis_sites ──────────→ ReadinessResults.json
                                              │
Phase 2: assess_site_readiness ◄──────────────┘
         assess_source_code ──────────→ Assessment JSONs
                                              │
Phase 3: recommend_target ◄───────────────────┘
         generate_install_script ─────→ install.ps1
         generate_adapter_arm ────────→ mi-adapters-template.json
                                              │
Phase 4: package_site ◄───────────────────────┘
                      ────────────────→ PackageResults.json + site ZIPs
         generate_migration_settings ─→ MigrationSettings.json
                                              │
Phase 5: confirm_migration ◄──────────────────┘
         migrate_sites ───────────────→ MigrationResults.json
                                              │
                                              ▼
                                   Apps live on Azure
                                   (*.azurewebsites.net)&lt;/PRE&gt;
&lt;P&gt;Each artifact is inspectable, editable, and auditable — providing a complete record of what was assessed, recommended, and deployed.&lt;/P&gt;
&lt;H2&gt;Error Handling&lt;/H2&gt;
&lt;P&gt;The MCP server classifies errors into actionable categories:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Error&lt;/th&gt;&lt;th&gt;Cause&lt;/th&gt;&lt;th&gt;Resolution&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;ELEVATION_REQUIRED&lt;/td&gt;&lt;td&gt;Not running as Administrator&lt;/td&gt;&lt;td&gt;Restart VS Code / terminal as Admin&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;IIS_NOT_FOUND&lt;/td&gt;&lt;td&gt;IIS or WebAdministration module missing&lt;/td&gt;&lt;td&gt;Install IIS role + WebAdministration&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AZURE_NOT_AUTHENTICATED&lt;/td&gt;&lt;td&gt;Not logged into Azure PowerShell&lt;/td&gt;&lt;td&gt;Run&amp;nbsp;Connect-AzAccount&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;SCRIPT_NOT_FOUND&lt;/td&gt;&lt;td&gt;Migration scripts path not configured&lt;/td&gt;&lt;td&gt;Run&amp;nbsp;configure_scripts_path&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;SCRIPT_TIMEOUT&lt;/td&gt;&lt;td&gt;PowerShell script exceeded time limit&lt;/td&gt;&lt;td&gt;Check IIS server responsiveness&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;OUTPUT_NOT_FOUND&lt;/td&gt;&lt;td&gt;Expected JSON output wasn't created&lt;/td&gt;&lt;td&gt;Verify script execution succeeded&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H2&gt;Conclusion&lt;/H2&gt;
&lt;P&gt;The IIS Migration MCP Server turns what used to be a multi-week, expert-driven project into a guided conversation. It combines Microsoft's battle-tested migration PowerShell scripts with AI orchestration that understands the nuances of Managed Instance on App Service — the provisioning split, the PV4 constraint, the adapter configurations, and the OS-level customizations.&lt;/P&gt;
&lt;P&gt;Whether you're migrating 1 site or 10, agentic migration reduces risk, eliminates guesswork, and produces auditable artifacts at every step. The human stays in control; the AI handles the complexity.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Get started&lt;/STRONG&gt;: Download the&amp;nbsp;&lt;A href="https://appmigration.microsoft.com/api/download/psscripts/AppServiceMigrationScripts.zip" target="_blank" rel="noopener"&gt;migration scripts&lt;/A&gt;, set up the MCP server, and ask&amp;nbsp;@iis-migrate&amp;nbsp;to discover your IIS sites. The agents will take it from there.&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;This project is compatible with any MCP-enabled client: VS Code GitHub Copilot, Claude Desktop, Cursor, and more. The intelligence travels with the server, not the client.&lt;/EM&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 06 Apr 2026 23:58:35 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/agentic-iis-migration-to-managed-instance-on-azure-app-service/ba-p/4508969</guid>
      <dc:creator>Gaurav-Seth</dc:creator>
      <dc:date>2026-04-06T23:58:35Z</dc:date>
    </item>
    <item>
      <title>Azure Red Hat OpenShift: Managed Identity and Workload Identity now generally available</title>
      <link>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/azure-red-hat-openshift-managed-identity-and-workload-identity/ba-p/4504940</link>
      <description>&lt;P&gt;Azure Red Hat OpenShift now supports managed identities and workload identities as a generally available capability, so you can run OpenShift clusters and applications on Azure without long-lived service principal credentials.​&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;ARO, identity, and Azure governance&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;With GA support for managed identities and workload identities, Azure Red Hat OpenShift uses short‑lived credentials and least‑privilege access to help organizations strengthen their security posture. This approach reduces reliance on long‑lived credentials and overly broad permissions, supporting enterprise security requirements while improving how identity is managed for OpenShift workloads.&lt;/P&gt;
&lt;P&gt;As an Azure-native service, Azure Red Hat OpenShift also integrates directly with &lt;A href="https://learn.microsoft.com/en-us/entra/workload-id/workload-identities-overview" target="_blank" rel="noopener"&gt;Microsoft Entra workload identities&lt;/A&gt; and Azure RBAC, strengthening your overall security and identity management posture.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Platform identity: managed identities for ARO operators&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;At the platform layer, ARO now uses multiple user-assigned managed identities rather than a single service principal with broad rights. Each identity is mapped to a specific ARO component and associated with a dedicated built-in ARO role, so permissions are scoped according to least-privilege principles and aligned with Azure RBAC best practices.&lt;/P&gt;
&lt;P&gt;You can wire up this model in several ways: create the identities and role assignments up front and reference them during deployment, or use the Azure portal “all-in-one” experience to have the identities and assignments created for you as part of cluster creation. Clusters can be deployed using the Azure portal or ARM/Bicep templates, and the native&amp;nbsp;az aro&amp;nbsp;commands (available in Azure CLI version 2.84.0 or higher) provide a similar end-to-end experience for CLI-driven environments.&lt;/P&gt;
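&lt;P&gt;As a rough sketch, the bring-your-own-identities flow starts with pre-creating the identities. All resource and identity names below are illustrative, and the exact flags for attaching each identity to its ARO component vary by CLI version, so treat this as an outline and consult the az aro create reference (Azure CLI 2.84.0 or higher) for the authoritative syntax.&lt;/P&gt;

```shell
# Illustrative outline: pre-create user-assigned managed identities
# so they can be referenced at ARO cluster creation time.
# Resource and identity names are examples, not required values.
RG=my-aro-rg
LOCATION=eastus

az group create --name "$RG" --location "$LOCATION"

# One identity for the cluster itself, plus one per ARO platform component
# (cloud controller manager, ingress, disk/file CSI drivers, and so on).
for ID in aro-cluster aro-cloud-controller-manager aro-ingress \
          aro-disk-csi-driver aro-file-csi-driver; do
  az identity create --resource-group "$RG" --name "$ID"
done

# Each identity then gets a role assignment scoped to its component's needs,
# and the identities are referenced during 'az aro create' (see the Learn
# architecture guide for the full identity-to-role mapping).
```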
&lt;P&gt;For an architectural deep dive into operators, scopes, and role assignment patterns, see&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/azure/openshift/howto-understand-managed-identities" target="_blank" rel="noopener"&gt;&lt;STRONG&gt;Understand managed identities in Azure Red Hat OpenShift&lt;/STRONG&gt;&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Application access: workload identity for Azure services&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Workload identity uses Kubernetes-native OIDC federation so pods can securely access an Azure managed identity, which remains the underlying identity governed by Microsoft Entra ID and Azure RBAC.&lt;/P&gt;
&lt;P&gt;For applications running on ARO, this capability provides&amp;nbsp;workload identity: a way for pods to obtain short-lived tokens for an Azure managed identity without storing secrets in the cluster. Using Microsoft Entra workload identities and OIDC federation, you bind a user-assigned managed identity to a Kubernetes service account; workloads using that service account automatically receive tokens for the associated identity at runtime.&lt;/P&gt;
&lt;P&gt;This enables fine-grained patterns: for example, granting a specific application read-only access to a single Key Vault, storage account, or Azure SQL database, without sharing credentials across namespaces or relying on a cluster-wide service principal. Enterprise teams can use this to connect AI and data workloads on ARO to services like Azure OpenAI, Azure SQL, or Azure Storage, giving each app just the access it needs for inference, data access, or logging while staying within standard Azure governance controls. The Learn guide,&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/azure/openshift/howto-deploy-configure-application" target="_blank" rel="noopener"&gt;&lt;STRONG&gt;Deploy and configure an application using workload identity on an Azure Red Hat OpenShift managed identity cluster&lt;/STRONG&gt;&lt;/A&gt;, walks through the workflow end-to-end.&lt;/P&gt;
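&lt;P&gt;Concretely, the service-account binding described above can be sketched with the Azure CLI and kubectl. This follows the general flow of the Learn guide, but the resource names (my-aro-rg, kv-reader, my-app, my-namespace) and the Key Vault grant are illustrative assumptions, not values from this post.&lt;/P&gt;

```shell
# Sketch: bind a pod to an Azure managed identity via workload identity.
# All names are illustrative; OIDC_ISSUER_URL is your cluster's OIDC issuer.
RG=my-aro-rg
OIDC_ISSUER_URL=https://example-oidc-issuer   # replace with your cluster's issuer

# 1. A user-assigned managed identity for the application.
az identity create --resource-group "$RG" --name kv-reader

# 2. Federate it with the cluster's OIDC issuer and a service account subject.
az identity federated-credential create \
  --identity-name kv-reader \
  --resource-group "$RG" \
  --name my-app-federation \
  --issuer "$OIDC_ISSUER_URL" \
  --subject system:serviceaccount:my-namespace:my-app \
  --audiences api://AzureADTokenExchange

# 3. Annotate the Kubernetes service account with the identity's client ID.
CLIENT_ID=$(az identity show --resource-group "$RG" --name kv-reader \
  --query clientId -o tsv)
kubectl create serviceaccount my-app --namespace my-namespace
kubectl annotate serviceaccount my-app --namespace my-namespace \
  azure.workload.identity/client-id="$CLIENT_ID"

# 4. Least-privilege grant, e.g. read-only secrets on a single Key Vault.
az role assignment create \
  --assignee "$CLIENT_ID" \
  --role "Key Vault Secrets User" \
  --scope "/subscriptions/SUB_ID/resourceGroups/$RG/providers/Microsoft.KeyVault/vaults/my-vault"
```

&lt;P&gt;Pods running under the my-app service account (with the azure.workload.identity/use label, where the mutating webhook requires it) then receive short-lived tokens for kv-reader at runtime, with no secret stored in the cluster.&lt;/P&gt;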
&lt;P&gt;&lt;STRONG&gt;Existing preview clusters and how to start&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;If you deployed ARO clusters with managed identities during the preview, no changes are required: those clusters automatically transition to GA and are fully supported for production use, with no migration or redeployment needed. You can continue to upgrade them using the standard OpenShift mechanisms, following &lt;A href="https://learn.microsoft.com/en-us/azure/openshift/howto-upgrade-aro-openshift-cluster" target="_blank" rel="noopener"&gt;the managed identity upgrade guidance in the documentation&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;Clusters built with the current service principal model continue to receive full support; however, there is not yet a migration path from service principals to managed identities. To adopt managed identities, deploy a new ARO cluster with managed identities enabled and migrate workloads to it.&lt;/P&gt;
&lt;P&gt;To get started with new clusters, begin with&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/azure/openshift/howto-understand-managed-identities" target="_blank" rel="noopener"&gt;&lt;STRONG&gt;Understand managed identities in Azure Red Hat OpenShift&lt;/STRONG&gt;&lt;/A&gt; to review concepts and considerations, then create a cluster using the Azure portal, an ARM/Bicep template, or the ARO CLI. Joint Red Hat–Microsoft demos and videos provide an end-to-end view of the experience, from deploying a managed identity-enabled cluster through configuring workload identity for applications consuming Azure services.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Resources&lt;/STRONG&gt;:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;A href="https://interact.redhat.com/share/zMI66TmGGHDqWXMdo9Lk" target="_blank" rel="noopener"&gt;Interactive Demo - Create a managed identity Azure Red Hat OpenShift cluster&lt;/A&gt;&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Concepts / architecture&lt;/STRONG&gt;&lt;BR /&gt;Understand managed identities in Azure Red Hat OpenShift&lt;BR /&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/openshift/howto-understand-managed-identities" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/en-us/azure/openshift/howto-understand-managed-identities&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Cluster creation&lt;/STRONG&gt;&lt;BR /&gt;Create an Azure Red Hat OpenShift cluster with managed identities&lt;BR /&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/openshift/howto-create-openshift-cluster" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/en-us/azure/openshift/howto-create-openshift-cluster&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Applications / workload identity&lt;/STRONG&gt;&lt;BR /&gt;Deploy and configure an application using workload identity on an Azure Red Hat OpenShift managed identity cluster&lt;BR /&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/openshift/howto-deploy-configure-application" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/en-us/azure/openshift/howto-deploy-configure-application&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Automation (ARM/Bicep)&lt;/STRONG&gt;&lt;BR /&gt;Deploy an Azure Red Hat OpenShift cluster with an ARM template or Bicep&lt;BR /&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/openshift/quickstart-openshift-arm-bicep-template" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/en-us/azure/openshift/quickstart-openshift-arm-bicep-template&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Joint story from Red Hat&lt;/STRONG&gt;&lt;BR /&gt;Managed Identity and Workload Identity support in Azure Red Hat OpenShift&lt;BR /&gt;&lt;A href="https://www.redhat.com/en/blog/general-availability-managed-identity-and-workload-identity-microsoft-azure-red-hat-openshift" target="_blank" rel="noopener"&gt;https://www.redhat.com/en/blog/general-availability-managed-identity-and-workload-identity-microsoft-azure-red-hat-openshift&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Sat, 11 Apr 2026 23:03:17 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/apps-on-azure-blog/azure-red-hat-openshift-managed-identity-and-workload-identity/ba-p/4504940</guid>
      <dc:creator>MelanieKraintz007</dc:creator>
      <dc:date>2026-04-11T23:03:17Z</dc:date>
    </item>
  </channel>
</rss>

