# Introducing Azure Container Apps Express!
Three years ago, a 15-second cold start was industry-leading. Today, developers and AI agents expect sub-second. The speed bar has moved, and the tooling needs to move with it.

After running Azure Container Apps for years, we've learned something important: for most developers, the ACA environment is an unnecessary construct. It adds provisioning time, configuration surface, and cognitive overhead — when all you really want is to run your app with scaling, networking, and operations handled for you.

At the same time, a new class of workloads has emerged. Agent-first platforms — systems where AI agents deploy endpoints on demand, spin up tool-use APIs, and tear them down when work is done — demand an even more radical focus on speed and simplicity. Every second of provisioning delay is wasted agent productivity.

Today, we're launching Azure Container Apps Express in Public Preview — the fastest, simplest way to go from a container image to an internet-reachable app on Azure, ready for many production-style workloads.

## What Is ACA Express?

ACA Express removes the infrastructure decisions. There's no environment to provision, no networking to configure, no scaling rules to write. You bring a container image; Express handles everything else. Behind the scenes, Express runs your container on pre-provisioned capacity with sensible defaults baked in — so you skip environment setup without giving up ACA's serverless model. There's more coming in this space soon — keep watching.

Here's what that means in practice:

- Instant provisioning — your app is running in seconds, not minutes
- Sub-second cold starts — fast enough for interactive UIs and on-demand agent endpoints
- Scale to and from zero — automatic, no configuration required (full scaling controls coming soon)
- Per-second billing — pay only for what you use
- Production-ready defaults — ingress, secrets, environment variables, and observability are built in

Express is purpose-built for two audiences: developers who want to ship fast (SaaS apps, APIs, web dashboards, prototypes) and agents that deploy on demand (MCP servers, tool-use endpoints, multi-step workflow APIs, human-in-the-loop UIs). If you've ever waited for an ACA environment to provision, only to realize you didn't need half of the configuration options it asked you for — Express is your answer.

## What You Can Do Today

Note: West Central US is currently the only available region. We will expand to new regions over the coming days.

Express is in Public Preview starting today. It's a deliberate early ship — there's a meaningful feature gap compared to the existing Azure Container Apps offering, and we're filling it fast. New capabilities are landing on a rapid cadence throughout the preview, and by Microsoft Build in June, Express should be close to feature-complete. For the current list of supported features, known gaps, and what's on the way, see the Express documentation. We'd rather put valuable technology in your hands early and iterate with you than wait behind closed doors for perfection.

## Who Is Express For?

| Scenario | Why Express |
| --- | --- |
| SaaS apps and APIs | Deploy and scale without infrastructure planning |
| AI app frontends | Chat UIs and copilot frontends that scale with usage spikes |
| MCP servers | Expose API endpoints for AI agents in seconds |
| Agent workflows | Spin up endpoints on demand, tear down when done |
| Prototypes and startups | Go from idea to production in minutes |
| Web dashboards | Internal tools with instant availability |

## Get Started

Note: Documentation links may not be available yet.
They will become available throughout the day.

Express is available now in Public Preview. Try it:

- Azure Container Apps Express overview — concepts, capabilities, and the current feature support matrix
- Deploy your first app with the Azure CLI — step-by-step quickstart
- New Azure Container Apps Portal — create and manage Express apps alongside your existing Container Apps resources

Have questions? Check the Azure Container Apps Express FAQ for answers to common questions about pricing, limits, regions, and the road to GA.

We're building Express in the open and we want to hear from you. Tell us what features matter most, what works, and what doesn't — reach out on the Azure Container Apps GitHub or in the comments below.
# Give Your AI Agent Eyes: Browser-Harness Meets Playwright Workspaces Remote Browsers

What happens when you hand a coding agent a real browser — not a mock, not an API wrapper, but a full Chromium instance running in the cloud? It shops for you. It scrapes for you. It navigates JavaScript-heavy SPAs that would make any REST-based scraper weep. And it does it across 10+ parallel sessions without touching your local machine.

This is the story of combining two tools that were built for different worlds — and discovering they're a perfect fit.

## The Problem

Today's coding agents — Codex, Claude Code, Copilot — are extraordinary at reading and writing code. But ask one to check if a product is in stock on an e-commerce site, and it hits a wall. Modern websites are JavaScript-rendered, authentication-gated, geolocation-aware, and hostile to simple HTTP requests.

The agent needs a real browser. Not `requests.get()`. Not a headless puppeteer script you wrote last Tuesday. A browser that renders CSS, executes JavaScript, handles cookies, and lets the agent see what a human would see.

## Enter Browser-Harness

Browser-harness is an open-source tool that gives AI agents direct control over a Chrome browser via the Chrome DevTools Protocol (CDP). It exposes a clean Python API:

```
● agent: wants to upload a file
│ ● agent-workspace/agent_helpers.py → helper missing
│ ● agent writes it agent_helpers.py
│ + custom helper
✓ file uploaded
```

One websocket to Chrome, nothing between. The agent writes what's missing during execution. The harness improves itself every run.

But there's a catch. Where does this browser run?

## The Infrastructure Gap

If the browser runs locally, you've got problems:

- Your machine is busy. Running Chrome while the agent works eats RAM and CPU.
- No parallelism. One browser per machine. Want to scrape 10 sites simultaneously? Buy 10 machines.
- No consistency. Different OS, different Chrome versions, different results.
- No isolation. Letting the agent run amok on autopilot with your local browser can be risky: it can reuse your credentials, stored cookies, and sessions.
- No observability. The agent is clicking around in a browser you can't see.

What you really want is a browser that runs somewhere else — managed, scalable, observable — and your agent just connects to it over a WebSocket.

## Enter Playwright Workspaces

Playwright Workspaces provides exactly this: remote browser endpoints on Azure. You make an HTTP request, a Chromium instance spins up in the cloud, and you get back a WebSocket URL (wss://...) to connect via CDP.

The key insight: browser-harness speaks CDP. Playwright Workspaces serves CDP. They snap together like LEGO.

Your Agent → browser-harness → CDP WebSocket → Playwright Workspaces → Cloud Chromium

No local Chrome needed. No browser installation. No display server. Just a WebSocket connection to a fully managed browser.

## The Two-Step Connection Flow

Connecting them is surprisingly simple:

Step 1: Provision a remote browser

```python
import requests  # needed for the provisioning call

from playwright_service_client import get_cdp_browsers_endpoint

SERVICE_URL = get_cdp_browsers_endpoint()
# → https://eastus.api.playwright.microsoft.com/.../browsers?os=linux&browser=chromium&playwrightVersion=cdp&shouldRedirect=false

# `token` is acquired through your auth flow (see the sample repo)
response = requests.get(SERVICE_URL, headers={"Authorization": f"Bearer {token}"}, timeout=120)
session_url = response.json()["sessionUrl"]
# → wss://browser.playwright.microsoft.com/ws?token=...
```

Step 2: Point browser-harness at it

```bash
export BU_CDP_WS="wss://browser.playwright.microsoft.com/ws?token=..."
browser-harness -c "print(page_info())"
# → {'url': 'about:blank', 'title': '', 'w': 780, 'h': 441}
```

That's it. Your agent now controls a cloud browser.
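Because the session URL is plain CDP, browser-harness isn't the only client that can attach to it. As a quick sanity check, you can point stock Playwright at the same WebSocket. A minimal sketch, assuming BU_CDP_WS is exported as in Step 2 (illustrative only; browser-harness performs the equivalent connection for you):

```python
import os

from playwright.sync_api import sync_playwright

ws_url = os.environ["BU_CDP_WS"]  # the wss:// session URL from Step 1

with sync_playwright() as p:
    # connect_over_cdp attaches to an already-running browser instead of launching one
    browser = p.chromium.connect_over_cdp(ws_url)
    page = browser.new_page()
    page.goto("https://example.com")
    print(page.title())  # confirms the cloud Chromium is rendering real pages
    browser.close()      # closing the connection ends the remote session
```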
## What This Unlocks: A Real-World Demo

We gave a coding agent this prompt:

"Go to Website1, search for gifts under ₹500 for 10-year-old kids. Must be useful, reusable (not single-use). Delivery in Bengaluru within 3 days. Must have 5 pieces available."

Here's what the agent did — autonomously, with no human intervention:

1. Provisioned a remote Chromium browser via Playwright Workspaces
2. Connected browser-harness to the cloud browser over WebSocket
3. Navigated to FirstCry.com
4. Set delivery location to Bengaluru (pincode 560001)
5. Searched for kids' gifts
6. Applied filters — price ₹0–250 and ₹250–500 via JavaScript DOM interaction
7. Browsed products, rejecting single-use items (greeting cards) in favor of reusable ones (stainless steel water bottles)
8. Checked delivery dates — rejected items with 6-day delivery, found ones with Next Day Delivery
9. Verified stock availability — confirmed ADD TO CART was active with no stock warnings
10. Took screenshots at every step for audit and debugging

Result: Found the Brand A 600 Stainless Steel Water Bottle at ₹444.69 with next-day delivery to Bengaluru. All criteria met. The entire workflow ran on a remote browser in Azure — the local machine never launched Chrome.

## The Power of Remote Endpoints

Why does running browsers remotely change everything?

1. Massive Parallelism. Need to compare prices across 10 e-commerce sites? Spin up 10 remote browsers, assign one sub-agent per site, and scrape in parallel (see the sketch after this list). Each gets its own isolated Chromium instance. No resource contention, no port conflicts.
2. Zero Local Dependencies. No Chrome installation. No chromedriver version mismatches. No --no-sandbox hacks. The browser is a managed service — you just connect to it.
3. Geographic Flexibility. Remote browsers run in Azure data centers. Need to see what a website looks like from East US? Or Southeast Asia? Pick your region. The browser's IP and geolocation are in the cloud, not on your laptop.
4. Ephemeral & Secure. Each browser session is isolated and destroyed when the WebSocket closes. No leftover cookies, no persistent state leaking between runs. Every session starts clean.
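To make the parallelism point concrete, here is a sketch of fanning out one scrape per remote browser. It assumes the sample repo's playwright_service_client helper and a bearer token in an environment variable (the variable name here is illustrative):

```python
import os
from concurrent.futures import ThreadPoolExecutor

import requests
from playwright.sync_api import sync_playwright
from playwright_service_client import get_cdp_browsers_endpoint  # from the sample repo

SITES = ["https://example.com", "https://example.org", "https://example.net"]

def provision_session(token: str) -> str:
    """Request one isolated cloud Chromium and return its wss:// URL."""
    resp = requests.get(
        get_cdp_browsers_endpoint(),
        headers={"Authorization": f"Bearer {token}"},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["sessionUrl"]

def scrape_title(url: str) -> str:
    token = os.environ["PLAYWRIGHT_SERVICE_TOKEN"]  # illustrative variable name
    ws = provision_session(token)
    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(ws)
        page = browser.new_page()
        page.goto(url)
        title = page.title()
        browser.close()  # closing the WebSocket destroys the session
    return title

# One isolated browser per site, all in parallel, none on your machine.
with ThreadPoolExecutor(max_workers=len(SITES)) as pool:
    for site, title in zip(SITES, pool.map(scrape_title, SITES)):
        print(f"{site} → {title}")
```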
## The Bigger Picture

We're at an inflection point. AI agents are moving from code generation to code execution — and execution means interacting with the real world. Browsers are the universal interface to that world. The combination of browser-harness (agent-to-browser control) and Playwright Workspaces (managed remote browsers) creates a powerful primitive: give any AI agent a browser, anywhere, on demand.

## Get Started

The full sample — including the playwright_service_client.py helper, setup prompts, and environment templates — is available here:

📦 playwright-workspaces/samples/browser-harness-webscraping

Resources:

- Playwright Workspaces Documentation
- Browser-Harness GitHub
- Create a Playwright Workspace

# Running Foundry Agent Service on Azure Container Apps

Microsoft's Customer Zero blog series gives an insider view of how Microsoft builds and operates Microsoft using our trusted, enterprise-grade agentic platform. Learn best practices from our engineering teams with real-world lessons, architectural patterns, and operational strategies for pressure-tested solutions in building, operating, and scaling AI apps and agent fleets across the organization.

## Challenge: Scaling agents to production changes the requirements

As teams move from experimenting with AI agents to running them in production, the questions they ask begin to change. Early prototypes often focus on whether an agent can reason to generate useful output. But once agents are placed into real systems where they continuously need to serve users and respond to events, new concerns quickly take center stage: reliability, scale, observability, security, and long-running operations.

A common misconception at this stage is to think of an agent as a simple chatbot wrapped around an API. In practice, an AI agent is something very different. It is a service that listens, thinks, and acts, ingesting unstructured inputs, reasoning over context, and producing outputs that may span multiple phases. Treating agents as services means teams often need more than they initially expect: dependable compute, strong security, and real-time visibility to run agents safely and effectively at scale.

When we kick off an agent loop, we provide input that informs the context it recalls for the task, the data it connects to, the tools it calls, and the reasoning steps it outlines for itself to generate an output. Agent needs are different from traditional services in hosting, scaling, identity, security, and observability; it's a product with a probabilistic nature that requires secure, auditable access to many resources at the same lightspeed performance that users expect from any software.

This isn't the first time that the software industry needed to evolve its thinking around infrastructure. When modern application architectures began shifting from monolithic apps toward microservices, existing infrastructure wasn't built with that model in mind. As systems were reconstructed into independent services, teams quickly discovered they needed new runtime architecture that properly accommodated microservice needs. The modern app era brought new levels of performance, reliability, and scalability of apps, but it also warranted that we rebuild app infrastructure with container orchestration and new operational patterns in mind.

AI agents represent a similar inflection. Infrastructure designed for request-response applications or stateless workloads wasn't built with long-running, tool-calling, AI-driven workflows in mind. As the builders of Foundry Agent Service, we were very aware that traditional architectures wouldn't hold up to the bursty agentic workflows that needed to aggregate data across sources, connect to several simultaneous tools, and reason through execution plans for the output that we needed. Rather than building new infrastructure from scratch, the choice for building on Azure Container Apps was clear. With over a million apps hosted on Azure Container Apps, it was the tried-and-true solution we needed to keep our team focused on building agent intelligence and behavior instead of the plumbing underneath.
## Solution: Building Foundry Agent Service on a resilient agent runtime foundation

Foundry Agent Service is Microsoft's fully managed platform for building, deploying, and scaling AI agents as production services. Builders start by choosing their preferred framework or immediately building an agent inside Foundry, while Foundry Agent Service handles the operational complexity required to run agents at scale.

Let's use the example of a sales agent in Foundry Agent Service. You might have a salesperson who prompts a sales agent with "Help me prepare for my upcoming meeting with customer Contoso." The agent is going to kick off several processes across data and tools to generate the best answer: Work IQ to understand Teams conversations with Contoso, Fabric IQ for current product usage and forecast trends, Foundry IQ to do an AI search over internal sales materials, and even GitHub Copilot SDK to generate and execute code that can draft PowerPoint and Word artifacts for the meeting. And this is just one agent; more than 20,000 customers rely on Foundry Agent Service.

At the core of Foundry Agent Service is a dedicated agent runtime through Azure Container Apps that explicitly meets our demands for production agents. Agent runtime through flexible cloud infrastructure allows builders to focus on making powerful agent experiences without worrying about under-the-hood compute and configurations. This runtime is built around five foundational pillars:

- Fast startup and resume. Agents are event-driven and often bursty. Responsiveness depends on the ability to start or resume execution quickly when events arrive.
- Built-in agent tool execution. Agents must securely execute tool calls like APIs, workflows, and services as part of their reasoning process, without fragile glue code or ad-hoc orchestration.
- State persistence and restore. Many agent workflows are long-running and multi-phase. The runtime must allow agents to reason, pause, and resume with safely preserved state.
- Strong isolation per agent task. As agents execute code and tools dynamically, isolation is critical to prevent data leakage and contain blast radius.
- Secure by default. Identity, access, and execution controls are enforced at the runtime layer rather than bolted on after the fact.

Together, these pillars define what it means to run AI agents as first-class production services.

## Impact: How Azure Container Apps powers agent runtime

Building and operating agent infrastructure from scratch introduces unnecessary complexity and risk. Azure Container Apps has been pressure-tested at Microsoft scale, proving to be a powerful, serverless foundation for running AI workloads, and aligns naturally with the needs of agent runtime. It provides serverless, event-driven scaling with fast startup and scale-to-zero, which is critical for agents with unpredictable execution patterns. Execution is secure by default, with built-in identity, isolation, and security boundaries enforced at the platform layer.

Azure Container Apps natively supports running MCP servers and executing full agent workflows, while Container Apps jobs enable on-demand tool execution for discrete units of work without custom orchestration. For scenarios involving AI-generated or untrusted code, dynamic sessions allow execution in isolated sandboxes, keeping blast radius contained. Azure Container Apps also supports running model inference directly within the container boundary, helping preserve data residency and reduce unnecessary data movement.
## Learnings for your agent runtime foundation

- Make infrastructure flexible with serverless architecture. AI systems move too fast to create infrastructure from scratch. With bursty, unpredictable agent workloads, sub-second startup times and serverless scaling are critical.
- Simplify heavy lifting. Developers should focus on agent behavior, tool invocation, and workflow design instead of infrastructure plumbing. Using trusted cloud infrastructure, pain points like making sure agents run in isolated sandboxes, properly applying security policy to agent IDs, and ensuring secure connections to virtual networks are already solved. When you simplify the operational overhead, you make it easier for developers to focus on meaningful innovation.
- Invest in visibility and monitoring. Strong observability enables faster iteration, safer evolution, and continuous self-correction for both humans and agents as systems adapt over time.

Want to learn more?

- Learn about building and hosting agents with Foundry Agent Service
- Discover agent runtime through Azure Container Apps
- Read about best practices for managing agents
# Agentic IIS Migration to Managed Instance on Azure App Service

## Introduction

Enterprises running ASP.NET Framework workloads on Windows Server with IIS face a familiar dilemma: modernize or stay put. The applications work, the infrastructure is stable, and nobody wants to be the person who breaks production during a cloud migration. But the cost of maintaining aging on-premises servers, patching Windows, and managing IIS keeps climbing.

Azure App Service has long been the lift-and-shift destination for these workloads. But what about applications that depend on Windows registry keys, COM components, SMTP relay, MSMQ queues, local file system access, or custom fonts? These OS-level dependencies have historically been migration blockers — forcing teams into expensive re-architecture or keeping them anchored to VMs.

Managed Instance on Azure App Service changes this equation entirely. And the IIS Migration MCP Server makes migration guided, intelligent, and safe — with AI agents that know what to ask, what to check, and what to generate at every step.

## What Is Managed Instance on Azure App Service?

Managed Instance on App Service is Azure's answer to applications that need OS-level customization beyond what standard App Service provides. It runs on the PremiumV4 (PV4) SKU with IsCustomMode=true, giving your app access to:

| Capability | What It Enables |
| --- | --- |
| Registry Adapters | Redirect Windows Registry reads to Azure Key Vault secrets — no code changes |
| Storage Adapters | Mount Azure Files, local SSD, or private VNET storage as drive letters (e.g., D:\, E:\) |
| install.ps1 Startup Script | Run PowerShell at instance startup to install Windows features (SMTP, MSMQ), register COM components, install MSI packages, deploy custom fonts |
| Custom Mode | Full access to the Windows instance for configuration beyond standard PaaS guardrails |

The key constraint: Managed Instance on App Service requires the PV4 SKU with IsCustomMode=true. No other SKU combination supports it.

## Why Managed Instance Matters for Legacy Apps

Consider a classic enterprise ASP.NET application that:

- Reads license keys from HKLM\SOFTWARE\MyApp in the Windows Registry
- Uses a COM component for PDF generation registered via regsvr32
- Sends email through a local SMTP relay
- Writes reports to D:\Reports\ on a local drive
- Uses a custom corporate font for PDF rendering

With standard App Service, you'd need to rewrite every one of these dependencies. With Managed Instance on App Service, you can:

- Map registry reads to Key Vault secrets via Registry Adapters
- Mount Azure Files as D:\ via Storage Adapters
- Enable SMTP Server via install.ps1
- Register the COM DLL via install.ps1 (regsvr32)
- Install the custom font via install.ps1

Note: when migrating web applications to Managed Instance on Azure App Service, zero application code changes are required in the majority of cases — but depending on your specific web app, some code changes may be necessary.

## Microsoft Learn Resources

- Managed Instance on App Service Overview
- Azure App Service Documentation
- App Service Migration Assistant Tool
- Migrate to Azure App Service
- Azure App Service Plans Overview
- PremiumV4 Pricing Tier
- Azure Key Vault
- Azure Files
- AppCat (.NET) — Azure Migrate Application and Code Assessment

## Why Agentic Migration? The Case for AI-Guided IIS Migration

### The Problem with Traditional Migration

Microsoft provides excellent PowerShell scripts for IIS migration — Get-SiteReadiness.ps1, Get-SitePackage.ps1, Generate-MigrationSettings.ps1, and Invoke-SiteMigration.ps1. They're free, well-tested, and reliable. So why wrap them in an AI-powered system?
Because the scripts are powerful but not intelligent. They execute what you tell them to. They don't tell you what to do.

Here's what a traditional migration looks like:

1. Run readiness checks — get a wall of JSON with cryptic check IDs like ContentSizeCheck, ConfigErrorCheck, GACCheck
2. Manually interpret 15+ readiness checks per site across dozens of sites
3. Decide whether each site needs Managed Instance or standard App Service (how?)
4. Figure out which dependencies need registry adapters vs. storage adapters vs. install.ps1 (the "Managed Instance provisioning split")
5. Write the install.ps1 script by hand for each combination of OS features
6. Author ARM templates for adapter configurations (Key Vault references, storage mount specs, RBAC assignments)
7. Wire together PackageResults.json → MigrationSettings.json with correct Managed Instance fields (Tier=PremiumV4, IsCustomMode=true)
8. Hope you didn't misconfigure anything before deploying to Azure

Even experienced Azure engineers find this time-consuming, error-prone, and tedious — especially across a fleet of 20, 50, or 100+ IIS sites.

### What Agentic Migration Changes

The IIS Migration MCP Server introduces an AI orchestration layer that transforms this manual grind into a guided conversation:

| Traditional Approach | Agentic Approach |
| --- | --- |
| Read raw JSON output from scripts | AI summarizes readiness as tables with plain-English descriptions |
| Memorize 15 check types and their severity | AI enriches each check with title, description, recommendation, and documentation links |
| Manually decide Managed Instance vs App Service | recommend_target analyzes all signals and recommends with confidence + reasoning |
| Write install.ps1 from scratch | generate_install_script builds it from detected features |
| Author ARM templates manually | generate_adapter_arm_template generates full templates with RBAC guidance |
| Wire JSON artifacts between phases by hand | Agents pass readiness_results_path → package_results_path → migration_settings_path automatically |
| Pray you set PV4 + IsCustomMode correctly | Enforced automatically — every tool validates Managed Instance constraints |
| Deploy and find out what broke | confirm_migration presents a full cost/resource summary before touching Azure |

The core value proposition: the AI knows the Managed Instance provisioning split. It knows that registry access needs an ARM template with Key Vault-backed adapters, while SMTP needs an install.ps1 section enabling the Windows SMTP Server feature. You don't need to know this. The system detects it from your IIS configuration and AppCat analysis, then generates exactly the right artifacts.

### Human-in-the-Loop Safety

Agentic doesn't mean autonomous. The system has explicit gates:

- Phase 1 → Phase 2: "Do you want to assess these sites, or skip to packaging?"
- Phase 3: "Here's my recommendation — Managed Instance for Site A (COM + Registry), standard for Site B. Agree?"
- Phase 4: "Review MigrationSettings.json before proceeding"
- Phase 5: "This will create billable Azure resources. Type 'yes' to confirm"

The AI accelerates the workflow; the human retains control over every decision.
## Quick Start

Clone and set up the MCP server:

```bash
git clone https://github.com/gsethdev/agenticmigration.git
cd iis-migration-mcp
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt

# Download Microsoft's migration scripts (NOT included in this repo)
# From: https://appmigration.microsoft.com/api/download/psscripts/AppServiceMigrationScripts.zip
# Unzip to C:\MigrationScripts (or your preferred path)

# Start using in VS Code with Copilot
# 1. Copy .vscode/mcp.json.example → .vscode/mcp.json
# 2. Open folder in VS Code
# 3. In Copilot Chat: "Configure scripts path to C:\MigrationScripts"
# 4. Then: @iis-migrate "Discover my IIS sites"
```

The server also works with any MCP-compatible client — Claude Desktop, Cursor, Copilot CLI, or custom integrations — via stdio transport.

## Architecture: How the MCP Server Works

The system is built on the Model Context Protocol (MCP), an open protocol that lets AI assistants like GitHub Copilot, Claude, or Cursor call external tools through a standardized interface.

```
┌──────────────────────────────────────────────────────────────────┐
│ VS Code + Copilot Chat                                           │
│   @iis-migrate orchestrator agent                                │
│   ├── iis-discover (Phase 1)                                     │
│   ├── iis-assess (Phase 2)                                       │
│   ├── iis-recommend (Phase 3)                                    │
│   ├── iis-deploy-plan (Phase 4)                                  │
│   └── iis-execute (Phase 5)                                      │
└─────────────┬────────────────────────────────────────────────────┘
              │ stdio JSON-RPC (MCP Transport)
              ▼
┌──────────────────────────────────────────────────────────────────┐
│ FastMCP Server (server.py)                                       │
│   13 Python Tool Modules (tools/*.py)                            │
│   └── ps_runner.py (Python → PowerShell bridge)                  │
│       └── Downloaded PowerShell Scripts (user-configured)        │
│           ├── Local IIS (discovery, packaging)                   │
│           └── Azure ARM API (deployment)                         │
└──────────────────────────────────────────────────────────────────┘
```

The server exposes 13 MCP tools organized across 5 phases, orchestrated by 6 Copilot agents (1 orchestrator + 5 specialist subagents).

Important: The PowerShell migration scripts are not included in this repository. Users must download them from GitHub and configure the path using the configure_scripts_path tool. This ensures you always use the latest version of Microsoft's scripts, avoiding version mismatch issues.

## The 13 MCP Tools: Complete Reference

### Phase 0 — Setup

#### configure_scripts_path

Purpose: Point the server to Microsoft's downloaded migration PowerShell scripts. Before any migration work, you need to download the scripts from GitHub, unzip them, and tell the server where they are.

"Configure scripts path to C:\MigrationScripts"

### Phase 1 — Discovery

#### 1. discover_iis_sites

Purpose: Scan the local IIS server and run readiness checks on every web site. This is the entry point for every migration.
It calls Get-SiteReadiness.ps1 under the hood, which:

- Enumerates all IIS web sites, application pools, bindings, and virtual directories
- Runs 15 readiness checks per site (config errors, HTTPS bindings, non-HTTP protocols, TCP ports, location tags, app pool settings, app pool identity, virtual directories, content size, global modules, ISAPI filters, authentication, framework version, connection strings, and more)
- Detects source code artifacts (.sln, .csproj, .cs, .vb) near site physical paths

Output: ReadinessResults.json with per-site status:

| Status | Meaning |
| --- | --- |
| READY | No issues detected — clear for migration |
| READY_WITH_WARNINGS | Minor issues that won't block migration |
| READY_WITH_ISSUES | Non-fatal issues that need attention |
| BLOCKED | Fatal issues (e.g., content > 2GB) — cannot migrate as-is |

Requires: Administrator privileges, IIS installed.

#### 2. choose_assessment_mode

Purpose: Route each discovered site into the appropriate next step. After discovery, you decide the path for each site:

- assess_all: Run detailed assessment on all non-blocked sites
- package_and_migrate: Skip assessment, proceed directly to packaging (for sites you already know well)

The tool classifies each site into one of five actions:

- assess_config_only — IIS/web.config analysis
- assess_config_and_source — Config + AppCat source code analysis (when source is detected)
- package — Skip to packaging
- blocked — Fatal errors, cannot proceed
- skip — User chose to exclude

### Phase 2 — Assessment

#### 3. assess_site_readiness

Purpose: Get a detailed, human-readable readiness assessment for a specific site. Takes the raw readiness data from Phase 1 and enriches each check with:

- Title: Plain-English name (e.g., "Global Assembly Cache (GAC) Dependencies")
- Description: What the check found and why it matters
- Recommendation: Specific guidance on how to resolve the issue
- Category: Grouping (Configuration, Security, Compatibility)
- Documentation Link: Microsoft Learn URL for further reading

This enrichment comes from WebAppCheckResources.resx, an XML resource file that maps check IDs to detailed metadata. Without this tool, you'd see GACCheck: FAIL — with it, you see the full context.

Output: Overall status, enriched failed/warning checks, framework version, pipeline mode, binding details.

#### 4. assess_source_code

Purpose: Analyze an Azure Migrate application and code assessment for .NET (AppCat) JSON report to identify Managed Instance-relevant source code dependencies. If your application has source code and you've run the assessment tool against it, this tool parses the results and maps findings to migration actions:

| Dependency Detected | Migration Action |
| --- | --- |
| Windows Registry access | Registry Adapter (ARM template) |
| Local file system I/O / hardcoded paths | Storage Adapter (ARM template) |
| SMTP usage | install.ps1 (SMTP Server feature) |
| COM Interop | install.ps1 (regsvr32/RegAsm) |
| Global Assembly Cache (GAC) | install.ps1 (GAC install) |
| Message Queuing (MSMQ) | install.ps1 (MSMQ feature) |
| Certificate access | Key Vault integration |

The tool matches rules from the assessment output against known Managed Instance-relevant patterns. For a complete list of rules and categories, see Interpret the analysis results.

Output: Issues categorized as mandatory/optional/potential, plus install_script_features and adapter_features lists that feed directly into Phase 3 tools.
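The agents summarize ReadinessResults.json for you, but nothing stops you from eyeballing it directly. A hypothetical sketch; the field names ("sites", "status", "checks") are assumptions for illustration, not a documented schema:

```python
import json
from collections import Counter
from pathlib import Path

results = json.loads(Path("ReadinessResults.json").read_text())

# Tally how many sites land in each readiness bucket
tally = Counter(site["status"] for site in results["sites"])
for status in ("READY", "READY_WITH_WARNINGS", "READY_WITH_ISSUES", "BLOCKED"):
    print(f"{status:22} {tally.get(status, 0)}")

# List blocked sites so you can triage them (e.g., content > 2 GB) before packaging
for site in results["sites"]:
    if site["status"] == "BLOCKED":
        failed = [c["id"] for c in site.get("checks", []) if c.get("result") == "FAIL"]
        print(f"BLOCKED: {site['name']} ({', '.join(failed)})")
```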
### Phase 3 — Recommendation & Provisioning

#### 5. suggest_migration_approach

Purpose: Recommend the right migration tool/approach for the scenario. This is a routing tool that considers:

- Source code available? → Recommend the App Modernization MCP server for code-level changes
- No source code? → Recommend this IIS Migration MCP (lift-and-shift)
- OS customization needed? → Highlight Managed Instance on App Service as the target

#### 6. recommend_target

Purpose: Recommend the Azure deployment target for each site based on all assessment data. This is the intelligence center of the system. It analyzes config assessments and source code findings to recommend:

| Target | When Recommended | SKU |
| --- | --- | --- |
| MI_AppService | Registry, COM, MSMQ, SMTP, local file I/O, GAC, or Windows Service dependencies detected | PremiumV4 (PV4) |
| AppService | Standard web app, no OS-level dependencies | PremiumV2 (PV2) |
| ContainerApps | Microservices architecture or container-first preference | N/A |

Each recommendation comes with:

- Confidence: high or medium
- Reasoning: Full explanation of why this target was chosen
- Managed Instance reasons: Specific dependencies that require Managed Instance
- Blockers: Issues that prevent migration entirely
- install_script_features: What the install.ps1 needs to enable
- adapter_features: What the ARM template needs to configure
- Provisioning guidance: Step-by-step instructions for what to do next

#### 7. generate_install_script

Purpose: Generate an install.ps1 PowerShell script for OS-level feature enablement on Managed Instance. This handles the OS-level side of the Managed Instance provisioning split. It generates a startup script that includes sections for:

| Feature | What the Script Does |
| --- | --- |
| SMTP | Install-WindowsFeature SMTP-Server, configure smart host relay |
| MSMQ | Install MSMQ, create application queues |
| COM/MSI | Run msiexec for MSI installers, regsvr32/RegAsm for COM registration |
| Crystal Reports | Install SAP Crystal Reports runtime MSI |
| Custom Fonts | Copy .ttf/.otf to C:\Windows\Fonts, register in registry |

The script can auto-detect needed features from config and source assessments, or you can specify them manually.

#### 8. generate_adapter_arm_template

Purpose: Generate an ARM template for Managed Instance registry and storage adapters. This handles the platform-level side of the Managed Instance provisioning split. It generates a deployable ARM template that configures:

Registry Adapters (Key Vault-backed):

- Map Windows Registry paths (e.g., HKLM\SOFTWARE\MyApp\LicenseKey) to Key Vault secrets
- Your application reads the registry as before; Managed Instance redirects the read to Key Vault transparently

Storage Adapters (three types):

| Type | Description | Credentials |
| --- | --- | --- |
| AzureFiles | Mount Azure Files SMB share as a drive letter | Storage account key in Key Vault |
| Custom | Mount storage over private endpoint via VNET | Requires VNET integration |
| LocalStorage | Allocate local SSD on the Managed Instance as a drive letter | None needed |

The template also includes:

- Managed Identity configuration
- RBAC role assignments guidance (Key Vault Secrets User, Storage File Data SMB Share Contributor, etc.)
- Deployment CLI commands ready to copy-paste

### Phase 4 — Deployment Planning & Packaging

#### 9. plan_deployment

Purpose: Plan the Azure App Service deployment — plans, SKUs, site assignments. Collects your Azure details (subscription, resource group, region) and creates a validated deployment plan:

- Assigns sites to App Service Plans
- Enforces PV4 + IsCustomMode=true for Managed Instance — won't let you accidentally use the wrong SKU
- Supports single_plan (all sites on one plan) or multi_plan (separate plans)
- Optionally queries Azure for existing Managed Instance plans you can reuse

#### 10. package_site

Purpose: Package IIS site content into ZIP files for deployment. Calls Get-SitePackage.ps1 to:

- Compress site binaries + web.config into deployment-ready ZIPs
- Optionally inject install.ps1 into the package (so it deploys alongside the app)
- Handle sites with non-fatal issues (configurable)

Size limit: 2 GB per site (enforced by System.IO.Compression).
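If you want to know up front whether a site will trip that 2 GB limit, a quick walk over its physical path tells you before packaging does. A minimal sketch; the path is an example:

```python
from pathlib import Path

LIMIT = 2 * 1024**3  # the 2 GB packaging limit mentioned above

def content_size(site_root: str) -> int:
    """Total bytes under the site's physical path."""
    return sum(f.stat().st_size for f in Path(site_root).rglob("*") if f.is_file())

size = content_size(r"C:\inetpub\wwwroot\MyLegacyApp")  # example path
verdict = "OK" if size < LIMIT else "exceeds 2 GB, will be BLOCKED"
print(f"{size / 1024**2:,.0f} MB: {verdict}")
```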
#### 11. generate_migration_settings

Purpose: Create the MigrationSettings.json deployment configuration. This is the final configuration artifact. It calls Generate-MigrationSettings.ps1 and then post-processes the output to inject Managed Instance-specific fields.

Important: The Managed Instance on App Service Plan is not automatically created by the migration tools. You must pre-create the Managed Instance on App Service Plan (PV4 SKU with IsCustomMode=true) in the Azure portal or via CLI before generating migration settings. When running generate_migration_settings, provide the name of your existing Managed Instance plan so the settings file references it correctly.

```json
{
  "AppServicePlan": "mi-plan-eastus",
  "Tier": "PremiumV4",
  "IsCustomMode": true,
  "InstallScriptPath": "install.ps1",
  "Region": "eastus",
  "Sites": [
    {
      "IISSiteName": "MyLegacyApp",
      "AzureSiteName": "mylegacyapp-azure",
      "SitePackagePath": "packagedsites/MyLegacyApp_Content.zip"
    }
  ]
}
```

### Phase 5 — Execution

#### 12. confirm_migration

Purpose: Present a full migration summary and require explicit human confirmation. Before touching Azure, this tool displays:

- Total plans and sites to be created
- SKU and pricing tier per plan
- Whether Managed Instance is configured
- Cost warning for PV4 pricing
- Resource group, region, and subscription details

Nothing proceeds until the user explicitly confirms.

#### 13. migrate_sites

Purpose: Deploy everything to Azure App Service. This creates billable resources. Calls Invoke-SiteMigration.ps1, which:

- Sets Azure subscription context
- Creates/validates resource groups
- Creates App Service Plans (PV4 with IsCustomMode for Managed Instance)
- Creates Web Apps
- Configures .NET version, 32-bit mode, pipeline mode from the original IIS settings
- Sets up virtual directories and applications
- Disables basic authentication (FTP + SCM) for security
- Deploys ZIP packages via Azure REST API

Output: MigrationResults.json with per-site Azure URLs, Resource IDs, and deployment status.

## The 6 Copilot Agents

The MCP tools are orchestrated by a team of specialized Copilot agents — each responsible for a specific phase of the migration lifecycle.

### @iis-migrate — The Orchestrator

The root agent that guides the entire migration. It:

- Tracks progress across all 5 phases using a todo list
- Delegates work to specialist subagents
- Gates between phases — asks before transitioning
- Enforces the Managed Instance constraint (PV4 + IsCustomMode) at every decision point
- Never skips the Phase 5 confirmation gate

Usage: Open Copilot Chat and type @iis-migrate I want to migrate my IIS applications to Azure

### iis-discover — Discovery Specialist

Handles Phase 1. Runs discover_iis_sites, presents a summary table of all sites with their readiness status, and asks whether to assess or skip to packaging. Returns readiness_results_path and per-site routing plans.

### iis-assess — Assessment Specialist

Handles Phase 2. Runs assess_site_readiness for every site, and assess_source_code when AppCat results are available. Merges findings, highlights Managed Instance-relevant issues, and produces the adapter/install features lists that drive Phase 3.

### iis-recommend — Recommendation Specialist

Handles Phase 3. Runs recommend_target for each site, then conditionally generates install.ps1 and ARM adapter templates. Presents all recommendations with confidence levels and reasoning, and allows you to edit generated artifacts.
### iis-deploy-plan — Deployment Planning Specialist

Handles Phase 4. Collects Azure details, runs plan_deployment, package_site, and generate_migration_settings. Validates Managed Instance configuration, allows review and editing of MigrationSettings.json. Does not execute migration.

### iis-execute — Execution Specialist

Handles Phase 5 only. Runs confirm_migration to present the final summary, then only proceeds with migrate_sites after receiving explicit "yes" confirmation. Reports results with Azure URLs and deployment status.

## The Managed Instance Provisioning Split: A Critical Concept

One of the most important ideas Managed Instance introduces is the provisioning split — the division of OS dependencies into two categories that are configured through different mechanisms:

```
┌──────────────────────────────────────────────────────────────┐
│            MANAGED INSTANCE PROVISIONING SPLIT               │
├─────────────────────────────┬────────────────────────────────┤
│ ARM Template                │ install.ps1                    │
│ (Platform-Level)            │ (OS-Level)                     │
├─────────────────────────────┼────────────────────────────────┤
│ Registry Adapters           │ COM/MSI Registration           │
│   → Key Vault secrets       │   → regsvr32, RegAsm, msiexec  │
│                             │                                │
│ Storage Mounts              │ SMTP Server Feature            │
│   → Azure Files             │   → Install-WindowsFeature     │
│   → Local SSD               │                                │
│   → VNET private storage    │ MSMQ                           │
│                             │   → Message queue setup        │
│                             │                                │
│                             │ Crystal Reports Runtime        │
│                             │   → SAP MSI installer          │
│                             │                                │
│                             │ Custom Fonts                   │
│                             │   → Copy to C:\Windows\Fonts   │
└─────────────────────────────┴────────────────────────────────┘
```

The MCP server handles this split automatically:

- assess_source_code detects which dependencies fall into which category
- recommend_target reports both adapter_features and install_script_features
- generate_adapter_arm_template builds the ARM template for platform features
- generate_install_script builds the PowerShell startup script for OS features

You don't need to remember which goes where — the system decides and generates the right artifacts.

## End-to-End Walkthrough: From Discovery to Running on Managed Instance

Here's what a complete migration conversation looks like:

You: "@iis-migrate I want to migrate my IIS applications to Azure"

Phase 1 — Discovery: Agent runs discover_iis_sites, presents a table:

| Site | Status | Framework | Source Code? |
| --- | --- | --- | --- |
| HRPortal | READY_WITH_ISSUES | v4.8 | Yes (.sln found) |
| PayrollAPI | READY | v4.8 | No |
| IntranetCMS | BLOCKED (>2GB) | v4.7.2 | No |

Phase 2 — Assessment: Agent runs assess_site_readiness for HRPortal — finds GACCheck and RegistryCheck failures. Runs assess_source_code using AppCat report — confirms COM interop, registry access, and SMTP usage.

Phase 3 — Recommendation: Agent runs recommend_target:

- HRPortal → MI_AppService (high confidence) — COM, registry, SMTP dependencies
- PayrollAPI → AppService (high confidence) — no OS dependencies

Generates install.ps1 for HRPortal (SMTP + COM sections). Generates ARM template with registry adapter (Key Vault-backed) for HRPortal.

Phase 4 — Deployment Planning: Agent collects subscription/RG/region, validates PV4 availability. Packages both sites. Generates MigrationSettings.json with two plans:

- mi-plan-hrportal (PremiumV4, IsCustomMode=true) — HRPortal
- std-plan-payrollapi (PremiumV2) — PayrollAPI

Phase 5 — Execution: Agent shows full summary with cost projection. You type "yes". Sites deploy. You get Azure URLs within minutes.
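The tooling enforces the PV4 + IsCustomMode pairing for you, but a last sanity check over the generated settings file is cheap insurance before typing "yes". A minimal sketch, using the field names from the MigrationSettings.json example shown earlier:

```python
import json
from pathlib import Path

settings = json.loads(Path("MigrationSettings.json").read_text())

# The one rule Managed Instance never bends: PremiumV4 with IsCustomMode=true
if settings.get("IsCustomMode"):
    assert settings.get("Tier") == "PremiumV4", "Managed Instance requires the PV4 SKU"

# Confirm every referenced site package actually exists on disk
for site in settings.get("Sites", []):
    package = Path(site["SitePackagePath"])
    assert package.exists(), f"missing package: {package}"
    print(f"{site['IISSiteName']} → {site['AzureSiteName']} ({package.name})")

print(f"Plan: {settings['AppServicePlan']} | Tier: {settings['Tier']} | Region: {settings['Region']}")
```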
## Prerequisites & Setup

| Requirement | Purpose |
| --- | --- |
| Windows Server with IIS | Source server for discovery and packaging |
| PowerShell 5.1 | Runs migration scripts (ships with Windows) |
| Python 3.10+ | MCP server runtime |
| Administrator privileges | Required for IIS discovery, packaging, and migration |
| Azure subscription | Target for deployment (execution phase only) |
| Azure PowerShell (Az module) | Deploy to Azure (execution phase only) |
| Migration Scripts ZIP | Microsoft's PowerShell migration scripts |
| AppCat CLI | Source code analysis (optional) |
| FastMCP (mcp[cli]>=1.0.0) | MCP server framework |

## Data Flow & Artifacts

Every phase produces JSON artifacts that chain into the next phase:

```
Phase 1: discover_iis_sites ──────→ ReadinessResults.json
                                          │
Phase 2: assess_site_readiness ◄──────────┘
         assess_source_code ──────→ Assessment JSONs
                                          │
Phase 3: recommend_target ◄───────────────┘
         generate_install_script ──→ install.ps1
         generate_adapter_arm ─────→ mi-adapters-template.json
                                          │
Phase 4: package_site ─────────────→ PackageResults.json + site ZIPs
         generate_migration_settings → MigrationSettings.json
                                          │
Phase 5: confirm_migration ◄──────────────┘
         migrate_sites ────────────→ MigrationResults.json
                                          │
                                          ▼
                       Apps live on Azure *.azurewebsites.net
```

Each artifact is inspectable, editable, and auditable — providing a complete record of what was assessed, recommended, and deployed.

## Error Handling

The MCP server classifies errors into actionable categories:

| Error | Cause | Resolution |
| --- | --- | --- |
| ELEVATION_REQUIRED | Not running as Administrator | Restart VS Code / terminal as Admin |
| IIS_NOT_FOUND | IIS or WebAdministration module missing | Install IIS role + WebAdministration |
| AZURE_NOT_AUTHENTICATED | Not logged into Azure PowerShell | Run Connect-AzAccount |
| SCRIPT_NOT_FOUND | Migration scripts path not configured | Run configure_scripts_path |
| SCRIPT_TIMEOUT | PowerShell script exceeded time limit | Check IIS server responsiveness |
| OUTPUT_NOT_FOUND | Expected JSON output wasn't created | Verify script execution succeeded |

## Conclusion

The IIS Migration MCP Server turns what used to be a multi-week, expert-driven project into a guided conversation. It combines Microsoft's battle-tested migration PowerShell scripts with AI orchestration that understands the nuances of Managed Instance on App Service — the provisioning split, the PV4 constraint, the adapter configurations, and the OS-level customizations.

Whether you're migrating 1 site or 10, agentic migration reduces risk, eliminates guesswork, and produces auditable artifacts at every step. The human stays in control; the AI handles the complexity.

Get started: Download the migration scripts, set up the MCP server, and ask @iis-migrate to discover your IIS sites. The agents will take it from there.

This project is compatible with any MCP-enabled client: VS Code GitHub Copilot, Claude Desktop, Cursor, and more. The intelligence travels with the server, not the client.
# Explaining what GitHub Copilot Modernization can (and cannot do)

In the last post, we looked at the workflow: assess, plan, execute. You get reports you can review and the agent makes changes you can inspect. If you don't know, GitHub Copilot Modernization is the new agentic tool that supports you in modernizing older applications. Could it help with that old .NET Framework 4.8 app, even that forgotten VB.NET script?

You're probably not modernizing one small app. It is probably a handful of projects, each with its own stack of blockers. Different frameworks, different databases, different dependencies frozen in time because nobody wants to touch them.

GitHub Copilot Modernization handles two big categories: upgrading .NET projects to newer versions and migrating .NET apps to Azure. But what does that look like?

## Upgrading .NET Projects

Let's say you've got an ASP.NET app running on .NET Framework 4.8, or a web API stuck on .NET Core 3.1. Unfortunately, getting it to .NET 9 or 10 isn't just updating a target framework property. Here's what the upgrade workflow handles in Visual Studio:

Assessment first. The agent examines your project structure, dependencies, and code patterns. It generates an Assessment Report, which shows both the app information used to create the plan and what the agent needs to do and update.

Then planning. Once you approve the assessment, it moves to planning. Here you get upgrade strategies, refactoring approaches, dependency upgrade paths, and risk mitigations documented in a plan.md file at .appmod/.migration. You can check and edit that Markdown before moving forward, or ask in the Copilot Chat window to change it.

```markdown
# .NET 10.0 Upgrade Plan

## Execution Steps

Execute steps below sequentially one by one in the order they are listed.

1. Validate that a .NET 10.0 SDK required for this upgrade is installed on the machine and if not, help to get it installed.
2. Ensure that the SDK version specified in global.json files is compatible with the .NET 10.0 upgrade.
3. Upgrade src\eShopLite.StoreFx\eShopLite.StoreFx.csproj

## Settings

This section contains settings and data used by execution steps.

### Excluded projects

No projects are excluded from this upgrade.

### Aggregate NuGet packages modifications across all projects

NuGet packages used across all selected projects or their dependencies that need version update in projects that reference them
```

Then execution. After you approve the plan, the agent breaks it into discrete tasks in a tasks.md file. Each task gets validation criteria. As it works, it updates the file with checkboxes and completion percentages so you can track progress. It makes code changes, verifies builds, runs tests. If it hits a problem, it tries to identify the cause and apply a fix. Go to the GitHub Copilot Chat window and type:

The plan and progress tracker look good to me. Go ahead with the migration.

It usually creates Git commits for each portion so you can review what changed or roll back if you need to. In case you don't need the Git commits for the change, you can ask the agent at the start not to commit anything.

The agent primarily focuses on ASP.NET, ASP.NET Core, Blazor, Razor Pages, MVC, and Web API. It can also handle Azure Functions, WPF, Windows Forms, console apps, class libraries, and test projects.
## What It Handles Well (and What It Doesn't)

The agent is good at code-level transformations: updating TargetFramework in .csproj files, upgrading NuGet packages, replacing deprecated APIs with their modern equivalents, fixing breaking changes like removed BinaryFormatter methods, running builds, and validating test suites. It can handle repetitive work across multiple projects in a solution without you needing to track every dependency manually.

It's also solid at applying predefined Azure migration patterns: swapping plaintext credentials for managed identity, replacing file I/O with Azure Blob Storage calls, moving authentication from on-prem Active Directory to Microsoft Entra ID. These are structured transformations with clear before-and-after code patterns.
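For a sense of what "swap plaintext credentials for managed identity" looks like, here is the shape of the pattern. The agent emits C# for .NET projects; this sketch uses Python with the real azure-identity and azure-storage-blob packages purely for illustration, and the account URL is a placeholder:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

# Before: a connection string with an embedded account key sat in config.
# After: DefaultAzureCredential picks up the app's managed identity at runtime,
# so no secret ships with the code at all.
credential = DefaultAzureCredential()
service = BlobServiceClient(
    account_url="https://mystorageaccount.blob.core.windows.net",  # placeholder
    credential=credential,
)

container = service.get_container_client("reports")
for blob in container.list_blobs():
    print(blob.name)
```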
But here's where you may need to pay closer attention:

Language and framework coverage: It works mainly with C# projects. If your codebase includes complex Entity Framework migrations that rely on hand-tuned database scripts, the agent won't rewrite those for you. It also won't handle third-party UI framework patterns that don't map cleanly to ASP.NET Core conventions, or patterns with breaking changes between .NET Framework and later .NET versions. Web Forms migration is underway.

Configuration and infrastructure: The agent doesn't migrate IIS-specific web.config settings that don't have direct equivalents in Kestrel or ASP.NET Core. It won't automatically set up a CI/CD pipeline or any modernization features; for that, you need to implement it with Copilot's help. If you've got frontend frameworks bundled with ASP.NET (like an older Angular app served through MVC), you'll need to separate and upgrade that layer yourself.

Learning and memory: The agent uses your code as context during the session, and if you correct a fix or update the plan, it tries to apply that learning within the same session. But those corrections don't persist across future upgrades. You can encode internal standards using custom skills, but that requires deliberate setup.

Offline and deployment: There's no offline mode. The agent needs connectivity to run. And while it can help prepare your app for Azure deployment, it doesn't manage the actual infrastructure provisioning or ongoing operations; that's still on you.

Guarantees: The suggestions aren't guaranteed to follow best practices. The agent won't always pick the best migration path. It won't catch every edge case. You're reviewing the work; pay attention to the results before putting it into production.

What it does handle: the tedious parts. Reading dependency graphs. Finding all the places a deprecated API is used. Updating project files. Writing boilerplate for managed identity. Fixing compilation errors that follow a predictable pattern.

## Where to Start

If you've been staring at a modernization backlog, pick one project. See what it comes up with! You don't have to commit to upgrading your entire portfolio. Try it on one project and see if it saves you time.

Modernization at scale still happens application by application, repo by repo, and decision by decision. GitHub Copilot Modernization just makes each one a little less painful. Experiment with it!

# Why Does Azure App Service Return HTTP 404?

When an application deployed to Azure App Service suddenly starts returning HTTP 404 – Not Found, it can be confusing — especially when:

- The deployment completed successfully
- The App Service shows as Running
- No obvious errors appear in the portal

This behaviour is more common than it appears and is often linked to routing, configuration, or platform behaviour. In this article, I'll walk through real-world reasons why Azure App Service can return HTTP 404 errors, based on issues seen in practice. The goal is to help you systematically isolate the root cause — whether it's application-level, configuration-related, or platform-specific.

## What Does HTTP 404 Mean in Azure App Service?

An HTTP 404 response from Azure App Service means: the incoming request successfully reached Azure App Service, but neither the platform nor the application could locate the requested resource.

This distinction is important. Unlike connectivity or DNS issues, a 404 confirms that:

- DNS resolution worked
- The request hit the App Service front end
- The failure happened after request routing

## 1. Incorrect Application URL or Route

This is the most common cause of 404 errors.

Typical scenarios:

- Accessing the root URL (https://<app>.azurewebsites.net) for a Web API that exposes only API routes
- Missing route prefixes such as /api or /v1, or controller/action name segments
- Case sensitivity mismatches on Linux App Service

Example: https://myapp.azurewebsites.net returns 404, but https://myapp.azurewebsites.net/weatherforecast works as expected.

✅ Tip: Always validate your routing locally and confirm the exact same path is being accessed in Azure.

## 2. Application Appears Running, but Startup Failed Partially

It is possible for an App Service to show Running even when the application failed to initialize fully.

Common causes:

- Missing or incorrect environment variables
- Invalid connection strings
- Exceptions thrown during Program.cs / Startup.cs
- Dependency initialization failures at startup

In such scenarios, the app may start the host process but fail to register routes — resulting in 404 responses instead of 500 errors.

✅ Where to check:

- Application logs
- Deployment logs
- Kudu → LogFiles

## 3. Static Files Not Found or Not Being Served

For applications hosting static content (HTML, JavaScript, images, JSON files), a 404 can occur even when files exist.

Common reasons:

- Files not deployed to the expected directory (e.g., wwwroot, /home/site/wwwroot)
- Missing or unsupported MIME type configuration (commonly seen with .json)
- Static file middleware not enabled in ASP.NET Core applications

✅ Quick validation: Deploy a simple test.html to wwwroot and try accessing it directly.

## 4. Windows vs Linux App Service Differences

Behaviour can differ significantly between Windows App Service and Linux App Service.

Common pitfalls on Linux:

- Case-sensitive file paths (Index.html ≠ index.html)
- Missing or incorrect startup command
- Differences in request routing handled by Nginx

✅ Tip: If the app works on Windows App Service but fails on Linux, always recheck file casing and startup configuration first.

## 5. Custom Domain and Networking Configuration Issues

In some cases, requests reach the App Service but fail due to domain or network constraints.

Possible causes:

- Incorrect custom domain binding

✅ Isolation step: Always test using the default *.azurewebsites.net hostname to determine whether the issue is domain-specific.

## 6. Health Checks or Monitoring Probes Targeting Invalid Paths

Seeing periodic 404 entries in logs — every few minutes — is often a sign of misconfigured probes.

Typical scenarios:

- App Service Health Check configured with a non-existent endpoint
- External monitoring tools probing /health or paths that do not exist

✅ Fix: Ensure the health check path maps to a valid endpoint implemented by the application.
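A few lines of scripting show you exactly what a probe (or a user) sees, which makes the checklist below faster to work through. A minimal sketch; the hostname and candidate paths are illustrative:

```python
import requests

# Always isolate with the default hostname first; swap in the custom domain to compare.
BASE = "https://myapp.azurewebsites.net"
PATHS = ["/", "/health", "/api/health", "/hostingstart.html", "/weatherforecast"]

for path in PATHS:
    # allow_redirects=False keeps auth redirects visible instead of following them
    resp = requests.get(BASE + path, timeout=30, allow_redirects=False)
    print(f"{resp.status_code}  {path}")
```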
## 7. Missing or Corrupted Deployment Artifacts

Even when deployments report success, application files may not be where the runtime expects them.

Commonly observed with:

- Zip deployments
- WEBSITE_RUN_FROM_PACKAGE misconfigurations
- Partial or interrupted deployments

✅ Verify using Kudu: Browse /home/site/wwwroot and check files are present.

## Quick Troubleshooting Checklist

If your Azure App Service is returning HTTP 404:

1. Verify the exact URL and route
2. Test a static file (for example, /hostingstart.html)
3. Review startup and application logs
4. Inspect deployed artifacts via Kudu
5. Validate Windows vs Linux behaviour differences
6. Review networking, authentication, and health check settings

## 8. Application Gateway in Front of App Service

If you have an Application Gateway in front of the App Service, check the rewrite rules so that the request is being sent to the correct path.

## Final Thoughts

HTTP 404 errors on Azure App Service are rarely random. In most cases, they point to:

- Routing mismatches
- Startup or configuration failures
- Platform-specific behavior differences

By breaking the investigation into platform → configuration → application, you can systematically narrow down the root cause and resolve the issue.

Happy debugging 🚀
7. Missing or Corrupted Deployment Artifacts

Even when deployments report success, application files may not be where the runtime expects them.

Commonly observed with:

- Zip deployments
- WEBSITE_RUN_FROM_PACKAGE misconfigurations
- Partial or interrupted deployments

✅ Verify using Kudu: Browse /home/site/wwwroot and check that the files are present.

8. Application Gateway in Front of App Service

If you have an Application Gateway in front of your App Service, check the rewrite rules to confirm the request is being forwarded to the correct path.

Quick Troubleshooting Checklist

If your Azure App Service is returning HTTP 404:

- Verify the exact URL and route
- Test hostingstart.html or a static file (for example, /hostingstart.html)
- Review startup and application logs
- Inspect deployed artifacts via Kudu
- Validate Windows vs Linux behaviour differences
- Review networking, authentication, and health check settings

Final Thoughts

HTTP 404 errors on Azure App Service are rarely random. In most cases, they point to:

- Routing mismatches
- Startup or configuration failures
- Platform-specific behavior differences

By breaking the investigation into platform → configuration → application, you can systematically narrow down the root cause and resolve the issue. Happy debugging 🚀

App Service Easy MCP: Add AI Agent Capabilities to Your Existing Apps with Zero Code Changes

The age of AI agents is here. Tools like GitHub Copilot, Claude, and other AI assistants are no longer just answering questions — they're taking actions, calling APIs, and automating complex workflows. But how do you make your existing applications and APIs accessible to these intelligent agents?

At Microsoft Ignite, I teamed up to present session BRK116: Apps, agents, and MCP is the AI innovation recipe, where I demonstrated how you can add agentic capabilities to your existing applications with little to no code changes. Today, I'm excited to share a concrete example of that vision: Easy MCP — a way to expose any REST API to AI agents with absolutely zero code changes to your existing apps.

The Challenge: Bridging REST APIs and AI Agents

Most organizations have invested years building REST APIs that power their applications. These APIs represent critical business logic, data access patterns, and integrations. But AI agents speak a different language — they use protocols like Model Context Protocol (MCP) to discover and invoke tools.

The traditional approach would require you to:

- Learn the MCP SDK
- Write new MCP server code
- Manually map each API endpoint to an MCP tool
- Deploy and maintain additional infrastructure

What if you could skip all of that?

Introducing Easy MCP (a proof of concept not associated with the App Service platform)

Easy MCP is an OpenAPI-to-MCP translation layer that automatically generates MCP tools from your existing REST APIs. If your API has an OpenAPI (Swagger) specification — which most modern APIs do — you can make it accessible to AI agents in minutes. This means that if you have existing apps with OpenAPI specifications already running on App Service, or really any hosting platform, this tool makes enabling MCP seamless.

How It Works

1. Point the gateway at your API's base URL
2. Detect your OpenAPI specification automatically
3. Connect, and the gateway generates MCP tools for every endpoint
4. Use the MCP endpoint URL with any MCP-compatible AI client

That's it. No code changes. No SDK integration. No manual tool definitions.

See It in Action

Let's say you have a Todo API running on Azure App Service at https://my-todo-app.azurewebsites.net. In just a few clicks:

1. Open the Easy MCP web UI
2. Enter your API URL
3. Click "Detect" to find your OpenAPI spec
4. Click "Connect"

Now configure your AI client (like VS Code with GitHub Copilot) to use the gateway's MCP endpoint:

```json
{
  "servers": {
    "my-api": {
      "type": "http",
      "url": "https://my-gateway.azurewebsites.net/mcp"
    }
  }
}
```

Instantly, your AI assistant can:

- "What's on my todo list?"
- "Add 'Review PR #123' to my todos with high priority"
- "Mark all tasks as complete"

All powered by your existing REST API, with zero modifications.
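As a purely illustrative aside, the "Detect" step can only succeed if the spec is discoverable over HTTP. The probe paths below are common conventions, not Easy MCP's actual implementation; a minimal C# sketch of what discovery might look like:

```csharp
using System.Net;

// Probe a few conventional locations where ASP.NET Core and other
// frameworks commonly publish an OpenAPI (Swagger) document.
string[] candidatePaths = { "/swagger/v1/swagger.json", "/openapi.json", "/openapi/v1.json" };

using var http = new HttpClient { BaseAddress = new Uri("https://my-todo-app.azurewebsites.net") };

foreach (var path in candidatePaths)
{
    var response = await http.GetAsync(path);
    if (response.StatusCode == HttpStatusCode.OK)
    {
        Console.WriteLine($"Found OpenAPI spec at {path}");
        break;
    }
}
```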
The Bigger Picture: Modernization Without Rewrites

This approach aligns perfectly with a broader modernization strategy we're enabling on Azure App Service.

App Service Managed Instance: Move and Modernize Legacy Apps

For organizations with legacy applications — whether they're running on older Windows frameworks, custom configurations, or traditional hosting environments — Azure App Service Managed Instance provides a path to the cloud with minimal friction. You can migrate these applications to a fully managed platform without rewriting code.

Easy MCP: Add AI Capabilities Post-Migration

Once your legacy applications are running on App Service, Easy MCP becomes the next step in your modernization journey. That 10-year-old internal API? It can now be accessed by AI agents. That legacy inventory system? AI assistants can query and update it. No code changes needed.

The modernization path:

1. Migrate legacy apps to App Service with Managed Instance (no code changes)
2. Expose APIs to AI agents with Easy MCP Gateway (no code changes)
3. Empower your organization with AI-assisted workflows

Deploy It Yourself

Easy MCP is open source and ready to deploy. If you already have an existing API to use with this tool, go for it. If you need an app to test with, check out this sample. Make sure you complete the "Add OpenAPI functionality to your web app" step. You don't need to go beyond that.

GitHub Repository: seligj95/app-service-easy-mcp

Deploy to Azure in minutes with Azure Developer CLI:

```bash
azd auth login
azd init
azd up
```

Or run it locally for testing:

```bash
npm install
npm run dev
# Open http://localhost:3000
```

What's Next: Native App Service Integration

Here's where it gets really exciting. We're exploring ways to build this capability directly into the Azure App Service platform so you won't have to deploy a second app or additional resources to get it. Azure API Management recently released a feature that can expose a REST API (including an API on App Service) as an MCP server, which I highly recommend you check out if you're familiar with Azure API Management.

But in this case, imagine a future where adding AI agent capabilities to your App Service apps is as simple as flipping a switch in the Azure Portal — no gateway or API Management deployment required, no additional infrastructure or services to manage, and built-in security, monitoring, and scaling — all of the features you're already using and are familiar with on App Service.

Stay tuned for updates as we continue to make Azure App Service the best platform for AI-powered applications. And please share your feedback on Easy MCP — we want to hear how you're using it and what features you'd like to see next as we consider this feature for native integration.
Event-Driven IaC Operations with Azure SRE Agent: Terraform Drift Detection via HTTP Triggers

What Happens After terraform plan Finds Drift?

If your team is like most, the answer looks something like this:

1. A nightly terraform plan runs and finds 3 drifted resources
2. A notification lands in Slack or Teams
3. Someone files a ticket
4. During the next sprint, an engineer opens 4 browser tabs — Terraform state, Azure Portal, Activity Log, Application Insights — and spends 30 minutes piecing together what happened
5. They discover the drift was caused by an on-call engineer who scaled up the App Service during a latency incident at 2 AM
6. They revert the drift with terraform apply
7. The app goes down because they just scaled it back down while the bug that caused the incident is still deployed

Step 7 is the one nobody talks about. Drift detection tooling has gotten remarkably good — scheduled plans, speculative runs, drift alerts — but the output is always the same: a list of differences. What changed. Not why. Not whether it's safe to fix. The gap isn't detection. It's everything that happens after detection.

HTTP Triggers in Azure SRE Agent close that gap. They turn the structured output that drift detection already produces — webhook payloads, plan summaries, run notifications — into the starting point of an autonomous investigation. Detection feeds the agent. The agent does the rest: correlates with incidents, reads source code, classifies severity, recommends context-aware remediation, notifies the team, and even ships a fix.

Here's what that looks like end to end. What you'll see in this blog:

- An agent that classifies drift as Benign, Risky, or Critical — not just "changed"
- Incident correlation that links a SKU change to a latency spike in Application Insights
- A remediation recommendation that says "Do NOT revert" — and why reverting would cause an outage
- A Teams notification with the full investigation summary
- An agent that reviews its own performance, finds gaps, and improves its own skill file
- A pull request the agent created on its own to fix the root cause

The Pipeline: Detection to Resolution in One Webhook

The architecture is straightforward. Terraform Cloud (or any drift detection tool) sends a webhook when it finds drift. An Azure Logic App adds authentication. The SRE Agent's HTTP Trigger receives it and starts an autonomous investigation.

The end-to-end pipeline: Terraform Cloud detects drift and sends a webhook. The Logic App adds Azure AD authentication via Managed Identity. The SRE Agent's HTTP Trigger fires, and the agent autonomously investigates across 7 dimensions.

Setting Up the Pipeline

Step 1: Deploy the Infrastructure with Terraform

We start with a simple Azure App Service running a Node.js application, deployed via Terraform. The Terraform configuration defines the desired state:

- App Service Plan: B1 (Basic) — single vCPU, ~$13/mo
- App Service: Node 20-lts with TLS 1.2
- Tags: environment: demo, managed_by: terraform, project: sre-agent-iac-blog

```hcl
resource "azurerm_service_plan" "demo" {
  name                = "iacdemo-plan"
  resource_group_name = azurerm_resource_group.demo.name
  location            = azurerm_resource_group.demo.location
  os_type             = "Linux"
  sku_name            = "B1"
}
```

A Logic App is also deployed to act as the authentication bridge between Terraform Cloud webhooks and the SRE Agent's HTTP Trigger endpoint, using Managed Identity to acquire Azure AD tokens. Learn more about HTTP Triggers here.
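Conceptually, the bridge does two things: acquire an Azure AD token under its Managed Identity and forward the webhook body with that token attached. The real sample implements this as a Logic App, but here is a minimal C# sketch of the same flow; the agent endpoint URL and token scope are hypothetical placeholders:

```csharp
using Azure.Core;
using Azure.Identity;

// Acquire an Azure AD token using the Managed Identity this code runs under.
// The scope below is a placeholder; use the audience your SRE Agent endpoint expects.
var credential = new DefaultAzureCredential();
var token = await credential.GetTokenAsync(
    new TokenRequestContext(new[] { "https://example-sre-agent-audience/.default" }));

// Forward the original Terraform Cloud webhook body, now authenticated.
using var http = new HttpClient();
http.DefaultRequestHeaders.Authorization =
    new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", token.Token);

string webhookBody = /* raw JSON received from Terraform Cloud */ "{}";
var response = await http.PostAsync(
    "https://example-sre-agent.azurewebsites.net/triggers/tfc-drift-handler", // hypothetical URL
    new StringContent(webhookBody, System.Text.Encoding.UTF8, "application/json"));

Console.WriteLine($"Forwarded webhook, agent responded {(int)response.StatusCode}");
```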
Step 2: Create the Drift Analysis Skill

Skills are domain knowledge files that teach the agent how to approach a problem. We create a terraform-drift-analysis skill with an 8-step workflow:

1. Identify Scope — Which resource group and resources to check
2. Detect Drift — Compare Terraform config against Azure reality
3. Correlate with Incidents — Check Activity Log and App Insights
4. Classify Severity — Benign, Risky, or Critical
5. Investigate Root Cause — Read source code from the connected repository
6. Generate Drift Report — Structured summary with severity-coded table
7. Recommend Smart Remediation — Context-aware: don't blindly revert
8. Notify Team — Post findings to Microsoft Teams

The key insight in the skill: "NEVER revert critical drift that is actively mitigating an incident." This teaches the agent to think like an experienced SRE, not just a diff tool.

Step 3: Create the HTTP Trigger

In the SRE Agent UI, we create an HTTP Trigger named tfc-drift-handler with a 7-step agent prompt:

```
A Terraform Cloud run has completed and detected infrastructure drift.

Workspace: {payload.workspace_name}
Organization: {payload.organization_name}
Run ID: {payload.run_id}
Run Message: {payload.run_message}

STEP 1 — DETECT DRIFT: Compare Terraform configuration against actual Azure state...
STEP 2 — CORRELATE WITH INCIDENTS: Check Azure Activity Log and App Insights...
STEP 3 — CLASSIFY SEVERITY: Rate each drift item as Benign, Risky, or Critical...
STEP 4 — INVESTIGATE ROOT CAUSE: Read the application source code...
STEP 5 — GENERATE DRIFT REPORT: Produce a structured summary...
STEP 6 — RECOMMEND SMART REMEDIATION: Context-aware recommendations...
STEP 7 — NOTIFY TEAM: Post a summary to Microsoft Teams...
```
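The {payload.X} placeholders are bound from the incoming webhook body. As an illustration of what that binding implies, here is a small C# model covering just the four fields the prompt references; the field names come from the prompt above, while a real Terraform Cloud payload carries more than this sketch shows:

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// The four payload fields the trigger prompt references.
public sealed record DriftWebhookPayload(
    [property: JsonPropertyName("workspace_name")] string WorkspaceName,
    [property: JsonPropertyName("organization_name")] string OrganizationName,
    [property: JsonPropertyName("run_id")] string RunId,
    [property: JsonPropertyName("run_message")] string RunMessage);

var json = """
{
  "workspace_name": "iac-demo",
  "organization_name": "contoso",
  "run_id": "run-abc123",
  "run_message": "Nightly drift check"
}
""";

var payload = JsonSerializer.Deserialize<DriftWebhookPayload>(json)!;
Console.WriteLine($"Drift detected in {payload.OrganizationName}/{payload.WorkspaceName} ({payload.RunId})");
```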
Step 4: Connect GitHub and Teams

We connect two integrations in the SRE Agent Connectors settings:

- Code Repository: GitHub — so the agent can read application source code during investigations
- Notification: Microsoft Teams — so the agent can post drift reports to the team channel

The Incident Story

Act 1: The Latency Bug

Our demo app has a subtle but devastating bug. The /api/data endpoint calls processLargeDatasetSync() — a function that sorts an array on every iteration, creating an O(n² log n) blocking operation. On a B1 App Service Plan (single vCPU), this blocks the Node.js event loop entirely. Under load, response times spike from milliseconds to 25–58 seconds, with 502 Bad Gateway errors from the Azure load balancer.

Act 2: The On-Call Response

An on-call engineer sees the latency alerts and responds — not through Terraform, but directly through the Azure Portal and CLI. They:

1. Add diagnostic tags — manual_update=True, changed_by=portal_user (benign)
2. Downgrade TLS from 1.2 to 1.0 while troubleshooting (risky — security regression)
3. Scale the App Service Plan from B1 to S1 to throw more compute at the problem (critical — cost increase from ~$13/mo to ~$73/mo)

The incident is partially mitigated — S1 has more compute, so latency drops from catastrophic to merely bad. Everyone goes back to sleep. Nobody updates Terraform.

Act 3: The Drift Check Fires

The next morning, a nightly speculative Terraform plan runs and detects 3 drifted attributes. The notification webhook fires, flowing through the Logic App auth bridge to the SRE Agent HTTP Trigger. The agent wakes up and begins its investigation.

What the Agent Found

Layer 1: Drift Detection

The agent compares Terraform configuration against Azure reality and produces a severity-classified drift report. Three drift items detected:

- Critical: App Service Plan SKU changed from B1 (~$13/mo) to S1 (~$73/mo) — a +462% cost increase
- Risky: Minimum TLS version downgraded from 1.2 to 1.0 — a security regression vulnerable to BEAST and POODLE attacks
- Benign: Additional tags (changed_by: portal_user, manual_update: True) — cosmetic, no functional impact

Layer 2: Incident Correlation

Here's where the agent goes beyond simple drift detection. It queries Application Insights and discovers a performance incident correlated with the SKU change. Key findings from the incident correlation:

- 97.6% of requests (40 of 41) were impacted by high latency
- The /api/data endpoint does not exist in the repository source code — the deployed application has diverged from the codebase
- The endpoint likely contains a blocking synchronous pattern — Node.js runs on a single event loop, and any synchronous blocking call would explain 26–58s response times
- The SKU scale-up from B1 to S1 was an attempt to mitigate latency by adding more compute, but scaling cannot fix application-level blocking code on a single-threaded Node.js server

Layer 3: Smart Remediation

This is the insight that separates an autonomous agent from a reporting tool. Instead of blindly recommending "revert all drift," the agent produces context-aware remediation recommendations:

- Tags (Benign) → Safe to revert anytime via terraform apply -target
- TLS 1.0 (Risky) → Revert immediately — the TLS downgrade is a security risk unrelated to the incident
- SKU S1 (Critical) → DO NOT revert until the /api/data performance root cause is fixed

This is the logic an experienced SRE would apply. Blindly running terraform apply to revert all drift would scale the app back down to B1 while the blocking code is still deployed — turning a mitigated incident into an active outage.

Layer 4: Investigation Summary

The agent produces a complete summary tying everything together. Key findings in the summary:

- Actor: surivineela@microsoft.com made all changes via the Azure Portal at ~23:19 UTC
- Performance incident: /api/data averaging 25–57s latency, affecting 97.6% of requests
- Code-infrastructure mismatch: /api/data exists in production but not in the repository source code
- Root cause: the SKU scale-up was emergency incident response, not unauthorized drift

Layer 5: Teams Notification

The agent posts a structured drift report to the team's Microsoft Teams channel. The on-call engineer opens Teams in the morning and sees everything they need: what drifted, why it drifted, and exactly what to do about it — without logging into any dashboard.

The Payoff: A Self-Improving Agent

Here's where the demo surprised us. After completing the investigation, the agent did two things we didn't explicitly ask for.
The Agent Improved Its Own Skill

The agent performed an Execution Review — analyzing what worked and what didn't during its investigation — and found 5 gaps in its own terraform-drift-analysis.md skill file.

What worked well:

- Drift detection via az CLI comparison against Terraform HCL was straightforward
- Activity Log correlation identified the actor and timing
- Application Insights telemetry revealed the performance incident driving the SKU change

Gaps it found and fixed:

- No incident correlation guidance — the skill didn't instruct checking App Insights
- No code-infrastructure mismatch detection — no guidance to verify deployed code matches the repository
- No smart remediation logic — didn't warn against reverting critical drift during active incidents
- Report template missing incident correlation column
- No Activity Log integration guidance — didn't instruct checking who made changes and when

The agent then edited its own skill file to incorporate these learnings. Next time it runs a drift analysis, it will include incident correlation, code-infra mismatch checks, and smart remediation logic by default. This is a learning loop — every investigation makes the agent better at future investigations.

The Agent Created a PR

Without being asked, the agent identified the root cause code issue and proactively created a pull request to fix it. The PR includes:

- App safety fixes: adding MAX_DELAY_MS and SERVER_TIMEOUT_MS constants to prevent unbounded latency
- Skill improvements: incorporating incident correlation, code-infra mismatch detection, and smart remediation logic

From a single webhook: drift detected → incident correlated → root cause found → team notified → skill improved → fix shipped.

Key Takeaways

- Drift detection is not enough. Knowing that B1 changed to S1 is table stakes. Knowing it changed because of a latency incident, and that reverting it would cause an outage — that's the insight that matters.
- Context-aware remediation prevents outages. Blindly running terraform apply after drift would have scaled the app back to B1 while blocking code was still deployed. The agent's "DO NOT revert SKU" recommendation is the difference between fixing drift and causing a P1.
- Skills create a learning loop. The agent's self-review and skill improvement mean every investigation makes the next one better — without human intervention.
- HTTP Triggers connect any platform. The auth bridge pattern (Logic App + Managed Identity) works for Terraform Cloud, but the same architecture applies to any webhook source: GitHub Actions, Jenkins, Datadog, PagerDuty, custom internal tools.
- The agent acts, not just reports. From a single webhook: drift detected, incident correlated, root cause identified, team notified via Teams, skill improved, and PR created. End-to-end in one autonomous session.
Getting Started

HTTP Triggers are available now in Azure SRE Agent:

1. Create a Skill — Teach the agent your operational runbook (in this case, drift analysis with severity classification and smart remediation)
2. Create an HTTP Trigger — Define your agent prompt with {payload.X} placeholders and connect it to a skill
3. Set Up an Auth Bridge — Deploy a Logic App with Managed Identity to handle Azure AD token acquisition
4. Connect Your Source — Point Terraform Cloud (or any webhook-capable platform) at the Logic App URL
5. Connect GitHub + Teams — Give the agent access to source code and team notifications

Within minutes, you'll have an autonomous pipeline that turns infrastructure drift events into fully contextualized investigations — with incident correlation, root cause analysis, and smart remediation recommendations. The full implementation guide, Terraform files, skill definitions, and demo scripts are available in this repository.
Using an AI Agent to Troubleshoot and Fix Azure Function App Issues

TOC

- Preparation
- Troubleshooting Workflow
- Conclusion

Preparation

Topic: Required tools

- AI agent: for example, Copilot CLI / OpenCode / Hermes / OpenClaw, etc. In this example, we use Copilot CLI.
- Model access: for example, Anthropic Claude Opus.
- Relevant skills: this example does not use skills, but using relevant skills can speed up troubleshooting.

Topic: Compliance with your organization

- Enterprise-level projects are sensitive, so you must confirm with the appropriate stakeholders before using these tools.
- Enterprise environments may also have strict standards for AI agent usage.

Topic: Network limitations

- If the process involves restarting the Function App container or restarting related settings, communication between the user and the agent may be interrupted, and you will need to use /resume.
- If the agent needs internet access for investigation, the app must have outbound connectivity.
- If the Kudu container cannot be used because of network issues, this type of investigation cannot be carried out.

Topic: Permission limitations

- If you are using Azure blessed images, according to the official documentation, the containers use the fixed password Docker!. However, if you are using a custom container, you will need to provide an additional login method.
- For resources the agent does not already have permission to investigate, you will need to enable the system-assigned managed identity (SAMI) and assign the appropriate RBAC roles.

Troubleshooting Workflow

Let's use a classic case where an HTTP trigger cannot be tested from the Azure Portal. As you can see, when clicking Test/Run in the Azure Portal, an error message appears. At the same time, however, the home page does not show any abnormal status.

At this point, we first obtain the Function App's SAMI and assign it the Owner role for the entire resource group. This is only for demonstration purposes. In practice, you should follow the principle of least privilege and scope permissions down to only the specific resources and operations that are actually required.

Next, go to the Kudu container, which is the always-on maintenance container dedicated to the app. Install and enable Copilot CLI. Then we can describe the problem we are encountering. After the agent processes the issue and interacts with you further, it can generate a reasonable investigation report.

In this example, it appears that the Function App's Storage Account access key had been rotated previously, but the Function App had not updated the corresponding environment variable.
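A rotated key surfaces as an invalid AzureWebJobsStorage connection string. If you want to verify that yourself before (or after) letting the agent apply a fix, a minimal C# check looks like this; the environment variable name is the standard Functions setting, but the rest is an illustrative check rather than the agent's actual procedure:

```csharp
using Azure.Storage.Blobs;

// Read the storage connection string the Functions runtime uses.
var connectionString = Environment.GetEnvironmentVariable("AzureWebJobsStorage");

try
{
    // A lightweight authenticated call: if the account key was rotated,
    // this fails with a 403 instead of succeeding.
    var client = new BlobServiceClient(connectionString);
    await client.GetPropertiesAsync();
    Console.WriteLine("Storage connection is valid.");
}
catch (Azure.RequestFailedException ex)
{
    Console.WriteLine($"Storage check failed ({ex.Status}): the key may have been rotated.");
}
```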
Once we understand the issue, we could perform the follow-up actions ourselves. However, to demonstrate the agent's capabilities, you can also allow it to fix the problem directly, provided that you have granted the corresponding permissions through SAMI. During the process, the container restart will disconnect the session, so you will need to return to the Kudu container and resume the previous session so it can continue. Finally, it will inform you that the issue has been fixed, and then you can validate the result. This is the validation result, and it looks like the repair was successful.

Conclusion

After each repair, we can even extract the experience from that case into a skill and store it in a Storage Account for future reuse. In this way, we can not only reduce the agent's initial investigation time for similar issues, but also save tokens. This makes both time and cost management more efficient.

Build Multi-Agent AI Apps on Azure App Service with Microsoft Agent Framework 1.0

Part 1 of 3 — Multi-Agent AI on Azure App Service

This is part 1 of a 3-part series on deploying and working with multi-agent AI on Azure App Service. Follow along to learn how to deploy, manage, observe, and secure your agents on Azure App Service.

A couple of months ago, we published a three-part series showing how to build multi-agent AI systems on Azure App Service using preview packages from the Microsoft Agent Framework (MAF) (formerly AutoGen / Semantic Kernel Agents). The series walked through async processing, the request-reply pattern, and client-side multi-agent orchestration — all running on App Service.

Since then, Microsoft Agent Framework has reached 1.0 GA — unifying AutoGen and Semantic Kernel into a single, production-ready agent platform. This post is a fresh start with the GA bits. We'll rebuild our travel-planner sample on the stable API surface, call out the breaking changes from preview, and get you up and running fast. All of the code is in the companion repo: seligj95/app-service-multi-agent-maf-otel.

What Changed in MAF 1.0 GA

The 1.0 release is more than a version bump. Here's what moved:

- Unified platform. AutoGen and Semantic Kernel agent capabilities have converged into Microsoft.Agents.AI. One package, one API surface.
- Stable APIs with long-term support. The 1.0 contract is now locked for servicing. No more preview churn.
- Breaking change — Instructions on options removed. In preview, you set instructions through ChatClientAgentOptions.Instructions. In GA, pass them directly to the ChatClientAgent constructor.
- Breaking change — RunAsync parameter rename. The thread parameter is now session (type AgentSession). If you were using named arguments, this is a compile error.
- Microsoft.Extensions.AI upgraded. The framework moved from the 9.x preview of Microsoft.Extensions.AI to the stable 10.4.1 release.
- OpenTelemetry integration built in. The builder pipeline now includes UseOpenTelemetry() out of the box — more on that in Blog 2.

Our project references reflect the GA stack:

```xml
<PackageReference Include="Microsoft.Agents.AI" Version="1.0.0" />
<PackageReference Include="Microsoft.Extensions.AI" Version="10.4.1" />
<PackageReference Include="Azure.AI.OpenAI" Version="2.1.0" />
```
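To make the two breaking changes concrete, here is a minimal before/after sketch. The preview shape is reconstructed from the description above, chatClient stands in for any IChatClient, and the parameter lists are trimmed to the essentials shown later in this post:

```csharp
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;

IChatClient chatClient = /* e.g., an Azure OpenAI chat client */ null!;
var messages = new List<ChatMessage> { new(ChatRole.User, "Plan a 3-day trip to Kyoto.") };

// Preview (no longer compiles in 1.0):
// var agent = new ChatClientAgent(chatClient,
//     new ChatClientAgentOptions { Instructions = "You plan trips." });
// var response = await agent.RunAsync(messages, thread: null);

// GA 1.0: instructions move to the constructor...
var agent = new ChatClientAgent(
    chatClient,
    instructions: "You plan trips.",
    name: "TravelPlanner");

// ...and the run parameter is renamed from `thread` to `session`.
var response = await agent.RunAsync(messages, session: null);
```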
Why Azure App Service for AI Agents?

If you're building with Microsoft Agent Framework, you need somewhere to run your agents. You could reach for Kubernetes, containers, or serverless — but for most agent workloads, Azure App Service is the sweet spot. Here's why:

- No infrastructure management — App Service is fully managed. No clusters to configure, no container orchestration to learn. Deploy your .NET or Python agent code and it just runs.
- Always On — Agent workflows can take minutes. App Service's Always On feature (on Premium tiers) ensures your background workers never go cold, so agents are ready to process requests instantly.
- WebJobs for background processing — Long-running agent workflows don't belong in HTTP request handlers. App Service's built-in WebJob support gives you a dedicated background worker that shares the same deployment, configuration, and managed identity — no separate compute resource needed.
- Managed Identity everywhere — Zero secrets in your code. App Service's system-assigned managed identity authenticates to Azure OpenAI, Service Bus, Cosmos DB, and Application Insights automatically. No connection strings, no API keys, no rotation headaches.
- Built-in observability — Native integration with Application Insights and OpenTelemetry means you can see exactly what your agents are doing in production (more on this in Part 2).
- Enterprise-ready — VNet integration, deployment slots for safe rollouts, custom domains, auto-scaling rules, and built-in authentication. All the things you'll need when your agent POC becomes a production service.
- Cost-effective — A single P0v4 instance (~$75/month) hosts both your API and WebJob worker. Compare that to running separate container apps or a Kubernetes cluster for the same workload.

The bottom line: App Service lets you focus on building your agents, not managing infrastructure. And since MAF supports both .NET and Python — both first-class citizens on App Service — you're covered regardless of your language preference.

Architecture Overview

The sample is a travel planner that coordinates six specialized agents to build a personalized trip itinerary. Users fill out a form (destination, dates, budget, interests), and the system returns a comprehensive travel plan complete with weather forecasts, currency advice, a day-by-day itinerary, and a budget breakdown.

The Six Agents

- Currency Converter — calls the Frankfurter API for real-time exchange rates
- Weather Advisor — calls the National Weather Service API for forecasts and packing tips
- Local Knowledge Expert — cultural insights, customs, and hidden gems
- Itinerary Planner — day-by-day scheduling with timing and costs
- Budget Optimizer — allocates spend across categories and suggests savings
- Coordinator — assembles everything into a polished final plan

Four-Phase Workflow

| Phase | Agents | Execution |
|-------|--------|-----------|
| 1 — Parallel Gathering | Currency, Weather, Local Knowledge | Task.WhenAll |
| 2 — Itinerary | Itinerary Planner | Sequential (uses Phase 1 context) |
| 3 — Budget | Budget Optimizer | Sequential (uses Phase 2 output) |
| 4 — Assembly | Coordinator | Final synthesis |

Infrastructure

- Azure App Service (P0v4) — hosts the API and a continuous WebJob for background processing
- Azure Service Bus — decouples the API from heavy AI work (async request-reply)
- Azure Cosmos DB — stores task state, results, and per-agent chat histories (24-hour TTL)
- Azure OpenAI (GPT-4o) — powers all agent LLM calls
- Application Insights + Log Analytics — monitoring and diagnostics

ChatClientAgent Deep Dive

At the core of every agent is ChatClientAgent from Microsoft.Agents.AI. It wraps an IChatClient (from Microsoft.Extensions.AI) with instructions, a name, a description, and optionally a set of tools. This is client-side orchestration — you control the chat history, lifecycle, and execution order. No server-side Foundry agent resources are created.

Here's the BaseAgent pattern used by all six agents in the sample:

```csharp
// BaseAgent.cs — constructor for agents with tools
Agent = new ChatClientAgent(
        chatClient,
        instructions: Instructions,
        name: AgentName,
        description: Description,
        tools: chatOptions.Tools?.ToList())
    .AsBuilder()
    .UseOpenTelemetry(sourceName: AgentName)
    .Build();
```

Notice the builder pipeline: .AsBuilder().UseOpenTelemetry(...).Build(). This opts every agent into the framework's built-in OpenTelemetry instrumentation with a single line. We'll explore what that telemetry looks like in Blog 2.
Invoking an agent is equally straightforward:

```csharp
// BaseAgent.cs — InvokeAsync
public async Task<ChatMessage> InvokeAsync(
    IList<ChatMessage> chatHistory,
    CancellationToken cancellationToken = default)
{
    var response = await Agent.RunAsync(
        chatHistory, session: null, options: null, cancellationToken);

    return response.Messages.LastOrDefault()
        ?? new ChatMessage(ChatRole.Assistant, "No response generated.");
}
```

Key things to note:

- session: null — this is the renamed parameter (was thread in preview). We pass null because we manage chat history ourselves.
- The agent receives the full chatHistory list, so context accumulates across turns.
- Simple agents (Local Knowledge, Itinerary Planner, Budget Optimizer, Coordinator) use the tool-less constructor; agents that call external APIs (Currency, Weather) use the constructor that accepts ChatOptions with tools.

Tool Integration

Two of our agents — Weather Advisor and Currency Converter — call real external APIs through the MAF tool-calling pipeline. Tools are registered using AIFunctionFactory.Create() from Microsoft.Extensions.AI. Here's how the WeatherAdvisorAgent wires up its tool:

```csharp
// WeatherAdvisorAgent.cs
private static ChatOptions CreateChatOptions(
    IWeatherService weatherService, ILogger logger)
{
    var chatOptions = new ChatOptions
    {
        Tools = new List<AITool>
        {
            AIFunctionFactory.Create(
                GetWeatherForecastFunction(weatherService, logger))
        }
    };

    return chatOptions;
}
```

GetWeatherForecastFunction returns a Func<double, double, int, Task<string>> that the model can call with latitude, longitude, and number of days. Under the hood, it hits the National Weather Service API and returns a formatted forecast string. The Currency Converter follows the same pattern with the Frankfurter API.

This is one of the nicest parts of the GA API: you write a plain C# method, wrap it with AIFunctionFactory.Create(), and the framework handles the JSON schema generation, function-call parsing, and response routing automatically.

Multi-Phase Workflow Orchestration

The TravelPlanningWorkflow class coordinates all six agents. The key insight is that the orchestration is just C# code — no YAML, no graph DSL, no special runtime. You decide when agents run, what context they receive, and how results flow between phases.

```csharp
// Phase 1: Parallel Information Gathering
var gatheringTasks = new[]
{
    GatherCurrencyInfoAsync(request, state, progress, cancellationToken),
    GatherWeatherInfoAsync(request, state, progress, cancellationToken),
    GatherLocalKnowledgeAsync(request, state, progress, cancellationToken)
};

await Task.WhenAll(gatheringTasks);
```

After Phase 1 completes, results are stored in a WorkflowState object — a simple dictionary-backed container that holds per-agent chat histories and contextual data:

```csharp
// WorkflowState.cs
public Dictionary<string, object> Context { get; set; } = new();
public Dictionary<string, List<ChatMessage>> AgentChatHistories { get; set; } = new();
```

Phases 2–4 run sequentially, each pulling context from the previous phase. For example, the Itinerary Planner receives weather and local knowledge gathered in Phase 1:

```csharp
var localKnowledge = state.GetFromContext<string>("LocalKnowledge") ?? "";
var weatherAdvice = state.GetFromContext<string>("WeatherAdvice") ?? "";

var itineraryChatHistory = state.GetChatHistory("ItineraryPlanner");
itineraryChatHistory.Add(new ChatMessage(ChatRole.User,
    $"Create a detailed {days}-day itinerary for {request.Destination}..."
    + $"\n\nWEATHER INFORMATION:\n{weatherAdvice}"
    + $"\n\nLOCAL KNOWLEDGE & TIPS:\n{localKnowledge}"));

var itineraryResponse = await _itineraryAgent.InvokeAsync(
    itineraryChatHistory, cancellationToken);
```

This pattern — parallel fan-out followed by sequential context enrichment — is simple, testable, and easy to extend. Need a seventh agent? Add it to the appropriate phase and wire it into WorkflowState.

Async Request-Reply Pattern

A multi-agent workflow with six LLM calls (some with tool invocations) can easily run 30–60 seconds. That's well beyond typical HTTP timeout expectations and not a great user experience for a synchronous request. We use the Async Request-Reply pattern to handle this:

1. The API receives the travel plan request and immediately queues a message to Service Bus. It stores an initial task record in Cosmos DB with status queued and returns a taskId to the client.
2. A continuous WebJob (running as a separate process on the same App Service plan) picks up the message, executes the full multi-agent workflow, and writes the result back to Cosmos DB.
3. The client polls the API for status updates until the task reaches completed.

This pattern keeps the API responsive, makes the heavy work retriable (Service Bus handles retries and dead-lettering), and lets the WebJob run independently — you can restart it without affecting the API. We covered this pattern in detail in the previous series, so we won't repeat the plumbing here.
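From the client's perspective, the pattern is a submit-then-poll loop. Here is a minimal sketch; the /api/travel-plans routes, the TaskResponse shape, and the status strings are assumptions for illustration, not the sample's exact contract:

```csharp
using System.Net.Http.Json;

using var http = new HttpClient { BaseAddress = new Uri("https://<your-app>.azurewebsites.net") };

// 1. Submit the request; the API queues the work and returns a task id immediately.
var submit = await http.PostAsJsonAsync("/api/travel-plans", new { Destination = "Kyoto", Days = 3 });
var task = await submit.Content.ReadFromJsonAsync<TaskResponse>();

// 2. Poll until the WebJob has written a terminal status back to Cosmos DB.
while (true)
{
    await Task.Delay(TimeSpan.FromSeconds(2));
    var status = await http.GetFromJsonAsync<TaskResponse>($"/api/travel-plans/{task!.TaskId}");
    if (status!.Status is "completed" or "failed")
    {
        Console.WriteLine($"Task {status.TaskId} finished: {status.Status}");
        break;
    }
}

record TaskResponse(string TaskId, string Status);
```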
+ $"\n\nWEATHER INFORMATION:\n{weatherAdvice}" + $"\n\nLOCAL KNOWLEDGE & TIPS:\n{localKnowledge}")); var itineraryResponse = await _itineraryAgent.InvokeAsync( itineraryChatHistory, cancellationToken); This pattern — parallel fan-out followed by sequential context enrichment — is simple, testable, and easy to extend. Need a seventh agent? Add it to the appropriate phase and wire it into WorkflowState . Async Request-Reply Pattern A multi-agent workflow with six LLM calls (some with tool invocations) can easily run 30–60 seconds. That's well beyond typical HTTP timeout expectations and not a great user experience for a synchronous request. We use the Async Request-Reply pattern to handle this: The API receives the travel plan request and immediately queues a message to Service Bus. It stores an initial task record in Cosmos DB with status queued and returns a taskId to the client. A continuous WebJob (running as a separate process on the same App Service plan) picks up the message, executes the full multi-agent workflow, and writes the result back to Cosmos DB. The client polls the API for status updates until the task reaches completed . This pattern keeps the API responsive, makes the heavy work retriable (Service Bus handles retries and dead-lettering), and lets the WebJob run independently — you can restart it without affecting the API. We covered this pattern in detail in the previous series, so we won't repeat the plumbing here. Deploy with azd The repo is wired up with the Azure Developer CLI for one-command provisioning and deployment: git clone https://github.com/seligj95/app-service-multi-agent-maf-otel.git cd app-service-multi-agent-maf-otel azd auth login azd up azd up provisions the following resources via Bicep: Azure App Service (P0v4 Windows) with a continuous WebJob Azure Service Bus namespace and queue Azure Cosmos DB account, database, and containers Azure AI Services (Azure OpenAI with GPT-4o deployment) Application Insights and Log Analytics workspace Managed Identity with all necessary role assignments After deployment completes, azd outputs the App Service URL. Open it in your browser, fill in the travel form, and watch six agents collaborate on your trip plan in real time. What's Next We now have a production-ready multi-agent app running on App Service with the GA Microsoft Agent Framework. But how do you actually observe what these agents are doing? When six agents are making LLM calls, invoking tools, and passing context between phases — you need visibility into every step. In the next post, we'll dive deep into how we instrumented these agents with OpenTelemetry and the new Agents (Preview) view in Application Insights — giving you full visibility into agent runs, token usage, tool calls, and model performance. You already saw the .UseOpenTelemetry() call in the builder pipeline; Blog 2 shows what that telemetry looks like end to end and how to light up the new Agents experience in the Azure portal. Stay tuned! Resources Sample repo — app-service-multi-agent-maf-otel Microsoft Agent Framework 1.0 GA Announcement Microsoft Agent Framework Documentation Previous Series — Part 3: Client-Side Multi-Agent Orchestration on App Service Microsoft.Extensions.AI Documentation Azure App Service Documentation Blog 2: Monitor AI Agents on App Service with OpenTelemetry and the New Application Insights Agents View Blog 3: Govern AI Agents on App Service with the Microsoft Agent Governance Toolkit1.2KViews0likes0Comments