serverless
How to build long-running MCP tools on Azure Functions
Recently, a customer building servers with the Azure Functions MCP extension reached out and asked: How do I handle tools that take longer than the client is willing to wait? This becomes especially relevant when tool calls move beyond simple request/response into multi-step workflows and long-running operations. At the same time, MCP is evolving to address exactly this. The Tasks extension is introduced in the 2026-07-28 release candidate, defining a standard way to model long-running work. In this post, we'll walk through how to build long-running MCP tools on Azure Functions using Durable Functions, a framework for authoring stateful, long-running workflows as ordinary code, with checkpointing, scaling, and recovery handled automatically.

MCP tools today
Today, MCP tools are fundamentally request/response:
- the client issues a tools/call
- the server returns a result
This works well for fast operations, but breaks down when:
- workflows take minutes
- execution depends on multiple steps
- latency is unpredictable
In practice, clients enforce their own tool-call timeouts. These aren't standardized by the MCP spec and vary per client, but they're often in the ~30-60 second range. If a tool exceeds that window:
- the client times out
- the agent observes a failed call
- the underlying work may still be running
So the core issue is that synchronous tool calls don't naturally model long-running work.

The MCP Tasks extension
The Tasks extension is designed to address this.
With the extension, a server can respond to a tools/call with an asynchronous task handle instead of a final result, and the client drives the lifecycle from there:
- tasks/get: poll the task's status
- tasks/update: submit input back to the server if the task reaches input_required
- tasks/cancel: cancel an in-flight task
A task carries a status ("working", "input_required", "completed", "failed", or "cancelled") and, on completion, the final result. Task creation is server-directed: the client advertises support by including the extension in its per-request capabilities, and the server decides per request whether to return a task. A server won't return a task to a client that hasn't advertised support.
It's important to note that Tasks rely on ecosystem support. Clients must advertise the extension, and MCP SDKs must implement the task lifecycle, before servers can use it. So while Tasks is now a defined extension, broad client and SDK support is still in progress.

Implement long-running tasks with Durable Functions today
Until the Tasks extension is broadly supported across clients, we need a pattern that works with existing request/response clients and supports long-running execution. The following samples show how, using Durable Functions:
- Python
- .NET
The long-running work in this sample mines a short chain of blocks. Each block requires solving a computational puzzle where the system keeps trying different inputs until it finds one that produces a result matching a specific pattern (for example, starting with a certain number of zeros). Because this involves lots of trial and error, it naturally takes time, making it a good example of a long-running workflow.
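To make that workload concrete, here is a minimal Python sketch of the puzzle, wrapped in the start-then-poll shape that the two tools described next expose. It is not the sample's code: the thread pool and the in-memory `_runs` dictionary are hypothetical stand-ins for a Durable Functions orchestration and its instance store, and the function names, default values, and the SHA-256 "leading zeros" rule are assumptions chosen for illustration.

```python
import hashlib
import time
import uuid
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Assumption: an in-memory stand-in for Durable Functions' durable instance store.
_executor = ThreadPoolExecutor(max_workers=4)
_runs = {}


def mine_block(data: str, difficulty: int) -> dict:
    """Try nonces until the SHA-256 hex digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{data}:{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return {"nonce": nonce, "hash": digest}
        nonce += 1


def mine_chain(blocks: int, difficulty: int) -> list:
    """Mine a short chain; each block's input includes the previous hash."""
    chain, previous = [], "genesis"
    for index in range(blocks):
        block = mine_block(f"{previous}:{index}", difficulty)
        chain.append(block)
        previous = block["hash"]
    return chain


def _running(workflow_id: str) -> dict:
    # The hint fields nudge the agent to poll again instead of giving up.
    return {"status": "running", "workflow_id": workflow_id,
            "poll_after_seconds": 5,
            "next": "call get_mining_result with this workflow_id"}


def start_mining(blocks: int = 3, difficulty: int = 3,
                 budget_seconds: float = 1.0) -> dict:
    """Start the work, wait up to the budget, then return a result or a handle."""
    workflow_id = uuid.uuid4().hex
    future = _executor.submit(mine_chain, blocks, difficulty)
    _runs[workflow_id] = future
    try:
        return {"status": "completed",
                "result": future.result(timeout=budget_seconds)}
    except FutureTimeout:
        return _running(workflow_id)
    except Exception as error:  # the work itself raised
        return {"status": "failed", "error": str(error)}


def get_mining_result(workflow_id: str) -> dict:
    """Report the current state; unknown ids get 'not_found', never a guess."""
    future = _runs.get(workflow_id)
    if future is None:
        return {"status": "not_found"}
    if not future.done():
        return _running(workflow_id)
    if future.exception() is not None:
        return {"status": "failed", "error": str(future.exception())}
    return {"status": "completed", "result": future.result()}


def poll_until_done(workflow_id: str, interval: float = 0.05,
                    max_polls: int = 600) -> dict:
    """Simulate an agent that keeps polling while the status is 'running'."""
    state = get_mining_result(workflow_id)
    for _ in range(max_polls):
        if state["status"] != "running":
            break
        time.sleep(interval)
        state = get_mining_result(workflow_id)
    return state
```

Calling `start_mining(budget_seconds=0)` returns a "running" response with a `workflow_id` almost immediately, and `poll_until_done` then plays the agent's role. Unlike this sketch, a real orchestration survives process restarts, which is the reason to reach for Durable Functions in the first place.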
The server in the sample exposes two tools:
start_mining
- Starts a Durable Functions orchestration to mine the blocks
- Waits briefly (within a configurable budget)
- Returns the result inline if completed within budget, or returns workflow_id if still running
get_mining_result
- Takes the workflow_id
- Returns the current state, e.g. "completed", "running", "failed", or "not_found"
To ensure that the agent calls the tools in the right order, workflow_id is a required parameter of get_mining_result, so the agent can't poll without starting a mining run first. Also, the "running" response carries a poll_after_seconds and a next instruction, prompting the agent to poll again if work is not done rather than give up or assume completion. Even so, the poll path still relies on the agent correctly remembering, and not hallucinating, the workflow_id it was handed. If it garbles or invents an id, the poll lands on the wrong instance or none at all (which is why get_mining_result returns "not_found" rather than guessing).

What changes with the Tasks extension
Once the Tasks extension is fully implemented across clients and SDKs, the model becomes simpler and more reliable: the server returns a Task handle, the client manages the polling and lifecycle calls, and the SDK tracks execution state. This removes a key limitation of today's solution, which requires the agent to remember and correctly pass identifiers like workflow_id.

Call to action
Try out the sample and let us know whether it addresses your MCP needs around long-running or workflow-type tools!

Azure Container Apps Express for Shipping Container Apps Fast
ACA Express Apps are a strong fit for teams that need to ship quickly and can't afford long platform setup cycles. This includes startups, internal platform teams, and product groups deploying APIs, web apps, or agent endpoints that scale with uneven demand. If the priority is fast path-to-production, predictable wake-up behavior, and minimal infrastructure overhead, this model is likely the right choice. To put real numbers behind that, I built a live demo that races Express against a Consumption environment on the same app. The measurements below come from that demo, not from a spec sheet. MicroVMs make cold starts practical Cold start delays usually come from rebuilding runtime state whenever an app wakes up. ACA Express Apps reduce that overhead with MicroVM-based startup paths built for fast boot and isolation. The result is faster instance readiness without trading off security. The gap shows up clearly when both apps have scaled all the way to zero. Waking from a genuine cold start, Express comes back in about 1.5 seconds. The same app in a Consumption environment takes about 20 seconds to answer the first request. Both were measured live in the browser, from request to first response. Disk and memory state restore is the speed multiplier State restoration skips the app's internal boot sequence entirely. Instead of replaying the same initialization work on every start, ACA Express Apps can restore disk and memory state so the app starts closer to ready. That reduces time-to-first-request and smooths scale events, especially for framework-heavy workloads. It's also what lets scale-to-zero stay practical: the app costs nothing while idle, but the wake-up penalty stays in the low single-digit seconds instead of the tens of seconds you'd otherwise pay. Environmentless changes the deployment experience Skipping the environment setup completely changes the deployment workflow. 
Teams can ship the container app without first managing environment sprawl, while still getting the runtime foundations they need. For fast-moving teams, that means less setup overhead and a shorter path to production. You can see how little there is to fill in. Creating an Express app is a single short form. There is no environment to stand up first. And once it's created, the manage view gives you the live URL, status, and the basics you need to operate it.

The numbers, side by side
Everything below was measured on the same container image, in the West Central US region.

| What's measured | Express | Consumption |
| Cold start from zero (request to first response) | ~1.5 s | ~20 s |
| Environment provisioning | ~14 s | ~120 s |
| First-time deploy (environment + app, zero to live URL) | ~52 s | ~166 s |
| App deploy only (environment already exists) | ~30 s | ~30 s |

Express is much faster on the two steps that build infrastructure from scratch: cold start and environment provisioning. Once an environment already exists, the two are about the same. Express isn't a different app runtime, it's the same platform with the first-time setup cost stripped down.

Get started
Express is in public preview. You can have a container on a live URL in the time it takes to read this post.
- Azure Container Apps Express overview - concepts, capabilities, and the current feature support matrix.
- Create your first Express app - the CLI commands and portal steps to get an app running.
- New Container Apps portal - create and manage Express apps in the streamlined UI.
- Test Express apps locally - validate your container before you deploy.
- Express FAQ - preview status, limits, regions, and how Express relates to standard Container Apps.
Deploy an Express app · Read the docs · Browse the FAQ

When speed matters, ACA Express is the best tool for deploying containers.
It skips the platform setup delays without sacrificing reliability under load.

Write Logic Apps in C#: introducing the Logic Apps Standard SDK
The workflow you always wished you could write in code If you build on Logic Apps Standard, you already know the deal: the runtime is excellent at the unglamorous parts of integration - connecting to systems, retrying, scaling, keeping run history you can actually debug. What you sometimes wanted was a different front door. You're a .NET developer. You live in C#, source control, and pull requests. And for a long time, authoring a workflow meant leaving all of that behind for a visual designer and a JSON file. That's the gap the new Logic Apps Standard SDK closes. It lets you define Logic Apps Standard workflows in code - strongly typed, IntelliSense-guided C# - without giving up a single thing the runtime already does for you. What is the Logic Apps Standard SDK? The Logic Apps Standard SDK (Microsoft.Azure.Workflows.Sdk) is a NuGet package that gives you a fluent, code-first way to build workflow definitions in C#. Instead of dragging actions onto a canvas, you compose a workflow with method chaining: a trigger, then the actions that follow it, all the way to a response. Worth saying clearly, because people ask: this is a new way to define workflows - not a new runtime. The workflows you write with the SDK compile down to the same definitions and run on the same Logic Apps Standard runtime you use today. Same connectors. Same hosting. Same rich run history and monitoring. You're changing the authoring experience, not the engine underneath it. Why this matters for developers When your workflow lives in C#, it behaves like the rest of your code. A few things fall out of that almost for free: Type safety and IntelliSense - connector operations, triggers, and outputs are discoverable as you type, and the compiler catches mistakes before you run anything. Real source control and reviews - workflows diff like code, get reviewed in pull requests, and version alongside the services they orchestrate. 
Familiar tooling - refactor, debug with F5, and lean on the .NET ecosystem you already know. Extensibility on your terms - compose your workflow declaratively with the fluent builder, then drop into plain imperative C# wherever a step needs logic that might be too complex to implement declaratively - loops, branching, a call into your own library, all encapsulated in a step of your workflow - without leaving the file or the language.
And it isn't limited to one style of work. The SDK covers both enterprise integration workflows - the connect-systems-and-move-data scenarios Logic Apps is known for - and agentic workflows, where a conversational or autonomous AI agent drives the steps. Both are first-class in the same SDK, built from the same building blocks.
There's one more angle worth calling out, because it's becoming hard to ignore: coding agents are simply better at writing imperative code than declarative JSON. And the reason is the same set of guardrails that helps you. Strong typing and a compilation step mean the code an agent produces is syntactically correct out of the gate - the type system and the compiler do the checking, so you don't have to. Layer unit tests on top and you've covered north of 90% of what matters; what's left is integration testing. Getting an LLM to the same level of accuracy against declarative JSON means building dedicated tooling to stand in for everything the compiler gives you for free. With code-first workflows, those guardrails are just there - which makes this a natural fit for an agent-assisted way of building.

Getting started
Everything here lives in the Logic Apps extension for VS Code. You'll want the Logic Apps Standard VS Code extension version 5.961.10 or later, which includes all the components you need to create code-first workflows.
Beyond that, the prerequisites are the ones you'd expect - VS Code with the Logic Apps extension, an Azure subscription you can create resources in, and a working comfort with C# and .NET. From a clean start, you're a handful of steps from a running workflow:
- Create the workspace - launch the Logic Apps extension and choose Create new Logic Apps workspace. Pick a folder, name the workspace and project, and when prompted for the workflow type, choose Logic Apps codeful - that's the code-first option that uses the SDK.
- Pick a workflow kind - name your first workflow and choose how it runs: Stateful, Autonomous agents (Preview), or Conversational agents (Preview). The agent options are where the agentic scenarios live.
- Enable connectors - when prompted, select Use connectors from Azure, choose your subscription and resource group, and pick Connection Keys for authentication. Managed identity is still in development, so connection keys are the way in for now.
- Find your way around - the project opens with Program.cs, which builds and starts the host, plus a workflow file (like workflow1.cs) where your trigger and actions are defined. The SDK compiles those definitions and runs them on the Logic Apps runtime.
- Run it - press F5 (or right-click Program.cs and pick Overview). The runtime starts locally and an overview page opens where you can fire triggers, watch run history, and inspect inputs and outputs.
That last part is worth dwelling on: run history for SDK workflows uses the same rich visual view as designer-built ones. You author in code, but you monitor and troubleshoot exactly as you always have.

A look at the capabilities

Connectors and triggers
Every workflow starts with a trigger and runs a series of actions. The SDK exposes both through two entry points - WorkflowTriggers and WorkflowActions - each split into BuiltIn and Managed.
Built-in triggers and actions run directly in the runtime: HTTP request, recurrence, and the conversational agent trigger; actions like Compose, HTTP, Response, and custom code. Managed connectors give you the full Logic Apps connector catalog - Service Bus, SharePoint, SQL, and hundreds more - typed and ready to call. The managed surface is generated from the same connector definitions the designer uses, so the operations you know are right there:

    // Built-in trigger
    var trigger = WorkflowTriggers.BuiltIn.CreateHttpTrigger();

    // Managed connector action - full catalog, strongly typed
    var getItems = WorkflowActions.Managed
        .Sharepointonline("sharepoint")
        .GetItems(
            dataset: () => "https://contoso.sharepoint.com",
            table: () => "orders-list-id")
        .WithName("GetOrders");

The fluent API streamlines the definition
This is where it comes together. You compose a workflow by chaining operations with .Then(...). The shape of your code mirrors the shape of your workflow - read it top to bottom and you read the execution path.

    trigger
        .Then(validateOrder)
        .Then(getOrders)
        .Then(sendResponse);

Control flow is part of the same fluent model. Built-in structures like Condition (if/else) and ForEach - along with Switch, Until, Scope, and Terminate - are just actions you chain in, each taking a small factory for the branch or loop body:

    var checkTotal = WorkflowActions.BuiltIn.Control.Condition(
        expression: () => order.Total > 1000,
        trueBranch: () => requireApproval,
        falseBranch: () => autoApprove
    ).WithName("CheckOrderValue");

And ForEach takes the collection to iterate and a factory that builds the body for each item:

    var processLines = WorkflowActions.BuiltIn.Control.ForEach(
        items: () => order.LineItems,
        actions: (item) => new WorkflowBuiltInActions()
            .Compose(inputs: () => $"Line: {item}").WithName("HandleLine")
    ).WithName("ProcessLineItems");

Need parallel branches that fan back in?
The same Then pattern handles branching and join - no JSON wiring, no run-after blocks to hand-edit.

Extending workflows with custom code
Some logic doesn't belong in a connector or an expression - it's just code. The CustomCode action lets you drop a real C# method into the middle of a workflow. It receives a WorkflowContext, so you can read the trigger payload or any earlier action's results and return a strongly typed value the next step can use:

    var enrich = WorkflowActions.BuiltIn.CustomCode<string>(async (context) =>
    {
        var trigger = await context.GetTriggerResults();
        var order = await context.GetActionResults("GetOrders");
        // your logic, your libraries, your types
        return "enriched";
    }).WithName("EnrichOrder");

That's the escape hatch that keeps you in flow: when a step needs custom transformation, validation, or a call into your own libraries, you write a method instead of bending an expression to do something it was never meant to.

Handling failures: try/catch with run-after
Real workflows have to deal with things going wrong, and the SDK gives you the same try/catch shape Logic Apps has always had - expressed in code. The .Then(...) overload takes a FlowStatus[] run-after condition, so a handler runs only when the step before it ends in a status you name. Wrap the risky work in a Scope (your try), then chain a handler that runs after it Failed or TimedOut (your catch):

    var tryProcess = WorkflowActions.BuiltIn.Control.Scope(() =>
        callPaymentApi.Then(saveOrder)
    ).WithName("ProcessPayment");

    var handleFailure = WorkflowActions.BuiltIn
        .Compose(inputs: () => "Payment failed - compensating")
        .WithName("HandleFailure");

    trigger
        .Then(tryProcess)
        .Then(handleFailure, runAfter: new[] { FlowStatus.Failed, FlowStatus.TimedOut });

The status set is the whole vocabulary: Succeeded, Failed, Skipped, and TimedOut. Combine them however a step needs - a cleanup action that should run no matter what can list every status; a finally is just the union.
The same idea scales to fan-in. When several parallel branches converge, the per-predecessor RunAfter overload lets the join wait on each branch independently - so you can require some to succeed and tolerate others failing:

    leftChain
        .Join(rightChain)
        .Then(merge, runAfter: new[]
        {
            new RunAfter(leftChain, FlowStatus.Succeeded),
            new RunAfter(rightChain, FlowStatus.Succeeded),
        });

Putting it together
Here's a small but complete shape - an HTTP-triggered order workflow that validates input, branches on order value, loops over line items, runs custom code, and replies. The core steps live in a Scope so a single failure handler can catch anything that goes wrong, and a clean reply only runs when the work succeeds. Notice it's all one readable chain:

    namespace LogicApps
    {
        using Microsoft.Azure.Workflows.Sdk;
        using Microsoft.Azure.Workflows.Sdk.Connectors.Msnweather;
        using System.Net;

        public class OrderWorkflow : IWorkflowProvider
        {
            /// <summary>
            /// Gets the HTTP request/response workflow definition.
            /// </summary>
            public FlowDefinition[] GetWorkflows()
            {
                // --- Trigger ----------------------------------------------------
                var trigger = WorkflowTriggers.BuiltIn.CreateHttpTrigger();

                // --- Managed connector action (full catalog, strongly typed) ----
                // Reused verbatim from the confirmed stateful1.cs pattern.
                var getWeather = WorkflowActions.Managed.Msnweather("msnweather").CurrentWeather(
                    location: () => "98058",
                    units: () => unitsInput.Imperial).WithName("GetWeather");

                // --- Custom code: real C# in the middle of the workflow ---------
                var enrich = WorkflowActions.BuiltIn.CustomCode<string>(async (context) =>
                {
                    var triggerResults = await context.GetTriggerResults();
                    var weather = await context.GetActionResults("GetWeather");
                    // your logic, your libraries, your types
                    return "enriched";
                }).WithName("EnrichOrder");

                // --- ForEach over a collection (control flow via .Control) -------
                var processLines = WorkflowActions.BuiltIn.Control.ForEach(
                    items: () => trigger.TriggerOutput.Body["lineItems"],
                    actions: (item) => WorkflowActions.BuiltIn
                        .Compose(inputs: () => $"Line: {item}").WithName("HandleLine")
                ).WithName("ProcessLineItems");

                // --- Condition (if/else) (control flow via .Control) ------------
                var checkTotal = WorkflowActions.BuiltIn.Control.Condition(
                    expression: () => true,
                    trueBranch: () => processLines,
                    falseBranch: () => WorkflowActions.BuiltIn
                        .Compose(inputs: () => "Auto-approved").WithName("AutoApprove")
                ).WithName("CheckOrderValue");

                // --- Scope groups the core steps so one handler catches failures -
                var processOrder = WorkflowActions.BuiltIn.Control.Scope(() =>
                    checkTotal
                        .Then(getWeather)
                        .Then(enrich)
                ).WithName("ProcessOrder");

                // --- Responses --------------------------------------------------
                var ok = WorkflowActions.BuiltIn.Response(
                    responseBody: () => "Order processed").WithName("Reply");

                var failed = WorkflowActions.BuiltIn.Response(
                    statusCode: () => HttpStatusCode.InternalServerError,
                    responseBody: () => "Order failed").WithName("ReplyFailed");

                // --- Assemble ---------------------------------------------------
                // Happy path runs after the Scope Succeeded; the handler runs after
                // Failed or TimedOut.
                trigger
                    .Then(processOrder)
                    .Then(ok, runAfter: new[] { FlowStatus.Succeeded })
                    .Then(failed, runAfter: new[] { FlowStatus.Failed, FlowStatus.TimedOut });

                return new[]
                {
                    WorkflowFactory.CreateStatefulWorkflow("OrderWorkflow", trigger)
                };
            }
        }
    }

That last stretch is the best-practice shape in miniature: the happy-path Reply runs only after the Scope Succeeded, while a separate handler catches Failed or TimedOut and returns a 500 - no exception plumbing, just run-after conditions. You implement IWorkflowProvider, hand your trigger graph to WorkflowFactory as a stateful, stateless, or agent workflow, and the host registers it. Run it with F5 and the Logic Apps runtime starts locally - same as any Standard project.

Before you build: preview realities
I'd rather you go in clear-eyed. While the SDK is in public preview, keep these in mind:
- Service Provider connectors aren't supported yet - that connector type is coming in a future release.
- Dynamic schemas aren't supported - support is planned.
- Custom code supports callback methods only - inline lambdas aren't available in this version.
- Define and name actions before referencing them - name an action before using it as a dependency elsewhere.
- Managed identity authentication is in development - use connection keys for connectors in the meantime.

Try it, and tell us what you think
If you've ever wanted your workflows to live where the rest of your code lives - in C#, in source control, in your pull requests - this is for you. Install the Logic Apps extension for VS Code, create a Logic Apps codeful project, and build your first workflow in code. This is a preview, which means your feedback genuinely shapes where it goes - which capabilities come next, where the rough edges are. Bring issues, feature requests and feedback to our GitHub page. I read it. Let's make code-first workflows something you actually want to use.
Related content
- Create Standard workflow projects with the SDK
- Logic Apps Standard SDK class library

Azure Databricks at Databricks Data + AI Summit 2026: updates and new announcements
Databricks Data + AI Summit brings together the global data and AI community in San Francisco to share product news, technical breakthroughs, and customer stories. This year, as usual, we have a lot of Azure Databricks announcements, a strong presence across the event, and a continued focus on helping customers put their data to work across analytics, AI, and business productivity.

Find us at Data and AI Summit
As a Legend Sponsor and Databricks' long-standing strategic partner, Microsoft is joining Databricks Data + AI Summit during the keynote, in multiple breakout sessions, and at the Expo booth. We're also engaging with customers 1:1 to hear from you. Satya Nadella will join Ali Ghodsi, CEO of Databricks, in a pre-recorded keynote conversation on the importance of data in AI implementation and the deep integrations we co-engineer. We encourage you to visit us at the Microsoft Booth (Booth #103) on the Expo floor to chat with the Azure Databricks team, see demos, and learn more about the recent announcements.

Azure Databricks Breakout Sessions

Unlocking the Microsoft Data & AI Ecosystem with Azure Databricks: From Insight to Impact
Wednesday, June 17 | 1:50 PM - 2:30 PM PDT | Speaker: Anavi Nahar, Head of Product, Azure Data Lake Storage & Azure Databricks, Microsoft
In today's data-driven landscape, organizations need more than analytics: they need a unified platform that turns raw data into actionable intelligence across the Microsoft ecosystem. This session explores how Azure Databricks serves as the backbone of modern data architecture, integrating with core Microsoft cloud services and platforms to accelerate innovation. Learn how to use Azure Databricks for scalable data engineering, advanced analytics, and AI-driven solutions while enabling real-time collaboration and governance. Through practical examples and architectural patterns, we'll show how to eliminate data silos, optimize performance, and empower teams to deliver insights faster.
Zero-Copy Federated Energy Analytics: ADME + Databricks in Action
Wednesday, June 17 | 12:40 PM - 1:20 PM PDT | Speaker: Andy Corran, Principal Product Manager, Azure Databricks, Microsoft
Oil and gas companies have standardized on Azure Data Manager for Energy (ADME) as their subsurface system of record, but running analytics and AI on that data has meant copying massive datasets into downstream platforms, breaking governance and slowing every workflow that follows. In this jointly developed Microsoft and Databricks session, we introduce a new zero-copy, federated path that brings Databricks compute directly to data, with native governance and serverless scale. We walk through the architecture, show the solution in action against live ADME, and share how operators across the industry are accelerating subsurface analytics while keeping ADME as the single source of truth.

Unity Catalog External Locations: Extending Governance to OneLake and Beyond
Wednesday, June 17 | 5:20 PM - 5:40 PM PDT | Speaker: Ljubica Vujovic Boskovic, Senior Product Manager, Databricks
In this session, we'll show how External Locations provide a consistent, extensible pattern for connecting Databricks to any storage platform, and walk through what it takes to create an External Location for Microsoft OneLake. You'll see the architecture, the setup end-to-end, and a demo reading and writing UC-governed assets directly into OneLake storage without needing to set up any ETL pipelines.

Latest announcements
We recently announced new ways to build AI apps and agents with Azure Databricks, Copilot Studio, and GitHub Copilot, including authoring Copilot Studio agents that reason over an entire Azure Databricks workspace through one MCP connection. At Microsoft Build, PepsiCo also shared its blueprint for agentic AI, illustrating how Azure Databricks can provide the data foundation for agentic apps.
This week's announcements make it easier to use Azure Databricks with the Microsoft tools your teams rely on every day, including Microsoft Teams, M365 Copilot, Excel, SharePoint, Power BI, and OneLake:
- Genie for Microsoft Teams and M365 Copilot (Beta): You can tag Genie in a Teams thread and get a context-aware answer from your Azure Databricks lakehouse without leaving the conversation. Responses are governed by Unity Catalog, so each answer is scoped to what the user is permitted to see. It's part of the broader Genie One experience for report generation, reusable agents, low-code apps, and natural-language pipeline design. See it in action in the Databricks + Microsoft co-authored training in AI Skills Navigator.
- Genie in Copilot Cowork (Beta): Available today, Databricks Genie works seamlessly with M365 Copilot Cowork. This integration will allow teams to anchor Cowork's tasks with the Genie Ontology, bringing trusted data intelligence straight into their workflows.
- Azure Databricks Excel Add-in (Public Preview): This brings governed lakehouse data into Excel without SQL or per-user ODBC setup. Unity Catalog metric views let business logic be defined once and stay consistent across tools, and the add-in supports write-back, so permitted users can push updates from Excel into Databricks. Learn how to set it up.
- SharePoint Connector (Beta) via Lakeflow Connect: A fully managed connector for point-and-click ingestion pipelines that bring SharePoint content (structured sheets and unstructured PDFs, Word docs, and PowerPoints) into Delta tables, keeping downstream analytics, Genie spaces, and Excel workbooks supplied with current data. Read the documentation here.
- Azure Databricks OneLake Catalog Federation (Generally Available): The ability to query OneLake data directly from Azure Databricks without pipelines, duplication, or data movement is generally available.
This announcement, coupled with the Azure Databricks Mirrored Catalog item, enables bidirectional READ from Azure Databricks and OneLake. Learn more here.
- Storing Unity Catalog Managed Tables in OneLake (Beta): Customers can now use OneLake as a storage location option for Unity Catalog tables in addition to Azure Data Lake Storage (ADLS). Read more on how to do this here.

CustomerLake: a customer data platform inside the lakehouse
Introducing CustomerLake, a Customer Data Platform (CDP) built directly within the lakehouse rather than as a separate application. CustomerLake is now available in Azure Databricks. Two kinds of agents do much of the work:
- Profile Agents help assemble business-ready Customer 360 profiles from fragmented sources, reducing the manual effort of stitching customer data together.
- Campaign Agents give marketing teams a workspace to segment audiences, recommend next-best actions, activate across channels, and continuously optimize personalized experiences.
Because CustomerLake runs inside your governed storage boundary, customer data, AI models, and governance stay together, avoiding much of the data movement and duplication that come with connecting separate marketing tools. For Azure customers, that means building customer engagement on the same governed lakehouse foundation they already use for analytics and AI, rather than maintaining a parallel stack.

"What excites us most about the CustomerLake and the new CDP capability is the ability to bring customer data together in a way that is actionable, timely, and scalable. By creating a more complete view of each customer, we can better understand behaviors, preferences, and needs across channels, which will help us deliver more personalized experiences and more relevant offers.
Ultimately, we see this as a powerful step toward stronger engagement, deeper loyalty, and better outcomes for both our business and our customers."
Jay Malepati, Global Director of Data Science, Circle K

All of these announcements benefit from built-in governance with Azure Databricks Unity Catalog. By connecting governed lakehouse data to the Microsoft tools your teams already use (Teams, M365 Copilot, Excel, SharePoint, OneLake, and Power BI), these updates make it easier to put trusted AI to work on Azure. To learn more, explore the Azure Databricks documentation and try these capabilities in your own workspace.

Azure Function App - Queue-Based Architecture for Long-Running Sync Jobs
The Problem: HTTP Triggers and Long-Running Jobs Don't Mix
Here's a situation you've probably run into: you have a job that needs to loop over dozens of Azure resources, call APIs, and do real work. You wrap it in an HTTP-triggered Azure Function so it can be called on demand. It works great until, after a few minutes, the caller gets a 504 Gateway Timeout.
The 230-second limit is enforced at the platform level by the Azure Load Balancer's idle timeout. It cannot be overridden by app settings or host configuration. Any HTTP trigger that runs longer than 230 seconds (just under 4 minutes) will time out for the caller.
In our case, the job iterates over 30+ Azure subscriptions: for each one it switches context, lists resources, and triggers image imports. Total runtime: anywhere from 2 to 10 minutes depending on how many ACRs need updating. Way over the limit.

The Solution: Decouple Request from Execution via a Queue
The fix is clean once you see it: the HTTP trigger shouldn't do the work; it should just accept the work and hand it off. That's what a queue is for. The flow splits into two independent phases:
- Request phase: The HTTP trigger validates the caller (JWT + app role check), packages the job parameters into a queue message, and returns 202 Accepted. This takes under 3 seconds.
- Execution phase: A Queue Trigger picks up the message and runs the actual sync. No HTTP connection is involved, so there's no HTTP timeout. On a Dedicated (P-series) plan, the function timeout can be raised to unlimited.

| Approach | What the caller gets | Result |
| HTTP trigger → run sync inline | Waits for the full job to complete | 504 TIMEOUT after 230 seconds |
| HTTP trigger → Queue → Queue Trigger | 202 Accepted immediately | NO TIMEOUT, job runs as long as needed |

There's an added bonus - Reliability in Azure Queue Storage: Azure Storage Queues give you automatic retry out of the box.
If the job crashes halfway through, the message becomes visible again after a visibility timeout and the Queue Trigger picks it up for a retry, up to 5 attempts before the message is moved to the poison queue. No retry logic to write.

Locking Down the Endpoint

Since the HTTP trigger is the public entry point, it needs solid auth. We layer two things: use EasyAuth for the "is this a real Entra ID token?" check, and a custom App Role for the "is this person allowed to trigger syncs?" check. These are independent concerns and should stay that way.

Layer | What it does | How
EasyAuth (Entra ID) | Rejects requests without a valid Entra ID Bearer token, before your code even runs | Configured at the Function App level via the Authentication blade
App Role check | Validates that the token contains the SyncJob.Execute role; only assigned users/SPs can trigger the job | Decoded in the function code from the JWT roles claim
Managed Identity | Authenticates the Function App to Azure APIs (no credentials in code) | Connect-AzAccount -Identity; identity assigned via RBAC

One gotcha worth knowing: when using v2 tokens (which is the default with modern App Registrations), the aud claim in the token is the raw App ID GUID, not the api:// prefixed URI. You need to explicitly add both forms to your allowedAudiences in EasyAuth, otherwise valid tokens get rejected.
    APP_ID="<your-app-id>"
    TENANT_ID="<your-tenant-id>"
    FUNCTION_APP_URL="https://<your-function-app>.azurewebsites.net"

    # Interactive login (device code flow, works from any terminal)
    az login --tenant "${TENANT_ID}" \
      --scope "api://${APP_ID}/.default" \
      --use-device-code

    TOKEN=$(az account get-access-token \
      --scope "api://${APP_ID}/.default" \
      --query accessToken -o tsv)

    # Trigger the sync; returns 202 immediately
    curl -s -X POST "${FUNCTION_APP_URL}/api/SyncContainerRegistryHttpTrigger" \
      -H "Authorization: Bearer ${TOKEN}" \
      -H "Content-Type: application/json"

Passing Parameters Through the Queue

One nice property of this pattern: the queue message is just JSON, so you can pass whatever parameters the job needs. In our case, we pass a subscriptionFilter wildcard so callers can target a subset of subscriptions without touching any code.

The parameter travels the full chain: HTTP body → queue message → Queue Trigger → PowerShell script parameter. Here's how each step handles it.

Step 1 - HTTP Trigger reads the body and enqueues the message using the Push-OutputBinding cmdlet. Azure Functions wires the binding to the queue automatically, so no SDK call is needed:

    param($Request, $TriggerMetadata)

    # ... decode the JWT, check role assignment
    # ($decoded and $body come from the elided steps above)

    $queuePayload = @{
        triggeredBy        = $decoded.Payload.upn ?? $decoded.Payload.oid
        triggeredAt        = (Get-Date -Format 'o')
        subscriptionFilter = if ($body.subscriptionFilter) { $body.subscriptionFilter } else { "*" }
    } | ConvertTo-Json -Compress

    # Enqueue the job, then answer the caller right away
    Push-OutputBinding -Name QueueMessage -Value $queuePayload
    Push-OutputBinding -Name Response -Value ([HttpResponseContext]@{
        StatusCode = [System.Net.HttpStatusCode]::Accepted
        Body       = @{ message = "Sync job queued. Check Azure Monitor logs for execution status." }
    })

Push-OutputBinding is how Azure Functions PowerShell workers write to output bindings (queues, blobs, HTTP responses, ...).
The binding name QueueMessage maps to the queue defined in function.json; the runtime handles serialisation and delivery.

Step 2 - Queue Trigger hands the filter to the script through the $SubscriptionFilter variable:

    param($QueueItem, $TriggerMetadata)

    Write-Host "Triggered SyncContainerRegistry via Storage Queue. Payload: $QueueItem"

    # Fall back to "*" (all subscriptions) when the message carries no filter.
    # PowerShell variable names are case-insensitive, so one assignment is enough.
    $SubscriptionFilter = if ($QueueItem.subscriptionFilter) { $QueueItem.subscriptionFilter } else { "*" }

    # Dot-sourcing runs the script in this scope, so it can read $SubscriptionFilter
    . "$PSScriptRoot/../SyncContainerRegistry/run.ps1"

Step 3 - Long-running job with the filter as parameter:

    param($Timer)

    # Default to all subscriptions when the script runs without a filter
    if (-not $SubscriptionFilter) { $SubscriptionFilter = "*" }

    $subscriptions = Get-AzSubscription | Where-Object { $_.Name -like $SubscriptionFilter }
    foreach ($subscription in $subscriptions) {
        Set-AzContext -SubscriptionId $subscription.Id | Out-Null
        # ... do the work
    }

Targeting a subset of subscriptions

    # Sync all subscriptions (default: omit the body)
    curl -s -X POST "${FUNCTION_APP_URL}/api/SyncContainerRegistryHttpTrigger" \
      -H "Authorization: Bearer ${TOKEN}" \
      -H "Content-Type: application/json"

    # Sync only subscriptions matching a pattern
    curl -s -X POST "${FUNCTION_APP_URL}/api/SyncContainerRegistryHttpTrigger" \
      -H "Authorization: Bearer ${TOKEN}" \
      -H "Content-Type: application/json" \
      -d '{"subscriptionFilter": "*project-alpha*"}'

PowerShell's -like operator uses * as a wildcard anywhere in the string. The pattern *project-alpha* matches sub-mycompany-project-alpha-prd, sub-mycompany-project-alpha-dev, etc. A pattern without a leading * only matches from the start of the string, so keep this in mind when naming subscriptions.

Pushing a Message Directly via PowerShell

You can also push a message straight to the queue without going through the HTTP trigger. This is useful for testing, scripting, or bypassing the auth layer in a controlled environment.
    Connect-AzAccount   # or -Identity for a Managed Identity context

    $storageAccount = "<your-storage-account>"
    $queueName      = "sync-job-queue"

    # Build the payload, same shape the HTTP trigger produces
    $payload = @{
        triggeredBy        = $env:USERNAME
        triggeredAt        = (Get-Date -Format 'o')
        subscriptionFilter = "*project-alpha*"   # or "*" for all
    } | ConvertTo-Json -Compress

    # Get a queue client via the connected account (no key needed)
    $ctx   = New-AzStorageContext -StorageAccountName $storageAccount -UseConnectedAccount
    $queue = Get-AzStorageQueue -Name $queueName -Context $ctx

    # Assumption to verify: the Functions queue trigger expects Base64-encoded
    # messages by default (host.json messageEncoding); if your host does,
    # Base64-encode $payload before sending.
    $queue.QueueClient.SendMessage($payload)

-UseConnectedAccount authenticates via the current Connect-AzAccount session. No storage key is required, as long as your identity has the Storage Queue Data Message Sender role on the storage account.

The Queue Message

The HTTP trigger packages the caller identity and filter into a simple JSON payload before enqueuing. The Queue Trigger reads it back as a deserialised PowerShell object, so no manual JSON parsing is needed.

    {
      "triggeredBy": "user@company.com",
      "triggeredAt": "2026-06-01T11:03:55.570+02:00",
      "subscriptionFilter": "*project-alpha*"
    }

Design Decisions at a Glance

Decision | Choice | Why
Async execution | Azure Storage Queue | HTTP trigger has a hard 230s timeout. The sync job takes 2-10 minutes. The queue decouples acceptance from execution and gives us retry for free.
Authentication | EasyAuth + App Role | No credentials in code. Access is controlled via Entra ID app roles, revocable per user without touching infrastructure.
Azure identity | Managed Identity | No secrets to rotate or store. The Function App authenticates to Azure APIs using its platform-assigned identity.
Job parameter | Wildcard filter via queue payload | Lets callers target any subscription subset without code changes. The filter travels through the queue; the Queue Trigger just passes it along.
Hosting plan | Dedicated (P-series) | Consumption plan caps function execution at 10 minutes.
A Dedicated plan has no execution time limit, which is essential when the job can run longer.

See you in the Cloud
Jamesdld

Introducing Azure Container Apps Sandboxes: Secure Infrastructure for Agentic Workloads
Today we are announcing the public preview of Azure Container Apps Sandboxes - a new first-class resource type that gives you fast, secure, ephemeral compute environments with built-in suspend and resume. This is the underlying infrastructure on which products like Cloud sandboxes in GitHub Copilot, Foundry Hosted Agents, and Azure Container Apps Express are built, and you now have the opportunity to build your own solutions on the same infrastructure.

Azure Container Apps Sandboxes unlocks two massive opportunities. For platform developers and ISVs, sandboxes give you the same isolated compute fabric that powers many Microsoft products. You get the building blocks to create your own multi-tenant platform on proven, enterprise-scale infrastructure. For AI agents, sandboxes become a self-configurable tool that lets agents extend their own capabilities on the fly. An agent can spin up a fresh sandbox in milliseconds and use it to execute untrusted code, compile source, test HTTP requests against a live app, launch a browser session, or tackle whatever needs quick and scalable infrastructure.

On one side it empowers humans to build platforms, on the other it empowers agents to build their own capabilities. Both get enterprise-grade isolation, instant startup, and snapshot-based persistence out of the box.

We'll walk through the resource model, sandbox lifecycle, the features that set Sandboxes apart - like snapshots, lifecycle policies, network egress controls, volumes, and managed identities - and show you how to get started with the portal and CLI.

What Are Container Apps Sandboxes?

Container Apps Sandboxes are secure, isolated compute environments that start in sub-second time, scale to thousands, and cost nothing when idle. Each sandbox runs in its own hardware-isolated microVM boundary - fully separated from the host, the platform, and every other sandbox.
You bring your own Open Container Initiative (OCI) image, and Sandboxes handle the rest: provisioning from prewarmed pools, strong multi-tenant isolation, and snapshot-based suspend/resume that preserves full memory and disk state across sessions. There are many ways Sandboxes can help you build your next project - here are a few: Your own build & test systems - wire a Sandbox into your CI/CD flow to run builds while your laptop stays cool. Agents that can run anything safely - an agent spawns a sandbox, executes work inside it, and returns the output with no agent host privileges required. Agent swarms - decompose a research question, spawn N sandbox workers in parallel (each pinned to its own image and egress policy), and synthesize the result. Early access customers are already unlocking significant benefits by leveraging Azure Container Apps Sandboxes. "With Azure Container Apps sandboxes, SitecoreAI can safely enable agents to take real action. The combination of multi-tenant isolation, rapid scale-out, and full automation allows Sitecore to run long-lived, autonomous agents that securely execute code, manage workflows, and interact with enterprise systems within secure, governed environments. With this foundation, we can build agents that do real work: assembling content, personalizing experiences, and optimizing campaigns in production. Agents that operate continuously, learn from results, and improve over time, so our customers get better outcomes without giving up control." - Mo Cherif, VP of AI and Innovation, Sitecore "We got early access to Azure Container Apps Sandboxes, and got the first prototype integrated with Atlas AI in hours, and it's already shaping a new Atlas AI capability that we plan to launch in preview in Q3. It gives every Atlas AI agent a safe, sandboxed workspace (file system, terminal, code execution) on a customer's live data in Cognite Data Fusion. 
The value: Industrial process, reliability, and production engineers spend days and weeks on questions like "which wells are underperforming and why?" These questions are tractable but expensive, so they are asked rarely and decisions are made on gut feel. With this, an agent pulls the data, runs the analysis, cross-references maintenance and inspection records, and returns a cited draft in minutes. Sandboxes make it practical: Aligned feature set, per-customer isolation, pause/resume across multi-day investigations, scale-to-zero economics." - Kelvin Sundli, Product manager, Atlas AI, Cognite Resource Model: Sandbox Groups and Sandboxes The top-level ARM resource is Microsoft.App/SandboxGroups. A Sandbox Group is the management boundary for a collection of sandboxes that share configuration - think of it like a Container Apps Environment, but purpose-built for sandboxes. When you create a Sandbox Group, you specify: Subscription, Resource Group, and Region Sandbox defaults (optional): default CPU, memory, disk, max sandbox count, and default idle timeout Networking: optionally deploy into a custom VNet with a dedicated subnet for private networking Identity: System or user assigned Entra identity. Individual sandboxes are created within a Sandbox Group. Each sandbox has its own source (disk image or snapshot), resource tier, lifecycle policy, network egress policy, environment variables, ports, volumes, and connections. 
Sandbox Lifecycle

Sandboxes have a well-defined lifecycle with the following states:

State | Description
Creating | Provisioning the sandbox from a disk image or snapshot
Running | Actively executing - backed by a live microVM
Idle | System-suspended after inactivity; can auto-resume on the next request
Suspended | Full state (memory + disk) preserved as a snapshot; no compute costs
Resuming | Restoring from a suspended or idle state - sub-second for most workloads
Stopped | User-initiated stop; can be resumed
Stopping | Graceful shutdown in progress
Deleting | Teardown in progress

The key insight here is the distinction between Idle and Suspended. When a sandbox goes idle (e.g., no traffic for a configured timeout), the system can automatically suspend it and capture a snapshot. When a new request arrives, the sandbox resumes transparently. This gives you scale-to-zero economics with stateful compute - something that wasn't possible before without significant custom engineering.

Disk Images: Bring Your Own Container

Sandboxes boot from Disk Images - Open Container Initiative (OCI) images converted into an optimized root filesystem format. You point to any OCI image (public or private registry), and the platform builds a bootable disk image from it. You can start with public, pre-built images maintained by the platform (for example, Ubuntu base images), or bring your own private images. For private registries, you can authenticate with username/token or use a user-assigned managed identity for Azure Container Registry (ACR), integrated with Azure as you would expect.

Snapshots: Full-State Persistence

Snapshots capture the complete state of a running sandbox - memory, disk, and all running processes. When you resume a sandbox from a snapshot, every process, open file handle, and in-memory data structure is restored exactly as it was.

A snapshot captures the full state of a running sandbox: memory pages, disk, processes.
Two ways to make one - automatically on suspend, or manually on demand. Three things they're great for:

- Checkpointing mid-task so a long-running agent can resume exactly where it left off
- Cloning an environment that's already warm - dependencies installed, caches populated, services running
- Shipping a "ready-to-go" state that resumes in sub-second time instead of cold-booting

Snapshots are free during the preview, after which they will be billed as Azure Blob Storage at standard rates. Each snapshot records the source sandbox, resource allocation (CPU, memory, disk), and container metadata - so what you get back is exactly what you snapshotted.

Resource Tiers

Every sandbox is assigned to a resource tier that determines its CPU, memory, and disk allocation:

Tier | CPU | Memory | Disk
XS | 0.25 vCPU | 0.5 GB | 5 GB
S | 0.5 vCPU | 1 GB | 10 GB
M (default) | 1 vCPU | 2 GB | 20 GB
L | 2 vCPU | 4 GB | 40 GB
XL | 4 vCPU | 8 GB | 80 GB

When creating a sandbox from a snapshot, the resource tier is inherited from the snapshot and cannot be changed - this ensures the restored environment has the exact resources it was running with when the snapshot was taken.

Lifecycle Policies: Auto-Suspend and Auto-Delete

Every sandbox can be configured with lifecycle policies that automate state transitions and cleanup:

Auto-Suspend
- Idle timeout: How long a sandbox can sit idle before being suspended (configurable: 1m, 2m, 5m, 10m, 30m, 60m)
- Suspend mode:
  - Disk + Memory (default): Full snapshot including memory state - resume picks up exactly where you left off, with all processes and in-memory data intact.
  - Disk: Only the disk is preserved; the VM restarts fresh on resume. Useful when you only need file persistence, not process continuity.

Auto-Delete
- Automatically delete sandboxes after a configurable number of days of inactivity
- Prevents accumulation of abandoned sandboxes that consume snapshot storage

These lifecycle policies are what make Sandboxes economically viable at scale.
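The auto-suspend and auto-delete rules can be pictured as a small decision function. The Python sketch below is purely illustrative and is not the platform's implementation or API: `LifecyclePolicy`, `next_action`, the default values, and the rule that auto-delete takes precedence over suspend are all assumptions made for the example. Only the allowed idle timeouts and the two suspend modes come from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Idle timeouts listed in the post (in minutes)
ALLOWED_IDLE_MINUTES = {1, 2, 5, 10, 30, 60}


@dataclass
class LifecyclePolicy:
    """Hypothetical model of a sandbox lifecycle policy (defaults are assumptions)."""
    idle_timeout: timedelta = timedelta(minutes=5)
    suspend_mode: str = "disk+memory"            # or "disk"
    auto_delete_after: Optional[timedelta] = timedelta(days=30)

    def __post_init__(self) -> None:
        if self.idle_timeout.total_seconds() / 60 not in ALLOWED_IDLE_MINUTES:
            raise ValueError("idle timeout must be one of 1m, 2m, 5m, 10m, 30m, 60m")
        if self.suspend_mode not in ("disk+memory", "disk"):
            raise ValueError("suspend mode must be 'disk+memory' or 'disk'")


def next_action(state: str, last_activity: datetime, now: datetime,
                policy: LifecyclePolicy) -> str:
    """Return the action the policy would take for a sandbox right now."""
    inactive = now - last_activity
    # Assumption: cleanup wins over suspend once the inactivity window has passed
    if policy.auto_delete_after is not None and inactive >= policy.auto_delete_after:
        return "delete"
    if state == "Running" and inactive >= policy.idle_timeout:
        return f"suspend ({policy.suspend_mode})"
    return "none"
```

With a 1-minute idle timeout, a sandbox that has been quiet for two minutes would be suspended with its memory and disk captured, which is the configuration the multi-tenant scenario relies on.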
A platform serving thousands of tenants can configure aggressive idle timeouts (say, 60 seconds) with Disk + Memory suspend mode, and each tenant's sandbox disappears from the billing meter almost immediately - but resumes in sub-second time the moment they return.

Network Egress Policy

For scenarios involving untrusted code - AI agents executing LLM-generated scripts, multi-tenant SaaS with user-submitted workloads - controlling outbound network access is critical. Sandboxes provide a per-sandbox Network Egress Policy:

- Default action: Allow or Deny all outbound traffic
- Host rules: Domain-pattern rules (e.g., *.github.com → Allow) to permit specific destinations
- Custom CIDR rules: Network-level rules for IP ranges (e.g., 10.0.0.0/8 → Deny)
- Skip egress proxy: Option to bypass the egress proxy entirely when custom VNet routing handles policy enforcement

This means you can run a sandbox in a deny-by-default posture and allowlist only the specific endpoints it needs (your API server, a package registry, etc.) - without setting up NSGs or firewall appliances.

Managed Volumes: Persistent and Shared Storage

Sandboxes support two types of mountable volumes, both managed by Microsoft:

Volume Type | Backed By | Best For
Managed Azure Blob | Azure Blob Storage | Shared data across sandboxes, file uploads/downloads, persistent artifacts
Managed Data Disk | Azure Disk Storage | High-performance storage for databases, build caches, large working sets - only available to one sandbox at a time

Blob volumes come with a built-in file explorer in the portal - you can browse, upload, download, create folders, and drag-and-drop files directly. Data Disk volumes provide dedicated block storage with configurable sizes.

Secrets and Identity

Secrets

Sandbox Groups support key-value secrets scoped to the group. Secrets can be created, edited, and referenced by sandboxes within the group.
These secrets can be used in egress policies to modify requests with transform or header-injection rules, without exposing the secrets to code running inside the sandbox.

Managed Identity

Sandbox Groups support both system-assigned and user-assigned managed identities, with full RBAC role assignment management. This means your sandboxes can authenticate to Azure services (Key Vault, Storage, Cosmos DB, etc.) without managing credentials - the same identity model you use everywhere else in Azure.

MCP Connectors and Triggers

ACA Sandboxes now supports managed connectors through the Model Context Protocol (MCP), giving sandboxes access to external APIs - including Microsoft 365, Salesforce, ServiceNow, GitHub, and 1,400+ other systems - without managing credentials directly. Attach a Connector Gateway to your sandbox group, and every sandbox in the group can call external APIs through a standardized MCP interface at runtime.

Pair connectors with triggers to build event-driven automation: route an Outlook email to a sandbox that triages it with an AI agent, or react to a SharePoint file upload by extracting and processing the document, all without writing glue code. Triggers can fire a shell command inside a sandbox or invoke an HTTP endpoint the sandbox exposes, so your automation shapes fit naturally around your workload.

The integration is built on the new Connector Namespace service (az connector-namespace), the same runtime behind Logic Apps and Power Platform connectors, now available as a programmable layer for sandboxes. See the end-to-end samples for runnable azd up-deployable examples covering email triage and document automation scenarios.

The Portal Experience

Azure Container Apps Sandboxes are only available in the new Azure Container Apps portal, which provides a rich, IDE-like experience for working with sandboxes.
Creating a Sandbox

The portal offers multiple creation paths:

- Standard Sandbox - full configuration control over source, resources, lifecycle, networking, and volumes
- GitHub Copilot Sandbox - preset, Copilot CLI ready to go, GitHub credentials can be wired through the Access Token before the sandbox is created
- Claude Sandbox - Claude CLI pre-installed, ready for agentic coding inside the sandbox

Using Coding Agents (Copilot CLI / Claude Code)

If you live inside Copilot CLI or Claude Code, you don't need to learn a new CLI. Install the azure-sandbox skill once and your agent picks up the right skills:

    # GitHub Copilot CLI
    # Add as a plugin marketplace
    /plugin marketplace add microsoft/azure-container-apps
    # Install all skills
    /plugin install sandboxes@Azure-Container-Apps

    # Claude Code
    claude plugin add microsoft/azure-container-apps

The skill runs prerequisite checks silently (az --version, az account show, node --version, aca --version), prompts only if something's missing, and maps natural-language asks to the right aca commands. Bundled runbooks cover Copilot CLI BYOK (bring your own Azure OpenAI key), the deploy-a-web-app walkthrough, and shell setup.

Sandbox Detail Page

Once your sandbox is running, the detail page gives you immediate access to the sandbox terminal and additional details, such as:

- Network Audit - real-time egress traffic log showing allowed and denied requests
- Monitor - live CPU, memory, disk, and network utilization charts
- Connectors - attached connections with an "Add" action
- Volumes - mounted volumes with an "Add" action
- Log Stream - streaming container logs
- Processes - running process list inside the sandbox
- Files - file explorer to browse the sandbox filesystem

The toolbar actions let you manage the state of the sandbox - Resume or Stop.
In the Ellipsis menu (...) you can find additional settings to manage network Egress Policy and ingress (Add port), take a Snapshot of the sandbox, Commit (save disk state as a new disk image), set Lifecycle Policy or permanently Delete the sandbox. Finally, you can see additional Details in a side panel.

Getting Started with the CLI and Python SDK

All sandbox and sandbox-group operations go through the aca CLI. There are no az containerapp sandbox commands - az is only used for az login, az account show, and resource-group management.

Install (CLI)

    # Mac, Linux
    curl -fsSL https://aka.ms/aca-cli-install | sh

    # Windows
    irm https://aka.ms/aca-cli-install-ps | iex

Run aca --help to get started.

Install (Python SDK)

    pip install azure-containerapps-sandbox

For more details, a quick start, and examples for the ACA CLI and Python SDK, please go to https://sandboxes.azure.com

Evolution from Dynamic Sessions

If you've used Azure Container Apps Dynamic Sessions, Sandboxes are the next evolution of that capability. Everything Sessions can do, Sandboxes can do - and significantly more:

Capability | Dynamic Sessions | Sandboxes
Sub-second startup | Yes | Yes
Strong isolation | Yes | Yes
Custom container images | Yes | Yes
Custom VNet integration | Yes (Partial) | Yes
Suspend/resume with Memory and Disk snapshots | - | Yes
Lifecycle policies (auto-suspend, auto-delete) | - | Yes
Network egress policy (per-sandbox) | - | Yes
Persistent managed volumes (Blob, Data Disk) | - | Yes
Managed identity (system + user-assigned) | - | Yes
Secrets management | - | Yes
Configurable resource tiers | - | Yes
Direct access to sandbox in Portal experience | - | Yes

We will continue to support Dynamic Sessions, but all new investment goes into Sandboxes. If you're building new workloads on isolated ephemeral compute, start with Sandboxes.

How It All Fits Together

ACA Sandboxes is a platform primitive. It's the foundation on which multiple Microsoft products are already built - including ACA Express, Cloud sandboxes in GitHub Copilot, and Foundry Hosted Agents.
When you build on Sandboxes, you're building on the same infrastructure that powers Microsoft's own portfolio. This is the evolution of what we shared with Project Legion in 2024. Legion described the internal infrastructure; Sandboxes exposes it as a customer-facing primitive that you can use directly.

What's Next

- Deeper Azure integrations - first-class connectivity with Azure networking, identity, storage, and AI services
- Enhanced SDK and CLI - richer programmatic experiences for managing sandboxes at scale
- More Microsoft services built on Sandboxes - this is just the beginning

Get Started Today

- Portal: https://sandboxes.azure.com/
- Documentation: Azure Container Apps Sandboxes
- Pricing: Azure Container Apps Pricing (per-second vCPU/memory billing, scale-to-zero, snapshots at Blob Storage rates)

We'd love to hear your feedback. You can ask questions or file issues on the Azure Container Apps GitHub (prefix with [Sandbox] for Sandboxes-specific issues).

Introducing the Azure Functions serverless agents runtime (preview)
We're thrilled to announce the Azure Functions serverless agents runtime, now in public preview. It brings a new, markdown-first programming model for building AI agents as a first-class workload on Azure Functions, with the event-driven triggers, scale-to-zero economics, and operational integrations you know and love from the platform.

A few things you could build in a matter of minutes:

- A daily briefing agent that wakes up on a timer, scours the web, and drops a summary in your Outlook inbox every morning.
- A Teams chat agent that triggers on every message and answers your team's questions, looking up data across your connected systems.
- An on-call troubleshooting agent that investigates incidents by querying logs in Azure Data Explorer and reports back what it found.

Each one is a single markdown file with instructions plus a trigger. It is deployed like any other function app and runs on the Flex Consumption plan.

Why a serverless agents runtime

Building production agents today usually means stitching together a framework, a hosting layer, message queues, identity, secrets, observability, and a long list of per-service integrations. Most of that work is plumbing, not the agent.

Azure Functions has spent years making event-driven compute simple: declare a trigger, write the handler, get autoscale and managed identity for free. The serverless agents runtime applies that same model to agents:

- Agents are the unit of work. You define behavior in natural language, not boilerplate.
- Trigger agents from almost any event. HTTP requests, timers, queues, database changes, Teams messages, Outlook mail, and more.
- Tools, MCP servers, connectors, and sandboxed execution are declared, not coded.
- Deploy and operate like any function app. Flex Consumption for scale-to-zero and per-second billing, managed identity, VNet integration, Application Insights, and the same deployment tools you already use.

Markdown-first: what an agent looks like

An agent is a .agent.md file.
Your app can have multiple agents, each with its own metadata that declares the trigger. The markdown body becomes the agent's instructions. Here's a timer-triggered agent that summarizes the day's tech news and emails it:

    ---
    name: Daily Tech News Email
    description: Fetches top tech news and emails a summary daily.
    trigger:
      type: timer_trigger
      args:
        schedule: "0 0 15 * * *"
    ---
    You are a news assistant. When triggered, do the following:

    1. Scour the web for today's top tech news headlines. Use reputable sources; include links to the original articles.
    2. Summarize the top stories in a concise, well-formatted HTML email body.
    3. Email the summary to $TO_EMAIL with the subject "Daily Tech News Summary" followed by today's date.

That's the whole function. Drop the file into your app, deploy, and it runs on the schedule. No framework wiring, no service-specific integration code.

Your agents can share configuration and capabilities through a few files alongside the agent definitions. agents.config.yaml declares system tools and the default model. mcp.json lists the MCP servers your agents can call, including MCP-enabled Azure connections. A /tools folder holds custom Python tools and a /skills folder holds reusable prompt fragments. Everything here is optional and available to every agent automatically when present.

In this example, the agent uses a Container Apps dynamic session to browse the web with Playwright, and a Microsoft Office 365 connection (exposed as an MCP server) to send the email:

    # agents.config.yaml
    system_tools:
      dynamic_sessions_code_interpreter:
        endpoint: $ACA_SESSION_POOL_ENDPOINT
    model: $AZURE_OPENAI_DEPLOYMENT

mcp.json:

    {
      "servers": {
        "office365": {
          "type": "http",
          "url": "$MICROSOFT_365_CONNECTION_MCP_ENDPOINT",
          "auth": { "scope": "https://apihub.azure.com/.default" }
        }
      }
    }

The function app's managed identity authenticates to the connection's MCP endpoint, so there are no secrets to manage.
Any Azure connector that supports MCP, or any remote MCP server, can be added the same way. Any of these global settings can be overridden per agent in the agent's metadata. What you get in the preview Triggers across the Azure Functions catalog. HTTP, Timer, Queue, Service Bus, Event Hubs, Cosmos DB, Blob, Event Grid, plus new connection-backed triggers like Teams messages, Outlook mail, and calendar events. 1,400+ Azure connectors as tools. Create a connection, enable its MCP endpoint, and an agent can send mail, post to Teams, create records, query data, all without integration code or auth plumbing. Any remote MCP server as tools. Use any remote MCP server. Sandboxed code and browser automation. Run code or a Playwright-powered browser in Azure Container Apps dynamic sessions, isolated per agent session. Built-in chat UI, HTTP API, and MCP server endpoint with no extra code. Custom Python tools in a tools/ folder and reusable skills in a skills/ folder, shared across agents. Pluggable model providers. Microsoft Foundry, Azure OpenAI, and OpenAI out of the box. Where this fits The serverless agents runtime is designed for the agents most enterprises actually need to build: Scheduled background agents that summarize, monitor, or reconcile on a timer. Event-driven assistants that react to messages, emails, alerts, and database changes. Cross-system agents that tie multiple SaaS and enterprise apps together through connections. Trigger with a Teams message, look up the customer in Salesforce, send an email, and update a database record, all from one agent. Conversational front-ends that pair an HTTP or chat-UI entry point with the same agents your event triggers invoke. Agents as MCP servers that other agents and MCP clients can integrate with directly. We want your feedback The serverless agents runtime is in public preview, and we're actively building it out with input from real customer workloads. 
Tell us what you build, what's missing, and where the model should go next.

Get started

Docs: aka.ms/azure-functions-agents-docs

Building agents on Azure Functions has never been easier. We can't wait to see what you create with the serverless agents runtime!

Azure Functions MCP Extension: What's New at Build 2026
The Azure Functions MCP extension has had a breakout year! Since its initial preview, the extension has grown from a single trigger type into a full-featured platform for building remote MCP servers: with tool, resource, and prompt triggers across multiple languages, MCP Apps for interactive UIs, built-in MCP authentication, and feature enhancements. Here's what's new and what it means for developers building MCP servers on Azure Functions. The full MCP primitive set: Tools, resources, and prompts When the MCP extension first shipped, it supported tool triggers. Declare a function as an MCP tool, and any MCP client can discover and call it. That was the starting point. Since then, we've shipped the remaining MCP primitives: Resource triggers: expose a function as an MCP resource. Prompt triggers: expose a function as an MCP prompt, letting clients request structured prompt templates from your server. Like tool triggers, resource and prompt triggers are supported in multiple languages including .NET, Java, Python, TypeScript, and JavaScript. MCP Apps: interactive UI from your MCP server MCP Apps let your tools return interactive user interfaces instead of plain text. Combine tool triggers with resource triggers, and your MCP server can serve rich, rendered experiences to MCP-aware clients. The Azure Functions MCP extension supports MCP Apps natively, meaning the same function app that exposes tools and resources can also serve UI components. The launch blog post on the Azure Apps Blog walked through the pattern in detail. For .NET developers, the new fluent builder API (available in the latest NuGet release) makes it easier to compose MCP Apps by chaining tool and resource definitions in a declarative style. MCP authentication The extension supports built-in MCP authentication, implementing the requirements of the MCP auth spec. All samples in the aka.ms/remote-mcp repo enable built-in MCP auth by default with Microsoft Entra ID as the identity provider. 
Samples have also been updated to demonstrate how to exchange tokens in the On-Behalf-Of (OBO) flow, so your MCP tools can access downstream APIs using the invoking user's identity.

Auth configuration in the Azure portal: New in preview at Build is a one-click experience in the Azure portal for configuring built-in MCP auth. No more manually creating an app registration, configuring it, and wiring it to the server. Just open your server app in the portal and click to enable MCP auth. Try it out!

Feature enhancements

Beyond the headline primitives and auth, the extension has shipped a steady stream of capabilities over the past few months. The following are the notable additions.

Structured content

Structured content lets you return machine-readable JSON metadata alongside your tool's response via the `structuredContent` field. Clients that support it can consume the data programmatically (for example, parse fields, render tables, or drive downstream logic) rather than just displaying text. Clients that don't support it still get the regular content blocks as a fallback.

Rich content types

Tools aren't limited to returning plain text. The extension supports the full set of MCP content block types (`TextContent`, `ImageContent`, `AudioContent`, `ResourceLink`, and `EmbeddedResource`), so your tools can return images, audio clips, references to resources, and inline file content alongside text.

Input and output schemas

`WithInputSchema` and `WithOutputSchema` give you explicit control over the JSON schemas advertised for your tools. This is especially useful when the auto-generated schema from function parameters doesn't capture the full contract, for example when your tool accepts a complex nested object or returns a specific shape that clients depend on. Input and output schemas are currently supported in .NET, with support for other languages coming soon.
    builder.ConfigureMcpTool("SearchDocs")
        .WithOutputSchema("""
            {
              "type": "object",
              "properties": {
                "results": { "type": "array", "items": { "type": "string" } },
                "query": { "type": "string" }
              },
              "required": ["results", "query"]
            }
            """);

Fluent configuration APIs in .NET

A set of fluent builder APIs lets you configure MCP primitives declaratively in `Program.cs`:

ConfigureMcpTool: add properties, metadata, input/output schemas, or promote a tool to an MCP App
ConfigureMcpResource: attach metadata to resources
ConfigureMcpPrompt: define prompt arguments and metadata

    builder.ConfigureMcpTool("sayhello")
        .WithProperty("name", McpToolPropertyType.String, "Name of the user", required: true)
        .WithMetadata("ui", new { resourceUri = "ui://index.html" });

What's next

Usage of the MCP extension has grown steadily since its preview launch. Tool execution volume has increased 15x over the past several months as more customers move from experimentation to production. As adoption grows, so do the expectations. Developers building production MCP servers are hitting real friction around auth complexity, client configuration, and observability. We're continuing to invest in the extension to address these gaps and help customers be more successful building and hosting MCP servers on Azure Functions. Here's where we're focusing next.

Continued auth simplification

Auth remains the biggest barrier to getting an MCP server into production. We'll work on:

Smoother client setup: making it easier to connect any MCP client to an authenticated Azure Functions MCP server, not just VS Code.
Simplified OBO flow: streamlining the experience of On-Behalf-Of authentication so developers can delegate user identity to downstream services with less configuration.

Our goal: the secure path should be the easy path.

Deeper integration with Microsoft Foundry

We'll build tighter integration between Azure Functions MCP servers and Microsoft Foundry.
This includes surfacing MCP servers in Foundry Toolbox, a new feature introduced to help Foundry agents discover and consume tools from a single endpoint. Developers will be able to publish an MCP server from Functions and have it available to Foundry agents through Toolbox without manual endpoint configuration.

Continued feature enhancement

We prioritize based on community feedback raised in our GitHub repo. For example, support for streaming output and pagination are top items in our backlog today based on user demand. We also track the MCP spec's evolution closely and will continue shipping support for strategic features as they land. Examples of proposals we're following:

MCP Tasks: the Tasks extension (SEP-2663) defines a standard pattern for async, long-running tool calls with durable task handles. This replaces hand-rolled polling patterns and aligns well with Functions' execute-and-return model.
Stateless MCP: SEP-2575 proposes removing the mandatory initialization handshake, which is a natural fit for serverless platforms like Azure Functions where fresh instances can handle any request.

Have something you'd like us to prioritize? Let us know by filing a request on GitHub.

Get started

Samples: the most up-to-date features are showcased at aka.ms/remote-mcp
Documentation: Model Context Protocol for Azure Functions
MCP Extension GitHub repo: Azure Functions MCP Extension

Announcing Go support in Azure Functions (Preview)
We're excited to announce that Azure Functions now supports Go as a first-class language, available today in public preview on the Flex Consumption plan. Go developers can now build event-driven, serverless applications using idiomatic Go, the standard toolchain they already love, and the full breadth of Azure Functions triggers, bindings, and operational capabilities.

TL;DR:

Write Functions in Go using a new code-first programming model and SDK (azure-functions-golang-worker).
Use triggers across HTTP, Timer, Service Bus, Event Hubs, Event Grid, Cosmos DB, and Blob Storage.

Why Go on Azure Functions

Go has become a default choice for cloud-native APIs, platform services, networking tools, and high-throughput integration workloads. Until now, teams that standardized on Go on Azure had to either:

Use Azure Functions through the custom handlers protocol, missing out on a first-class developer experience, or
Build and operate their own serving, scaling, and eventing infrastructure on containers or VMs.

With first-class Go support, those teams get the productivity of Go plus the operational leverage of serverless: automatic scaling, pay-per-use billing, integrated triggers across the Azure ecosystem, and built-in observability, without leaving the Go ecosystem.

What's in the preview

A new Go programming model and SDK: a code-first, idiomatic way to register Functions and declare triggers using functional options.
Support for popular triggers: HTTP, Timer, Service Bus (queues and topics), Event Hubs, Event Grid, Cosmos DB, and Blob Storage. More to come.
Native Go build pipeline: go build produces a single static binary that the Functions host invokes directly. No function.json, no interop shims at request time.
Integrated observability: Application Insights logging, metrics, and distributed tracing.
End-to-end tooling: local development with a preview build of Azure Functions Core Tools. Deployment via Core Tools, zip deploy, or GitHub Actions.
Flex Consumption: fast elastic scale, scale-to-zero, per-second billing, VNet integration, and always-ready instances.

A quick look at the programming model

The Go model is code-first: you register functions and declare triggers in Go, with compile-time checks and full IDE support. No separate JSON metadata to keep in sync. Here's a minimal HTTP-triggered Function:

    package main

    import (
        "fmt"
        "net/http"

        "github.com/azure/azure-functions-golang-worker/sdk"
        "github.com/azure/azure-functions-golang-worker/worker"
    )

    func main() {
        app := sdk.FunctionApp()
        app.HTTP("hello", hello,
            sdk.WithMethods("GET", "POST"),
            sdk.WithAuth("anonymous"),
        )
        worker.Start(app)
    }

    func hello(w http.ResponseWriter, r *http.Request) {
        name := r.URL.Query().Get("name")
        if name == "" {
            name = "world"
        }
        fmt.Fprintf(w, "Hello, %s!", name)
    }

Notice the HTTP handler is a plain http.HandlerFunc, the same signature you'd use with net/http or any Go web framework. There's nothing Functions-specific to learn at the handler level.

Registering other triggers

The same pattern works across triggers.
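The registration calls in this post share one shape: a function name, a handler, and a list of option values such as sdk.WithMethods or sdk.WithSchedule. As a rough illustration of how that functional-options style works in plain Go, here is a minimal sketch. It does not use or reproduce the SDK's internals; the names triggerConfig, Option, WithSchedule, WithConnection, and newTrigger are hypothetical and exist only for this example.

```go
package main

import "fmt"

// triggerConfig is a hypothetical stand-in for the settings a trigger
// registration might collect. It is not a type from the Functions SDK.
type triggerConfig struct {
	name       string
	schedule   string
	connection string
}

// Option mutates a triggerConfig. Each WithX helper returns one.
type Option func(*triggerConfig)

// WithSchedule sets a cron-style schedule expression.
func WithSchedule(expr string) Option {
	return func(c *triggerConfig) { c.schedule = expr }
}

// WithConnection names the app setting that holds a connection string.
func WithConnection(setting string) Option {
	return func(c *triggerConfig) { c.connection = setting }
}

// newTrigger applies options in order, so later options override earlier ones.
func newTrigger(name string, opts ...Option) triggerConfig {
	cfg := triggerConfig{name: name}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := newTrigger("cleanup",
		WithSchedule("*/10 * * * * *"),
		WithConnection("AzureWebJobsStorage"),
	)
	fmt.Printf("%+v\n", cfg)
}
```

The appeal of this style is that a registration call only mentions the settings it cares about, and new options can be added later without changing existing call sites.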
Non-HTTP handlers take a context.Context plus a typed payload:

    import (
        "context"
        "log"

        "github.com/azure/azure-functions-golang-worker/sdk"
        "github.com/azure/azure-functions-golang-worker/sdk/bindings"
    )

    // Timer: runs every 10 seconds
    func onTimer(ctx context.Context, t bindings.TimerInfo) error {
        log.Printf("timer fired; past due=%v", t.IsPastDue)
        return nil
    }

    app.Timer("cleanup", onTimer,
        sdk.WithSchedule("*/10 * * * * *"),
    )

    // Service Bus queue
    func onOrder(ctx context.Context, msg bindings.ServiceBusMessage) error {
        log.Printf("order %s: %s", msg.MessageId, string(msg.Body))
        return nil
    }

    app.ServiceBusQueue("processOrder", onOrder,
        sdk.WithQueueName("orders"),
        sdk.WithConnection("ServiceBusConnection"),
    )

    // Event Hubs
    func onEvent(ctx context.Context, e bindings.EventHubMessage) error {
        log.Printf("event seq=%d body=%s", e.SequenceNumber, string(e.Body))
        return nil
    }

    app.EventHub("ingest", onEvent,
        sdk.WithEventHubName("telemetry"),
        sdk.WithConnection("EventHubConnection"),
    )

    // Cosmos DB change feed
    func onChange(ctx context.Context, docs []bindings.CosmosDocument) error {
        for _, d := range docs {
            log.Printf("doc %s: %s", d.ID, string(d.Data))
        }
        return nil
    }

    app.CosmosDB("onChange", onChange,
        sdk.WithDatabase("ToDoList"),
        sdk.WithContainer("Items"),
        sdk.WithConnection("CosmosDBConnection"),
    )

    // Event Grid
    func onGridEvent(ctx context.Context, e bindings.EventGridEvent) error {
        log.Printf("%s: %s", e.EventType, e.Subject)
        return nil
    }

    app.EventGrid("onEvent", onGridEvent)

Extension triggers: real Azure SDK clients

For triggers like Blob Storage, the Go SDK injects a fully-typed Azure SDK client directly into your handler.
You opt in with a blank import, so your binary only includes the extensions you actually use:

    import (
        "context"
        "io"
        "log"

        "github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/blob"
        "github.com/azure/azure-functions-golang-worker/sdk"
        _ "github.com/azure/azure-functions-golang-worker/triggers/blob" // registers blob trigger client factory
        "github.com/azure/azure-functions-golang-worker/worker"
    )

    func onUpload(ctx context.Context, client *blob.Client) error {
        log.Printf("blob: %s", client.URL())
        resp, err := client.DownloadStream(ctx, nil)
        if err != nil {
            return err
        }
        defer resp.Body.Close()
        data, err := io.ReadAll(resp.Body)
        if err != nil {
            return err
        }
        log.Printf("size=%d", len(data))
        return nil
    }

    app.Blob("onUpload", onUpload,
        sdk.WithPath("uploads/{name}"),
        sdk.WithConnection("AzureWebJobsStorage"),
        sdk.WithSource("EventGrid"),
    )

The handler receives a *blob.Client from github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/blob, the same client you'd use in any other Go app. Because it's a real SDK client, you can DownloadStream blobs of any size without buffering the whole payload through the worker. Dependencies stay isolated per extension, so apps that don't use Blob never pull in azblob or azidentity.

Project layout

A Go function app is just a regular Go module plus the standard Functions config files:

    my-function-app/
    ├── host.json
    ├── local.settings.json
    ├── go.mod
    ├── go.sum
    └── main.go

No function.json and no generated metadata. Triggers are declared in main.go. go build, go test, and go mod tidy all just work.

Get started

Quickstart: install the preview tooling, scaffold your first Go function app, run it locally, and deploy to Flex Consumption.

We can't wait to see what you build. Welcome to Functions, Gophers.

Introducing On-demand Sandboxes for Azure Durable Task Scheduler (Private Preview)
Maybe it needs a native toolchain. Maybe it runs untrusted customer or LLM-generated code. Maybe it needs Python from a .NET orchestrator, or bursty compute that should scale to zero when the work is done.

Today, we're thrilled to announce On-demand Sandboxes for Azure Durable Task Scheduler, now available in private preview. On-demand Sandboxes lets you move those individual workflow steps to managed, isolated compute while your orchestrator stays exactly where it is. Tell DTS which steps should run in isolation, provide a container image with the step code, and DTS handles provisioning, scaling, and teardown. No infrastructure to manage, no idle costs, no orchestrator changes.

Sign up for On-demand Sandboxes Private Preview Today

Availability: On-demand Sandboxes targets the standalone Durable Task SDKs used outside the Azure Functions host, for apps running on Azure Container Apps, Azure Kubernetes Service, App Service, or anywhere else you self-host. The private preview supports the .NET and Python Durable Task SDKs, with additional language SDKs and Azure Functions support coming soon.

What is Azure Durable Task Scheduler?

The Durable Task Scheduler is a fully managed backend for durable execution on Azure. It can serve as the backend for a Durable Function App using the Durable Functions extension, or as the backend for an app leveraging the Durable Task SDKs in other compute environments, such as Azure Container Apps, Azure Kubernetes Service, or Azure App Service. For a deeper introduction, see the Durable Task Scheduler overview or the full Durable Task documentation.

Why On-demand Sandboxes?

Most activities belong in-process. They're fast, simple, and co-located with your orchestrator. But sometimes you hit a step that doesn't fit: it needs a native binary, a different language runtime, per-invocation isolation, or bursty compute you don't want to keep warm.
On-demand Sandboxes gives you a way to handle those exceptions without spinning up dedicated infrastructure or managing scaling policies in Azure Kubernetes Service or Azure Container Apps.

Activity-level granularity. Move individual steps to managed compute, not your whole app.
Per-activity or per-invocation isolation. Each execution runs in a clean, microVM-backed sandbox. Ideal for untrusted code, customer plugins, or LLM-generated logic.
Cross-runtime flexibility. Run a Python inference step from a .NET orchestrator. No compromise on either side.
Scale-to-zero. Pay for CPU and memory per second of execution, not infrastructure that waits.
No orchestrator changes. Your orchestration code and hosting model don't change at all.

Here are a few scenarios where On-demand Sandboxes shines:

Native toolchains. Package ffmpeg, LibreOffice, or Pandoc in a container without dragging them into your main app.
CPU-heavy preprocessing. OCR, layout extraction, or image processing can scale independently of the rest of your workflow.
Cross-runtime workflows. A .NET orchestrator dispatches a Python inference step. No compromises.
Sandboxed code execution. Run customer plugins or LLM-generated code with a clean boundary on every invocation.
Multi-tenant isolation. Tenant-specific steps get dedicated boundaries while everything else stays in-process.
Bursty event-driven workloads. Steps that spike hard but rarely may not justify always-on infrastructure. Sub-second cold starts mean you get capacity when you need it without paying to keep it warm.

How it works

On-demand Sandboxes uses a two-part model: a worker profile in your orchestrator app that tells DTS which activities to offload, and a worker image that contains those activity implementations. Your orchestrator still calls activities the same way it always has; the decision to run one activity in a sandbox lives in the profile configuration.

1.
Declare a sandbox worker profile

In the app that hosts your orchestrator, define a sandbox worker profile. The profile gives DTS the container image, resource shape, concurrency setting, and activity names that should run in a sandbox:

    using Microsoft.DurableTask.Worker.AzureManaged.Sandbox;

    [SandboxWorkerProfile("code-executor")]
    internal sealed class CodeSandboxWorkerProfile : ISandboxWorkerProfile
    {
        public void Configure(SandboxOptions options)
        {
            options.ContainerImage = Environment.GetEnvironmentVariable("DTS_SANDBOX_IMAGE")
                ?? throw new InvalidOperationException("DTS_SANDBOX_IMAGE is required.");
            options.Cpu = "1000m";
            options.Memory = "2048Mi";
            options.MaxConcurrentActivities = 1;
            options.AddActivity(TaskNames.ExecuteCode);
        }
    }

Then enable on-demand sandbox discovery when you configure the Durable Task worker in the main app:

    workerBuilder.AddTasks(tasks => tasks.AddAllGeneratedTasks());
    workerBuilder.UseDurableTaskScheduler(options =>
    {
        options.EndpointAddress = Environment.GetEnvironmentVariable("DTS_ENDPOINT");
        options.TaskHubName = Environment.GetEnvironmentVariable("DTS_TASK_HUB");
        options.Credential = credential;
    });
    workerBuilder.EnableSandboxes();

Here's what the profile configuration does:

SandboxWorkerProfile: a friendly profile id for this sandbox setup. It groups the activity, image, and resource settings for monitoring and reuse across deployments.
ContainerImage: the container image (from your registry) that contains the activity implementations.
Cpu / Memory: the resource shape for each worker instance. Sized per your activity's needs.
MaxConcurrentActivities: how many activities a single worker instance can process concurrently.
AddActivity: the specific activity to offload. Only activities added to a sandbox worker profile execute in DTS-managed isolated compute; everything else stays in-process.
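The routing rule in the AddActivity description can be pictured as a simple lookup: an activity runs in a sandbox only if its name was added to a profile. The following Go sketch models that decision conceptually. It is not DTS code and says nothing about how the scheduler is actually implemented; the profile type, the route function, the image name, and the SendEmail activity are all hypothetical.

```go
package main

import "fmt"

// profile is a hypothetical model of a sandbox worker profile:
// an id, an image, and the set of activity names it offloads.
type profile struct {
	id         string
	image      string
	activities map[string]bool
}

// route reports where an activity would run under the rule described above:
// in a sandbox if some profile lists it, otherwise in-process.
func route(activity string, profiles []profile) string {
	for _, p := range profiles {
		if p.activities[activity] {
			return "sandbox:" + p.id
		}
	}
	return "in-process"
}

func main() {
	profiles := []profile{{
		id:         "code-executor",
		image:      "example.azurecr.io/code-executor:latest", // placeholder image name
		activities: map[string]bool{"ExecuteCode": true},
	}}
	fmt.Println(route("ExecuteCode", profiles))
	fmt.Println(route("SendEmail", profiles))
}
```

The point of the model is that the decision depends only on the activity name and the profile, which is why the orchestrator's call site stays the same.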
The orchestrator call site doesn't change:

    ExecuteCodeOutput execution = await context.CallActivityAsync<ExecuteCodeOutput>(
        TaskNames.ExecuteCode,
        new ExecuteCodeInput(pythonCode, input.CsvData));

ExecuteCode is not registered in the main app's in-process activity list. When the orchestrator calls it, DTS uses the code-executor profile to route the work to the sandbox image.

2. Build the worker image

The worker image is a container you own. In most apps, this worker lives in a separate project from the orchestrator host so it can have its own entry point, dependencies, and container image. It registers the activity implementations it can run and opts in to managed execution with UseSandboxWorker():

    builder.Services.AddDurableTaskWorker(workerBuilder =>
    {
        workerBuilder.AddTasks(tasks =>
        {
            tasks.AddActivity<ExecuteCodeActivity>();
        });
        workerBuilder.UseSandboxWorker();
    });

UseSandboxWorker() is the key line. It signals that this worker runs in DTS-managed compute. The sandbox worker does not need to configure the DTS endpoint, task hub, profile id, or credentials; DTS injects the runtime settings when it starts the container.

The activity implementations themselves are standard Durable Task activities. There's nothing special about the activity code: it can call a runtime with different dependencies, such as Python and pandas, while running in an isolated container instead of in your main app's process.

Package the image like any containerized service, including whatever runtimes and native tools the activity needs. Push it to your container registry (e.g., Azure Container Registry) and reference the image in the worker profile's ContainerImage option.

View logs in the DTS dashboard

Once your sandbox activities are running, you can view their execution logs directly in the Durable Task Scheduler dashboard. The dashboard shows real-time output from your managed workers, including stdout, stderr, and activity lifecycle events.
This gives you full visibility into what's happening inside the sandbox without needing to configure external log sinks or set up your own observability pipeline.

Get started

On-demand Sandboxes is in private preview. To get access, sign up here. We'll enable the feature on your scheduler and help you get your first sandbox activity running. Once you're in, the workflow is straightforward: declare a sandbox worker profile in your orchestrator app, build and push a worker image, and DTS takes care of the rest.

Sign up for On-demand Sandboxes Private Preview Today

Documentation: Durable Task Scheduler overview
Samples: Azure-Samples/Durable-Task-Scheduler
Pricing: Azure Durable Task Scheduler pricing

Questions, feedback, or ideas? Open an issue in the Durable-Task-Scheduler GitHub repo. We'd love to hear from you.