# Bulletproof agents with the durable task extension for Microsoft Agent Framework
Today, we're thrilled to announce the public preview of the durable task extension for Microsoft Agent Framework. This extension transforms how you build production-ready, resilient, and scalable AI agents by bringing the proven durable execution (survives crashes and restarts) and distributed execution (runs across multiple instances) capabilities of Azure Durable Functions directly into the Microsoft Agent Framework. Now you can deploy stateful, resilient AI agents to Azure that automatically handle session management, failure recovery, and scaling, freeing you to focus entirely on your agent logic.

Whether you're building customer service agents that maintain context across multi-day conversations, content pipelines with human-in-the-loop approval workflows, or fully automated multi-agent systems coordinating specialized AI models, the durable task extension gives you production-grade reliability, scalability, and coordination with serverless simplicity.

Key features of the durable task extension include:

- **Serverless Hosting**: Deploy agents on Azure Functions with auto-scaling from thousands of instances down to zero, while retaining full control in a serverless architecture.
- **Automatic Session Management**: Agents maintain persistent sessions with full conversation context that survive process crashes, restarts, and distributed execution across instances.
- **Deterministic Multi-Agent Orchestrations**: Coordinate specialized durable agents with predictable, repeatable, code-driven execution patterns.
- **Human-in-the-Loop with Serverless Cost Savings**: Pause for human input without consuming compute resources or incurring costs.
- **Built-in Observability with Durable Task Scheduler**: Get deep visibility into agent operations and orchestrations through the Durable Task Scheduler UI dashboard.

Click here to create and run a durable agent.

```python
# Python
endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini")

# Create an AI agent following the standard Microsoft Agent Framework pattern
agent = AzureOpenAIChatClient(
    endpoint=endpoint,
    deployment_name=deployment_name,
    credential=AzureCliCredential()
).create_agent(
    instructions="""You are a professional content writer who creates engaging,
    well-structured documents for any given topic. When given a topic, you will:
    1. Research the topic using the web search tool
    2. Generate an outline for the document
    3. Write a compelling document with proper formatting
    4. Include relevant examples and citations""",
    name="DocumentPublisher",
    tools=[
        AIFunctionFactory.Create(search_web),
        AIFunctionFactory.Create(generate_outline)
    ]
)

# Configure the function app to host the agent with durable session management
app = AgentFunctionApp(agents=[agent])
app.run()
```

```csharp
// C#
var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT");
var deploymentName = Environment.GetEnvironmentVariable("AZURE_OPENAI_DEPLOYMENT") ?? "gpt-4o-mini";

// Create an AI agent following the standard Microsoft Agent Framework pattern
AIAgent agent = new AzureOpenAIClient(new Uri(endpoint), new DefaultAzureCredential())
    .GetChatClient(deploymentName)
    .CreateAIAgent(
        instructions: """You are a professional content writer who creates engaging,
        well-structured documents for any given topic. When given a topic, you will:
        1. Research the topic using the web search tool
        2. Generate an outline for the document
        3. Write a compelling document with proper formatting
        4. Include relevant examples and citations""",
        name: "DocumentPublisher",
        tools: [
            AIFunctionFactory.Create(SearchWeb),
            AIFunctionFactory.Create(GenerateOutline)
        ]);

// Configure the function app to host the agent with durable thread management
// This automatically creates HTTP endpoints and manages state persistence
using IHost app = FunctionsApplication
    .CreateBuilder(args)
    .ConfigureFunctionsWebApplication()
    .ConfigureDurableAgents(options =>
        options.AddAIAgent(agent)
    )
    .Build();
app.Run();
```

## Why the durable task extension?

As AI agents evolve from simple chatbots to sophisticated systems handling complex, long-running tasks, new challenges emerge:

- Conversations span multiple days and weeks, requiring persistent state across process restarts, crashes, and disruptions.
- Tool calls might take longer than typical timeouts allow, needing automatic checkpointing and recovery.
- High-volume workloads require elastic scaling across distributed instances to handle thousands of concurrent agent conversations.
- Multiple specialized agents need coordination with predictable, repeatable execution for reliable business processes.
- Agents sometimes must wait for human approval before proceeding, ideally without consuming resources.

The durable task extension addresses these challenges by extending Microsoft Agent Framework with capabilities from Azure Durable Functions, enabling you to build AI agents that survive failures, scale elastically, and execute predictably through durable and distributed execution. The extension is built on four foundational value pillars, which we refer to as the 4 D's:

**Durability**: Every agent state change (messages, tool calls, decisions) is durably checkpointed automatically. Agents survive and automatically resume from infrastructure updates and crashes, and they can be unloaded from memory during long waiting periods without losing context. This is essential for agents that orchestrate long-running operations or wait for external events.

**Distributed**: Agent execution is accessible across all instances, enabling elastic scaling and automatic failover. Healthy nodes seamlessly take over work from failed instances, ensuring continuous operation. This distributed execution model allows thousands of stateful agents to scale out and run in parallel.

**Deterministic**: Agent orchestrations execute predictably using imperative logic written as ordinary code. You define the execution path, enabling automated testing, verifiable guardrails, and business-critical workflows that stakeholders can trust. This complements agent-directed workflows by providing explicit control flow when needed.

**Debuggability**: Use familiar development tools (IDEs, debuggers, breakpoints, stack traces, and unit tests) and programming languages to develop and debug. Your agents and agent orchestrations are expressed as code, making them easily testable, debuggable, and maintainable.

## Features in action

### Serverless hosting

Deploy agents to Azure Functions (with expansion to other Azure compute services coming soon) with automatic scaling to thousands of instances or down to zero when not in use. Pay only for the compute resources you consume. This code-first deployment approach gives you full control over the compute environment while maintaining the benefits of a serverless architecture.
```python
# Python
endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini")

# Create an AI agent following the standard Microsoft Agent Framework pattern
agent = AzureOpenAIChatClient(
    endpoint=endpoint,
    deployment_name=deployment_name,
    credential=AzureCliCredential()
).create_agent(
    instructions="""You are a professional content writer who creates engaging,
    well-structured documents for any given topic. When given a topic, you will:
    1. Research the topic using the web search tool
    2. Generate an outline for the document
    3. Write a compelling document with proper formatting
    4. Include relevant examples and citations""",
    name="DocumentPublisher",
    tools=[
        AIFunctionFactory.Create(search_web),
        AIFunctionFactory.Create(generate_outline)
    ]
)

# Configure the function app to host the agent with durable session management
app = AgentFunctionApp(agents=[agent])
app.run()
```

### Automatic session management

Agent sessions are automatically checkpointed in durable storage that you configure in your function app, enabling durable and distributed execution across multiple instances. Any instance can resume an agent's execution after interruptions or process failures, ensuring continuous operation.

Under the hood, agents are implemented as durable entities: stateful objects that maintain their state across executions. This architecture enables each agent session to function as a reliable, long-lived entity with preserved conversation history and context.

Example scenario: a customer service agent handling a complex support case over multiple days and weeks. The conversation history, context, and progress are preserved even if the agent is redeployed or moves to a different instance.

```bash
# First interaction - start a new thread to create a document
curl -X POST https://your-function-app.azurewebsites.net/api/agents/DocumentPublisher/threads \
  -H "Content-Type: application/json" \
  -d '{"message": "Create a document about the benefits of Azure Functions"}'

# Response includes thread ID and initial document outline/draft
# {"threadId": "doc789", "response": "I'll create a comprehensive document about Azure Functions benefits. Let me search for the latest information...
# [Document Draft] # Benefits of Azure Functions\n\n## Introduction\nAzure Functions is a serverless compute service that enables you to run event-driven code without managing infrastructure...\n\n## Cost Efficiency\n- Pay only for execution time\n- No charges for idle resources\n- Automatic scaling reduces over-provisioning...\n\n## Developer Productivity\n- Multiple language support (C#, Python, JavaScript, Java)\n- Integrated development tools and CI/CD...\n\n## Scalability\n- Automatic scaling based on demand\n- Handle millions of requests seamlessly...\n\nReferences: [Azure documentation, serverless computing studies]"}

# Second interaction - continue the same thread to refine the document
curl -X POST https://your-function-app.azurewebsites.net/api/agents/DocumentPublisher/threads/doc789 \
  -H "Content-Type: application/json" \
  -d '{"message": "Can you add a section about integration with other Azure services?"}'

# Agent remembers the Azure Functions document context and adds the requested section
# {"threadId": "doc789", "response": "I've added a comprehensive integration section to your Azure Functions document:\n\n## Integration with Azure Services\n\n### Azure Storage\nSeamless triggers and bindings for Blob Storage, Queue Storage, and Table Storage enable event-driven architectures...\n\n### Azure Event Grid & Event Hubs\nProcess real-time event streams and implement publish-subscribe patterns at scale...\n\n### Azure Cosmos DB\nBuilt-in bindings for document database operations with automatic change feed processing...\n\n### Azure Service Bus\nReliable message processing with enterprise messaging features...\n\n### Azure AI Services\nEasily integrate OpenAI, Cognitive Services, and AI Search for intelligent applications...\n\nThe document now includes this section after the Scalability section. Would you like me to add use cases or deployment best practices?"}
```

### Deterministic multi-agent orchestrations

Coordinate multiple specialized durable agents using imperative code where you define the control flow. This differs from agent-directed workflows, where the agent decides the next steps. Deterministic orchestrations provide predictable, repeatable execution patterns with automatic checkpointing and recovery.

Example scenario: an email processing system that uses a spam detection agent, then conditionally routes to different specialized agents based on the classification. The orchestration automatically recovers if any step fails, and completed agent calls are not re-executed.

```python
# Python
@app.orchestration_trigger(context_name="context")
def document_publishing_orchestration(context: DurableOrchestrationContext):
    """Deterministic orchestration coordinating multiple specialized agents."""
    doc_request = context.get_input()

    # Get specialized agents from the orchestration context
    research_agent = context.get_agent("ResearchAgent")
    writer_agent = context.get_agent("DocumentPublisherAgent")

    # Step 1: Research the topic using web search
    research_result = yield research_agent.run(
        messages=f"Research the following topic and gather key information: {doc_request.topic}",
        response_schema=ResearchResult
    )

    # Step 2: Generate outline based on research findings
    outline = yield context.call_activity("generate_outline", {
        "topic": doc_request.topic,
        "research_data": research_result.findings
    })

    # Step 3: Write the document with the research and outline
    document = yield writer_agent.run(
        messages=f"""Create a comprehensive document about {doc_request.topic}.
        Research findings: {research_result.findings}
        Outline: {outline}
        Write a well-structured, engaging document with proper formatting and citations.""",
        response_schema=DocumentResponse
    )

    # Step 4: Save and publish the generated document
    published = yield context.call_activity("publish_document", {
        "title": doc_request.topic,
        "content": document.text,
        "citations": document.citations
    })
    return published
```

### Human-in-the-loop

Orchestrations and agents can pause for human input, approval, or review without consuming compute resources. Durable execution enables orchestrations to wait for days or even weeks for human responses, even if the app crashes or restarts. When combined with serverless hosting, all compute resources are spun down during the wait period, eliminating compute costs until the human provides their input.

Example scenario: a content publishing agent that generates drafts, sends them to human reviewers, and waits days for approval without running (or paying for) compute resources during the review period. When the human response arrives, the orchestration automatically resumes with full conversation context and execution state intact.

```python
# Python
from datetime import timedelta

@app.orchestration_trigger(context_name="context")
def content_approval_workflow(context: DurableOrchestrationContext):
    """Human-in-the-loop workflow with zero-cost waiting."""
    topic = context.get_input()

    # Step 1: Generate content using an agent
    content_agent = context.get_agent("ContentGenerationAgent")
    draft_content = yield content_agent.run(f"Write an article about {topic}")

    # Step 2: Send for human review
    yield context.call_activity("notify_reviewer", draft_content)

    # Step 3: Wait for approval - no compute resources consumed while waiting
    approval_event = context.wait_for_external_event("ApprovalDecision")
    timeout_task = context.create_timer(context.current_utc_datetime + timedelta(hours=24))
    winner = yield context.task_any([approval_event, timeout_task])

    if winner == approval_event:
        timeout_task.cancel()
        approved = approval_event.result
        if approved:
            result = yield context.call_activity("publish_content", draft_content)
            return result
        else:
            return "Content rejected"
    else:
        # Timeout - escalate for review
        result = yield context.call_activity("escalate_for_review", draft_content)
        return result
```

### Built-in agent observability

Configure your function app with the Durable Task Scheduler as the durable backend (the component that persists agent and orchestration state). The Durable Task Scheduler is the recommended durable backend for your durable agents, offering the best throughput performance, fully managed infrastructure, and built-in observability through a UI dashboard.

The Durable Task Scheduler dashboard provides deep visibility into your agent operations:

- Conversation history: View complete conversation threads for each agent session, including all messages, tool calls, and conversation context at any point in time.
- Multi-agent visualization: See the execution flow when calling multiple specialized agents, with a visual representation of agent handoffs, parallel executions, and conditional branching.
- Performance metrics: Monitor agent response times, token usage, and orchestration duration.
- Execution history: Access detailed execution logs with full replay capability for debugging.
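To point a function app at the Durable Task Scheduler, you configure the `durableTask` storage provider in host.json. Below is a minimal sketch, assuming the `azureManaged` provider type and a connection string stored in an app setting; names and schema details may differ in your setup, so treat the Durable Task Scheduler documentation as authoritative:

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "storageProvider": {
        "type": "azureManaged",
        "connectionStringName": "DURABLE_TASK_SCHEDULER_CONNECTION_STRING"
      }
    }
  }
}
```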
### Demo video

*(A demo video is embedded in the original post.)*

### Language support

The durable task extension supports:

- C# (.NET 8.0+) with Azure Functions
- Python (3.10+) with Azure Functions

Support for additional compute options is coming soon.

## Get started today

Click here to create and run a durable agent.

## Learn more

- Overview documentation
- C# Samples
- Python Samples

# Host remote MCP servers on Azure Functions
Model Context Protocol (MCP) servers allow AI agents to access external tools, data, and systems, greatly extending the capability and power of agents. When you're ready to expose your MCP servers externally, within your organization or to the world, it's important that the servers run in a secure, scalable, and reliable environment. Azure Functions provides such a robust platform for hosting your remote MCP servers, offering high scalability with the Flex Consumption plan, built-in authentication for Microsoft Entra and OAuth, and a serverless billing model. The platform also offers two hosting options for added flexibility and convenience: hosting MCP servers built with the Azure Functions MCP extension, or servers built with the official MCP SDKs.

## Azure Functions MCP extension (GA)

The MCP extension allows you to build and host servers using the Azure Functions programming model, i.e., using triggers and bindings. The MCP tool trigger lets you focus on implementing the tools you want to expose, instead of worrying about protocol and server logistics. The MCP extension launched in public preview back in April and is now generally available, with support for .NET, Java, JavaScript, Python, and TypeScript.

### New features in the extension

#### Support for streamable-http transport

Support for the newer streamable-http transport has been added to the extension. Unless your client specifically requires the older Server-Sent Events (SSE) transport, you should use streamable-http. The two transports have different endpoints in the extension:

| Transport | Endpoint |
| --- | --- |
| Streamable HTTP | `/runtime/webhooks/mcp` |
| Server-Sent Events (SSE) | `/runtime/webhooks/mcp/sse` |

#### Defining server information

You can use the `extensions.mcp` section in host.json to define MCP server information:

```json
{
  "version": "2.0",
  "extensions": {
    "mcp": {
      "instructions": "Some test instructions on how to use the server",
      "serverName": "TestServer",
      "serverVersion": "2.0.0",
      "encryptClientState": true,
      "messageOptions": {
        "useAbsoluteUriForEndpoint": false
      },
      "system": {
        "webhookAuthorizationLevel": "System"
      }
    }
  }
}
```

#### Built-in server authentication and authorization

The built-in feature implements the requirements of the MCP authorization protocol, such as issuing a 401 challenge and hosting the Protected Resource Metadata document. You can configure it to use identity providers like Microsoft Entra for server authentication. In addition to authenticating callers to the server, you can also leverage this feature to implement on-behalf-of (OBO) auth flows, where the client invokes a tool that accesses downstream services on behalf of the user. Learn more about the built-in authentication and authorization feature.

#### Maven Build Plugin for Java

For Java applications, the Maven Build Plugin (version 1.40.0) parses and verifies MCP tool annotations at build time. This process automatically generates the correct MCP extension configuration, ensuring that the MCP tools defined by the user are properly set up. The build-time analysis is especially beneficial for Java apps, as it allows developers to use the MCP extension without concerns about increased cold start times. We'll continuously enhance the plugin's capabilities; upcoming improvements, such as property type inference, will reduce manual configuration and make it even easier to use the McpToolTrigger.
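As a quick smoke test of a deployed extension-based server, you can send an MCP `initialize` request to the streamable-http endpoint shown in the table above. The sketch below uses a hypothetical app name and assumes the default system-key authorization, with the key passed in the `x-functions-key` header; the JSON-RPC payload follows the MCP specification:

```bash
# Hypothetical function app; fetch the MCP system key from your app's keys first.
curl -i -X POST "https://my-func-app.azurewebsites.net/runtime/webhooks/mcp" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "x-functions-key: <mcp-system-key>" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
      "protocolVersion": "2025-03-26",
      "capabilities": {},
      "clientInfo": { "name": "curl-smoke-test", "version": "0.0.1" }
    }
  }'
```

A successful response confirms the server is reachable and speaking MCP before you wire it into an agent.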
### Get started

Check out the quickstarts to get an MCP extension server deployed in minutes:

- C# (.NET): remote-mcp-functions-dotnet
- Python: remote-mcp-functions-python
- TypeScript (Node.js): remote-mcp-functions-typescript
- Java: remote-mcp-functions-java

### References

Learn more about the MCP extension and tool trigger in the official documentation.

## Self-hosted MCP server (public preview)

In addition to the MCP extension, Azure Functions also supports hosting MCP servers implemented with the official SDKs. This is a suitable option for teams that have existing SDK-based servers or that favor the SDK experience over the Functions programming model. There is no need to modify your server code; you can lift and shift these MCP servers to Azure Functions, which is why they are termed self-hosted. The hosting capability supports the following features:

- Stateless servers that use the streamable-http transport. If you need your server to be stateful, consider using the Functions MCP extension for now.
- Servers implemented with the Python, TypeScript, C#, or Java MCP SDK.
- Built-in server authentication and authorization, like the MCP extension.

### Hosting requirement

Self-hosted MCP servers are deployed to the Azure Functions platform as custom handlers. You can think of custom handlers as lightweight web servers that receive events from the Functions host. The only requirement for hosting the MCP server is a file called host.json. Add this file to your project root to tell Functions how to run the server. An example host.json for a Python server looks like:

```json
{
  "version": "2.0",
  "configurationProfile": "mcp-custom-handler",
  "customHandler": {
    "description": {
      "defaultExecutablePath": "python",
      "arguments": ["path to main python script, e.g. hello.py"]
    },
    "port": "8000"
  }
}
```

### Get started

Check out the quickstarts to get your self-hosted MCP server deployed in minutes:

- C# (.NET): mcp-sdk-functions-hosting-dotnet
- Python: mcp-sdk-functions-hosting-python
- TypeScript (Node.js): mcp-sdk-functions-hosting-node
- Java: Coming soon!

### References

Read the official documentation for self-hosted MCP servers and learn about integrations with Azure services like Foundry and API Center. For .NET developers: check out the overview of self-hosted MCP servers from the recent .NET Conference!

We'd love to hear from you! Let us know your thoughts about hosting remote MCP servers on Azure Functions. Does either of the options meet your needs? What other MCP features are you looking for? Let us know what you'd like us to prioritize next!

# Agentic Applications on Azure Container Apps with Microsoft Foundry
Agents have exploded in popularity over the last year, reshaping not only the kinds of applications developers build but also the underlying architectures required to run them. As agentic applications grow more complex by invoking tools, collaborating with other services, and orchestrating multi-step workflows, architectures are naturally shifting toward microservice patterns. Azure Container Apps is purpose-built for this world: a fully managed, serverless platform designed to run independent, composable services with autoscaling, pay-per-second pricing, and seamless app-to-app communication.

By combining Azure Container Apps with the Microsoft Agent Framework (MAF) and Microsoft Foundry, developers can run containerized agents on ACA while using Foundry to visualize and monitor how those agents behave. Azure Container Apps handles scalable, high-performance execution of agent logic, and Microsoft Foundry lights up rich observability for reasoning, planning, tool calls, and errors through its integrated monitoring experience. Together, they form a powerful foundation for building and operating modern, production-grade agentic applications.

In this blog, we'll walk through how to build an agent running on Azure Container Apps using Microsoft Agent Framework and OpenTelemetry, and then connect its telemetry to Microsoft Foundry so you can see your ACA-hosted agent directly in the Foundry monitoring experience.

## Prerequisites

- An Azure account with an active subscription. If you don't have one, you can create one for free.
- A Microsoft Foundry project. If you don't already have one, you can create a project from the Azure AI Foundry portal.
- Azure Developer CLI (azd) installed
- Git installed

## The sample agent

The complete sample code is available in this repo and can be deployed end-to-end with a single command. It's a basic currency agent. This sample deploys:

- An Azure Container Apps environment
- An agent built with Microsoft Agent Framework (MAF)
- OpenTelemetry instrumentation using the Azure Monitor exporter
- Application Insights to collect agent telemetry
- A Microsoft Foundry resource
- Environment wiring to integrate the agent with Microsoft Foundry

## Deployment steps

Clone the repository:

```bash
git clone https://github.com/cachai2/foundry-3p-agents-samples.git
cd foundry-3p-agents-samples/azure
```

Authenticate with Azure:

```bash
azd auth login
```

Set the following azd environment variable:

```bash
azd env set AZURE_AI_MODEL_DEPLOYMENT_NAME gpt-4.1-mini
```

Deploy to Azure:

```bash
azd up
```

This provisions your Azure Container App, Application Insights, logs pipeline, and required environment variables. While deployment runs, let's break down how the code becomes compatible with Microsoft Foundry.

## 1. Setting up the agent in Azure Container Apps

To integrate with Microsoft Foundry, the agent needs two essential capabilities:

- Microsoft Agent Framework (MAF): Handles agent logic, tools, and schema-driven execution, and emits standardized gen_ai.* spans.
- OpenTelemetry: Sends the required agent/model/tool spans to Application Insights, which Microsoft Foundry consumes for visualization and monitoring.

Although this sample uses MAF, the same pattern works with any agent framework. MAF and LangChain currently provide the richest telemetry support out of the box.
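Before looking at the telemetry wiring, it helps to picture the shape of such an agent. The following is a minimal sketch, not the repo's exact code: it mirrors the `create_agent` pattern shown elsewhere in these posts, and the tool body, names, and response handling are illustrative assumptions:

```python
import os

from agent_framework.azure import AzureOpenAIChatClient
from azure.identity import AzureCliCredential
from fastapi import FastAPI
from pydantic import BaseModel

def get_exchange_rate(currency_from: str, currency_to: str) -> str:
    """Return an exchange rate between two currencies (stubbed for illustration)."""
    return f"1 {currency_from} = 0.92 {currency_to}"  # placeholder value

# Create the agent with the currency tool, following the MAF pattern
agent = AzureOpenAIChatClient(
    endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    deployment_name=os.getenv("AZURE_AI_MODEL_DEPLOYMENT_NAME", "gpt-4.1-mini"),
    credential=AzureCliCredential(),
).create_agent(
    instructions="You are a currency exchange assistant.",
    name="CurrencyAgent",
    tools=[get_exchange_rate],
)

app = FastAPI()

class InvokeRequest(BaseModel):
    prompt: str

@app.post("/invoke")
async def invoke(request: InvokeRequest):
    # Run the agent once per request and return its text response
    result = await agent.run(request.prompt)
    return {"response": result.text}
```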
### 1.1 Configure Microsoft Agent Framework (MAF)

The agent includes:

- A tool (get_exchange_rate)
- An agent created by ChatAgent
- A runtime manager (AgentRuntime)
- A FastAPI app exposing /invoke

Telemetry is enabled using two components already present in the repo:

- configure_azure_monitor: Configures OpenTelemetry, the Azure Monitor exporter, and auto-instrumentation.
- setup_observability(): Enables MAF's additional spans (gen_ai.*, tool spans, agent lifecycle spans).

From the repo (_configure_observability()):

```python
from azure.monitor.opentelemetry import configure_azure_monitor
from agent_framework.observability import setup_observability
from opentelemetry.sdk.resources import Resource

def _configure_observability() -> None:
    configure_azure_monitor(
        resource=Resource.create({"service.name": SERVICE_NAME}),
        connection_string=APPLICATION_INSIGHTS_CONNECTION_STRING,
    )
    setup_observability(enable_sensitive_data=False)
```

This gives you:

- gen_ai.model.* spans (model usage + token counts)
- Tool call spans
- Agent lifecycle and execution spans
- HTTP + FastAPI instrumentation
- Standardized telemetry required by Microsoft Foundry

No manual TracerProvider wiring or OTLP exporter setup is needed.

### 1.2 OpenTelemetry setup (Azure Monitor exporter)

In this sample, OpenTelemetry is fully configured by Azure Monitor's helper:

```python
import os

from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry.sdk.resources import Resource
from agent_framework.observability import setup_observability

SERVICE_NAME = os.getenv("ACA_SERVICE_NAME", "aca-currency-exchange-agent")

configure_azure_monitor(
    resource=Resource.create({"service.name": SERVICE_NAME}),
    connection_string=os.getenv("APPLICATION_INSIGHTS_CONNECTION_STRING"),
)

# Enable Microsoft Agent Framework gen_ai/tool spans on top of OTEL
setup_observability(enable_sensitive_data=False)
```

This automatically:

- Installs and configures the OTEL tracer provider
- Enables batching and exporting of spans
- Adds HTTP/FastAPI/Requests auto-instrumentation
- Sends telemetry to Application Insights
- Adds MAF's agent and tool spans

All required environment variables (such as APPLICATION_INSIGHTS_CONNECTION_STRING) are injected automatically by azd up.

## 2. Deploy the model and test your agent

Once azd up completes, you're ready to deploy a model to the Microsoft Foundry instance and test it.

1. Find the resource name of your deployed Azure AI Services resource from azd up and navigate to it. From there, open it in Microsoft Foundry, navigate to the Model Catalog, and add the gpt-4.1-mini model.
2. Find the resource name of your deployed Azure Container App, navigate to it, and copy the application URL.
3. Set your container app URL as an environment variable in your terminal (the commands below are for WSL/bash):

```bash
export APP_URL="<your container app URL>"
```

4. Run the following curl command to invoke the agent:

```bash
curl -X POST "$APP_URL/invoke" \
  -H "Content-Type: application/json" \
  -d '{ "prompt": "How do I convert 100 USD to EUR?" }'
```
## 3. Verifying telemetry in Application Insights

Once your container app starts, you can validate telemetry:

1. Open the Application Insights resource created by azd up.
2. Go to Logs.
3. Run these queries (make sure you're in KQL mode, not Simple mode).

Check MAF gen_ai spans:

```kusto
dependencies
| where timestamp > ago(3h)
| extend genOp = tostring(customDimensions["gen_ai.operation.name"]),
         genSys = tostring(customDimensions["gen_ai.system"]),
         reqModel = tostring(customDimensions["gen_ai.request.model"]),
         resModel = tostring(customDimensions["gen_ai.response.model"])
| summarize count() by genOp, genSys, reqModel, resModel
| order by count_ desc
```

Check agents + tools:

```kusto
dependencies
| where timestamp > ago(1h)
| extend genOp = tostring(customDimensions["gen_ai.operation.name"]),
         agent = tostring(customDimensions["gen_ai.agent.name"]),
         tool = tostring(customDimensions["gen_ai.tool.name"])
| where genOp in ("agent.run", "invoke_agent", "execute_tool")
| project timestamp, genOp, agent, tool, name, target, customDimensions
| order by timestamp desc
```

If telemetry is flowing, you're ready to plug your agent into Microsoft Foundry.

## 4. Connect Application Insights to Microsoft Foundry

Microsoft Foundry uses your Application Insights resource to power:

- Agent monitoring
- Tool call traces
- Reasoning graphs
- Multi-agent orchestration views
- Error analysis

To connect:

1. Navigate to Monitoring in the left navigation pane of the Microsoft Foundry portal.
2. Select the Application analytics tab.
3. Select the Application Insights resource created by azd up.
4. Connect the resource to your AI Foundry project.

Note: If you are unable to add your Application Insights connection this way, navigate to the Overview of your Foundry project -> Open in management center -> Connected resources -> New connection -> Application Insights.

Foundry will automatically start ingesting:

- gen_ai.* spans
- Tool spans
- Agent lifecycle spans
- Workflow traces

No additional configuration is required.

## 5. Viewing dashboards and traces in Microsoft Foundry

Once your Application Insights connection is added, you can view your agent's telemetry directly in Microsoft Foundry's Monitoring experience.

### 5.1 Monitoring

The Monitoring tab shows high-level operational metrics for your application, including:

- Total inference calls
- Average call duration
- Overall success/error rate
- Token usage (when available)
- Traffic trends over time

This view is useful for spotting latency spikes, increased load, or changes in usage patterns. These visualizations are powered by the telemetry emitted from your agents in Azure Container Apps.

### 5.2 Traces timeline

The Tracing tab shows the full distributed trace of each agent request, including all spans emitted by Microsoft Foundry and your Azure Container App with Microsoft Agent Framework. You can see:

- Top-level operations such as invoke_agent, chat, and process_thread_run
- Tool calls like execute_tool_get_exchange_rate
- Internal MAF steps (create_thread, create_message, run tool)
- Azure credential calls (GET /msi/token)
- Input/output tokens and duration for each span

This view gives you an end-to-end breakdown of how your agent executed, which tools it invoked, and how long each step took: essential information for debugging and performance tuning.

## Conclusion

By combining Azure Container Apps, the Microsoft Agent Framework, and OpenTelemetry, you can build agents that are not only scalable and production-ready, but also fully observable and orchestratable inside Microsoft Foundry.
Container Apps provides the execution engine and autoscaling foundation, MAF supplies structured agent logic and telemetry, and Microsoft Foundry ties everything together with powerful planning, monitoring, and workflow visualization. This architecture gives you the best of both worlds: the flexibility of running your own containerized agents with the dependencies you choose, and the intelligence of Microsoft Foundry to coordinate multi-step reasoning, tool calls, and cross-agent workflows. As the agent ecosystem continues to evolve, Azure Container Apps and Microsoft Foundry provide a strong, extensible foundation for building the next generation of intelligent, microservice-driven applications.

# Azure Functions Ignite 2025 Update
Azure Functions is redefining event-driven applications and high-scale APIs in 2025, accelerating innovation for developers building the next generation of intelligent, resilient, and scalable workloads. This year, our focus has been on empowering AI and agentic scenarios: remote MCP server hosting, bulletproofing agents with Durable Functions, and first-class support for critical technologies like OpenTelemetry, .NET 10, and Aspire. With major advances in serverless Flex Consumption, plus enhanced performance, security, and deployment fundamentals across Elastic Premium and Flex, Azure Functions is the platform of choice for building modern, enterprise-grade solutions.

## Remote MCP

Model Context Protocol (MCP) has taken the world by storm, offering agents a mechanism to discover and work deeply with the capabilities and context of tools. When you want to expose MCP tools to your enterprise or the world securely, we recommend thinking deeply about building remote MCP servers that are designed to run securely at scale. Azure Functions is uniquely optimized to run your MCP servers at scale, offering the serverless and highly scalable features of the Flex Consumption plan, plus two flexible programming model options discussed below. All come together on the hardened Functions service, plus new authentication modes for Entra and OAuth using built-in authentication.

### Remote MCP triggers and bindings extension GA

Back in April, we shared a new extension that allows you to author MCP servers using functions with the MCP tool trigger. That MCP extension is now generally available, with support for C# (.NET), Java, JavaScript (Node.js), Python, and TypeScript (Node.js). The MCP tool trigger allows you to focus on what matters most: the logic of the tool you want to expose to agents. Functions will take care of all the protocol and server logistics, with the ability to scale out to support as many sessions as you want to throw at it.

```csharp
[Function(nameof(GetSnippet))]
public object GetSnippet(
    [McpToolTrigger(GetSnippetToolName, GetSnippetToolDescription)] ToolInvocationContext context,
    [BlobInput(BlobPath)] string snippetContent
)
{
    return snippetContent;
}
```

### New: Self-hosted MCP server (preview)

If you've built servers with official MCP SDKs and want to run them as remote cloud-scale servers without rewriting any code, this public preview is for you. You can now self-host your MCP server on Azure Functions: keep your existing Python, TypeScript, .NET, or Java code and get rapid 0-to-N scaling, built-in server authentication and authorization, consumption-based billing, and more from the underlying Azure Functions service.

This feature complements the Azure Functions MCP extension for building MCP servers using the Functions programming model (triggers and bindings). Pick the path that fits your scenario, building with the extension or with the standard MCP SDKs. Either way, you benefit from the same scalable, secure, and serverless platform.

Use the official MCP SDKs:

```python
@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    url = f"{NWS_API_BASE}/alerts/active/area/{state}"
    data = await make_nws_request(url)

    if not data or "features" not in data:
        return "Unable to fetch alerts or no alerts found."

    if not data["features"]:
        return "No active alerts for this state."
    alerts = [format_alert(feature) for feature in data["features"]]
    return "\n---\n".join(alerts)
```

Use the Azure Functions Flex Consumption plan's serverless compute via custom handlers in host.json:

```json
{
  "version": "2.0",
  "configurationProfile": "mcp-custom-handler",
  "customHandler": {
    "description": {
      "defaultExecutablePath": "python",
      "arguments": ["weather.py"]
    },
    "http": {
      "DefaultAuthorizationLevel": "anonymous"
    },
    "port": "8000"
  }
}
```

Learn more about the MCP tool trigger and self-hosted MCP servers at https://aka.ms/remote-mcp.

### Built-in MCP server authorization (preview)

The built-in authentication and authorization feature can now be used for MCP server authorization, using a new preview option. You can quickly define identity-based access control for your MCP servers with Microsoft Entra ID or other OpenID Connect providers. Learn more at https://aka.ms/functions-mcp-server-authorization.

### Better together with Foundry agents

Microsoft Foundry is the starting point for building intelligent agents, and Azure Functions is the natural next step for extending those agents with remote MCP tools. Running your tools on Functions gives you clean separation of concerns, reuse across multiple agents, and strong security isolation. And with built-in authorization, Functions enables enterprise-ready authentication patterns, from calling downstream services with the agent's identity to operating on behalf of end users with their delegated permissions. Build your first remote MCP server and connect it to your Foundry agent at https://aka.ms/foundry-functions-mcp-tutorial.

## Agents

### Microsoft Agent Framework 2.0 (public preview refresh)

We're excited about the 2.0 preview refresh of Microsoft Agent Framework, which builds on battle-hardened work from Semantic Kernel and AutoGen. Agent Framework is an outstanding solution for building multi-agent orchestrations that are both simple and powerful. Azure Functions is a strong fit for hosting Agent Framework, with the service's extreme scale, serverless billing, and enterprise-grade features like VNet networking and built-in auth.

### Durable task extension for Microsoft Agent Framework (preview)

The durable task extension for Microsoft Agent Framework transforms how you build production-ready, resilient, and scalable AI agents by bringing the proven durable execution (survives crashes and restarts) and distributed execution (runs across multiple instances) capabilities of Azure Durable Functions directly into the Microsoft Agent Framework. Combined with Azure Functions for hosting and event-driven execution, you can now deploy stateful, resilient AI agents that automatically handle session management, failure recovery, and scaling, freeing you to focus entirely on your agent logic.

Key features of the durable task extension include:

- **Serverless Hosting**: Deploy agents on Azure Functions with auto-scaling from thousands of instances down to zero, while retaining full control in a serverless architecture.
- **Automatic Session Management**: Agents maintain persistent sessions with full conversation context that survive process crashes, restarts, and distributed execution across instances.
- **Deterministic Multi-Agent Orchestrations**: Coordinate specialized durable agents with predictable, repeatable, code-driven execution patterns.
- **Human-in-the-Loop with Serverless Cost Savings**: Pause for human input without consuming compute resources or incurring costs.
- **Built-in Observability with Durable Task Scheduler**: Get deep visibility into agent operations and orchestrations through the Durable Task Scheduler UI dashboard.

Create a durable agent:

```python
endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini")

# Create an AI agent following the standard Microsoft Agent Framework pattern
agent = AzureOpenAIChatClient(
    endpoint=endpoint,
    deployment_name=deployment_name,
    credential=AzureCliCredential()
).create_agent(
    instructions="""You are a professional content writer who creates engaging,
    well-structured documents for any given topic. When given a topic, you will:
    1. Research the topic using the web search tool
    2. Generate an outline for the document
    3. Write a compelling document with proper formatting
    4. Include relevant examples and citations""",
    name="DocumentPublisher",
    tools=[
        AIFunctionFactory.Create(search_web),
        AIFunctionFactory.Create(generate_outline)
    ]
)

# Configure the function app to host the agent with durable session management
app = AgentFunctionApp(agents=[agent])
app.run()
```

The Durable Task Scheduler dashboard provides observability and debugging for agents and agent workflows. For more information on the durable task extension for Agent Framework, see the announcement: https://aka.ms/durable-extension-for-af-blog.

## Flex Consumption updates

As you know, Flex Consumption means serverless without compromise. It combines elastic scale and pay-for-what-you-use pricing with the controls you expect: per-instance concurrency, longer executions, VNet/private networking, and Always Ready instances to minimize cold starts. Since reaching GA at Ignite 2024, Flex Consumption has seen tremendous growth, with over 1.5 billion function executions per day and nearly 40 thousand apps.

Here's what's new for Ignite 2025:

- 512 MB instance size (GA). Right-size lighter workloads and scale farther within the default quota.
- Availability Zones (GA). Distribute instances across zones.
- Rolling updates (public preview). Unlock zero-downtime deployments of code or config by setting a single configuration. See below for more information.

Even more improvements are included: new diagnostic settings to route logs/metrics, Key Vault and App Configuration references, new regions, and custom handler support.

To get started, review the Flex Consumption samples, or dive into the documentation to see how Flex can support your workloads.
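As a concrete starting point, a Flex Consumption app using the new 512 MB instance size can be created from the CLI. This is a sketch with hypothetical resource names; it assumes the resource group and storage account already exist, and flag names follow the current `az functionapp` documentation:

```bash
az functionapp create \
  --resource-group my-rg \
  --name my-flex-app \
  --storage-account mystorageacct \
  --flexconsumption-location eastus \
  --runtime python \
  --runtime-version 3.11 \
  --instance-memory 512
```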
## Migrating to Azure Functions Flex Consumption

Migrating to Flex Consumption is simple with our step-by-step guides and agentic tools. Move your Azure Functions apps or AWS Lambda workloads, update your code and configuration, and take advantage of new automation tools. With Linux Consumption retiring, now is the time to switch. For more information, see:

- Migrate Consumption plan apps to the Flex Consumption plan
- Migrate AWS Lambda workloads to Azure Functions

## Durable Functions

Durable Functions introduces powerful new features to help you build resilient, production-ready workflows:

- Distributed tracing: Track requests across components and systems, giving you deep visibility into orchestrations and activities, with support for App Insights and OpenTelemetry.
- Extended sessions support in .NET isolated: Improves performance by caching orchestrations in memory; ideal for fast sequential activities and large fan-out/fan-in patterns.
- Orchestration versioning (public preview): Enables zero-downtime deployments and backward compatibility, so you can safely roll out changes without disrupting in-flight workflows.

## Durable Task Scheduler updates

- Durable Task Scheduler Dedicated SKU (GA): Now generally available, the Dedicated SKU offers advanced orchestration for complex workflows and intelligent apps. It provides predictable pricing for steady workloads, automatic checkpointing, state protection, and advanced monitoring for resilient, reliable execution.
- Durable Task Scheduler Consumption SKU (public preview): The new Consumption SKU brings serverless, pay-as-you-go orchestration to dynamic and variable workloads. It delivers the same orchestration capabilities with flexible billing, making it easy to scale intelligent applications as needed.

For more information, see https://aka.ms/dts-ga-blog.

## OpenTelemetry support now GA

OpenTelemetry support in Azure Functions is now generally available, bringing unified, production-ready observability to serverless applications. Developers can now export logs, traces, and metrics using open standards, enabling consistent monitoring and troubleshooting across every workload. Key capabilities include:

- Unified observability: Standardize logs, traces, and metrics across all your serverless workloads for consistent monitoring and troubleshooting.
- Vendor-neutral telemetry: Integrate seamlessly with Azure Monitor or any OpenTelemetry-compliant backend, ensuring flexibility and choice.
- Broad language support: Works with .NET (isolated), Java, JavaScript, Python, PowerShell, and TypeScript.

Start using OpenTelemetry in Azure Functions today to unlock standards-based observability for your apps. For step-by-step guidance on enabling OpenTelemetry and configuring exporters for your preferred backend, see the documentation.
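Enabling OpenTelemetry output is a host-level switch. Below is a minimal host.json sketch based on the Functions OpenTelemetry docs (the exact property casing is worth confirming there); exporter configuration is then done through standard environment variables such as OTEL_EXPORTER_OTLP_ENDPOINT:

```json
{
  "version": "2.0",
  "telemetryMode": "OpenTelemetry"
}
```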
## Deployment with rolling updates (preview)

Achieving zero-downtime deployments has never been easier. The Flex Consumption plan now offers rolling updates as a site update strategy. Set a single property, and all future code deployments and configuration changes will be released with zero downtime. Instead of restarting all instances at once, the platform drains existing instances in batches while scaling out the latest version to match real-time demand. This ensures uninterrupted in-flight executions and resilient throughput across your HTTP, non-HTTP, and Durable workloads, even during intensive scale-out scenarios. Rolling updates are now in public preview. Learn more at https://aka.ms/functions/rolling-updates.

## Secure identity and networking everywhere by design

Security and trust are paramount. Azure Functions incorporates proven best practices by design, with full support for managed identity, eliminating secrets and simplifying secure authentication and authorization. Flex Consumption and other plans offer enterprise-grade networking features like VNets, private endpoints, and NAT gateways for deep protection. The Azure portal streamlines secure function creation, and updated scenarios and samples showcase these identity and networking capabilities in action. Built-in authentication (discussed above) enables inbound client traffic to use identity as well.

Check out our updated Functions scenarios page with quickstarts, or our secure samples gallery, to see these identity and networking best practices in action.

## .NET 10

Azure Functions now supports .NET 10, bringing a great suite of new features and performance benefits to your code. .NET 10 is supported on the isolated worker model, and it's available for all plan types except Linux Consumption. As a reminder, support for the legacy in-process model ends on November 10, 2026, and the in-process model is not being updated for .NET 10. To stay supported and take advantage of the latest features, migrate to the isolated worker model.

## Aspire

Aspire is an opinionated stack that simplifies development of distributed applications in the cloud. The Azure Functions integration for Aspire enables you to develop, debug, and orchestrate an Azure Functions .NET project as part of an Aspire solution. Aspire publish deploys your functions directly to Azure Functions on Azure Container Apps. Aspire 13 includes an updated preview version of the Functions integration that acts as a release candidate with go-live support. The package will move to GA quality with Aspire 13.1.

## Java 25 and Node.js 24

Azure Functions now supports Java 25 and Node.js 24 in preview. You can develop functions using these versions locally and deploy them to Azure Functions plans. Learn how to upgrade your apps to these versions here.

## In summary

Ready to build what's next? Update your Azure Functions Core Tools today and explore the latest samples and quickstarts to unlock new capabilities for your scenarios. The guided quickstarts run and deploy in under 5 minutes and incorporate best practices, from architecture to security to deployment. We've made it easier than ever to scaffold, deploy, and scale real-world solutions with confidence. The future of intelligent, scalable, and secure applications starts now: jump in and see what you can create!

# What's new in Azure Container Apps at Ignite'25
Azure Container Apps (ACA) is a fully managed serverless container platform that enables developers to design and deploy microservices and modern apps without requiring container expertise or infrastructure management. ACA is rapidly emerging as the preferred platform for hosting AI workloads and intelligent agents in the cloud. With features like the code interpreter, serverless GPUs, simplified deployments, and per-second billing, ACA empowers developers to build, deploy, and scale AI-driven applications with exceptional agility. ACA makes it easy to integrate agent frameworks, leverage GPU acceleration, and manage complex, multi-container AI environments, all while benefiting from a serverless, fully managed infrastructure. External customers like Replit, NFL Combine, Coca-Cola, and the European Space Agency, as well as internal teams like Microsoft Copilot and many others, have bet on ACA as their compute platform for AI workloads.

ACA is also quickly becoming the leading platform for updating existing applications and moving them to a cloud-native setup. It allows organizations to seamlessly migrate legacy workloads, such as Java and .NET apps, by using AI-powered tools like GitHub Copilot to automate code upgrades, analyze dependencies, and handle cloud transformations. ACA's fully managed, serverless environment removes the complexity of container orchestration. This helps teams break down monolithic or on-premises applications into robust microservices, making use of features like version control, traffic management, and advanced networking for fast iteration and deployment. By following proven modernization strategies while ensuring strong security, scalability, and developer efficiency, ACA helps organizations continuously innovate and future-proof their applications in the cloud. Customers like EY, London Stock Exchange, Chevron, and Paychex have unlocked significant business value by modernizing their workloads onto ACA.

This blog presents the latest features and capabilities of ACA, which enhance its value for customers by enabling the rapid migration of existing workloads and the development of new cloud applications, all while following cloud-native best practices.

## Secure sandboxes for AI compute

ACA now supports dynamic shell sessions, currently available in public preview. These shell sessions are platform-managed, built-in containers designed to execute common shell commands within an isolated, sandboxed environment. With the addition of empty shell sessions and an integrated MCP server, ACA enables customers to provision secure, isolated sandboxes instantly, ideal for use cases such as code execution, tool testing, and workflow automation. This functionality facilitates seamless integration with agent frameworks, empowering agents to access disposable compute environments as needed. Customers benefit from rapid provisioning, improved security, and decreased operational overhead when managing agentic workloads. To learn more about how to add secure sandbox shell sessions to Microsoft Foundry agents as a tool, visit the walkthrough at https://aka.ms/aca/dynamic-sessions-mcp-tutorial.
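To give a feel for the session-pool API, here is a sketch of executing code in a dynamic session via REST. The pool management endpoint, api-version, and payload shape are assumptions based on the dynamic sessions documentation and may differ for the new shell sessions; treat the walkthrough linked above as authoritative:

```bash
# Acquire a token for the dynamic sessions audience (assumes Azure CLI login
# and a role assignment on the session pool).
TOKEN=$(az account get-access-token --resource https://dynamicsessions.io --query accessToken -o tsv)

# POOL_ENDPOINT is the session pool's management endpoint from the Azure portal.
# The identifier query parameter names (and, if needed, creates) an isolated session.
curl -X POST "$POOL_ENDPOINT/code/execute?api-version=2024-02-02-preview&identifier=my-session-1" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "properties": {
      "codeInputType": "inline",
      "executionType": "synchronous",
      "code": "print(1 + 1)"
    }
  }'
```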
## Docker Compose for Agents support

ACA has added Docker Compose for Agents support in public preview, making it easy for developers to define agentic applications in a stack-agnostic way, with MCP and custom model support. Combined with native serverless GPU support, Docker Compose for Agents allows fast iteration and scaling for AI-driven agents and applications using LangGraph, LangChain, CrewAI, Spring AI, Vercel AI SDK, and Agno. These enhancements provide a developer-focused platform that streamlines the process for modern AI workloads, bringing both development and production cycles into one unified environment.

## Additional regional availability for serverless GPUs

Serverless GPU solutions offer capabilities such as automatic scaling with NVIDIA A100 or T4 GPUs, per-second billing, and strict data isolation within container boundaries. ACA serverless GPUs are now generally available in 11 additional regions, further facilitating developers' ability to deploy AI inference, model training, and GPU-accelerated workloads efficiently. For details on supported regions, visit https://aka.ms/aca/serverless-gpu-regions.

## New Flexible workload profile

The Flexible workload profile is a new option that combines the simplicity of serverless Consumption with the performance and control of Dedicated profiles. It offers a familiar pay-per-use model along with enhanced features like scheduled maintenance, dedicated networking, and support for larger replicas to meet demanding application needs. Customers enjoy the advantages of dedicated resources together with the effortless infrastructure management and billing of the Consumption model. Operating on a dedicated compute pool, this profile ensures better predictability and isolation without introducing extra operational complexity. It is designed for users who want the ease of serverless scaling but also need more control over performance and environmental stability.

## Confidential computing

Confidential computing support is now available in public preview for ACA, offering hardware-based Trusted Execution Environments (TEEs) to secure data in use. This adds to the existing encryption of data at rest and in transit by encrypting memory and verifying the cloud environment before processing. It helps protect sensitive data from unauthorized access, including by cloud operators, and is useful for organizations with high security requirements. Confidential computing can be enabled via workload profiles, with the preview limited to certain regions.

## Extending network capabilities

### General availability of rule-based routing

Rule-based routing for ACA is now generally available, offering improved flexibility and easier composition when designing microservice architectures, conducting A/B testing, or implementing blue-green deployments. With this feature, you can route incoming HTTP traffic to specific apps within your environment by specifying host names or paths, including support for custom domains. You no longer need to set up an extra reverse proxy (like NGINX); simply define routing rules for your environment, and traffic will be automatically directed to the appropriate target apps.

### General availability of Premium Ingress

ACA support for Premium Ingress is now generally available. This feature introduces environment-level ingress configuration options, the primary highlight being customizable ingress scaling. This capability supports scaling of the ingress proxy, enabling customers to better handle higher-demand workloads, such as large performance tests. By configuring your ingress proxy to run on workload profiles, you can scale out more ingress instances to handle more load.
Note that running the ingress proxy on a workload profile will incur associated costs. To further enhance the flexibility of your application, this release includes other ingress-related settings, such as termination grace period, idle request timeout, and header count.

## Additional management capabilities

### Public preview of deployment labels

ACA now offers deployment labels in public preview, letting you assign names like dev, staging, or prod to container revisions; labels can also be assigned automatically. This makes environment management easier and supports advanced strategies such as A/B testing and blue-green deployments. Labels help route traffic, control revisions, and streamline rollouts or rollbacks with minimal hassle. With deployment labels, you can manage app lifecycles more efficiently and reduce complexity across environments.

### General availability of Durable Task Scheduler support

Durable Task Scheduler (DTS) support is now generally available on ACA, empowering users with a robust pro-code workflow solution. With DTS, you can define reliable, containerized workflows as code, benefiting from built-in state persistence and fault-tolerant execution. This enhancement streamlines the creation and administration of complex workflows by boosting scalability and reliability and by enabling efficient monitoring.

## What's next

ACA is redefining how developers build and deploy intelligent agents. Agents deployed to Azure Container Apps with Microsoft Agent Framework and OpenTelemetry can also be plugged directly into Microsoft Foundry, giving teams a single pane of glass for their agents in Azure. With serverless scale, GPU on demand, and enterprise-grade isolation, ACA provides an ideal foundation for hosting AI agents securely and cost-effectively.

Utilizing open-source frameworks such as n8n on ACA enables the deployment of no-code automation agents that integrate seamlessly with Azure OpenAI models, supporting intelligent routing, summarization, and adaptive decision-making. Similarly, running other agent frameworks like the Goose AI agent on ACA enables them to operate alongside model inference workloads (including Ollama and GPT-OSS) within a unified, secure environment. Serverless GPU support allows efficient hosting of large language models such as GPT-OSS, optimizing both cost and scalability for inference tasks. Furthermore, ACA facilitates remote hosting of Model Context Protocol (MCP) servers, granting agents secure access to external tools and APIs via streamable HTTP transport. Collectively, these features enable organizations to develop, scale, and manage complex agentic workloads, from workflow automation to AI-driven assistants, while leveraging ACA's enterprise-grade security, autoscaling capabilities, and developer-centric user experience.

Beyond these, ACA also offers wide cross-compatibility with frameworks and services, making it an ideal platform for running Azure Functions on ACA, Distributed Application Runtime (Dapr) microservices, and polyglot apps across .NET, Java, and JavaScript.

As always, we invite you to visit our GitHub page for feedback, feature requests, or questions about Azure Container Apps, where you can open a new issue or up-vote existing ones. If you're curious about what we're working on next, check out our roadmap. We look forward to hearing from you!

# Running Self-hosted APIM Gateways in Azure Container Apps with VNet Integration
Running Self-hosted APIM Gateways in Azure Container Apps with VNet Integration

With Azure Container Apps we can run containerized applications completely serverless. The platform handles all the orchestration needed to dynamically scale based on the triggers you configure (via KEDA) and can even scale to zero!

I have been working with customers recently on Azure API Management (APIM), and a common question is how to use APIM to manage internal APIs without exposing a public IP while staying compliant from a security standpoint. That leads to the self-hosted gateway: a managed gateway deployed within your network, allowing a unified approach to managing your APIs while keeping all API communication in-network. The self-hosted gateway is deployed as a container, and in this article we will walk through how to provision one on Azure Container Apps specifically. I assume an Azure APIM instance is already provisioned and will dive straight into creating and configuring the self-hosted gateway on ACA.

Prerequisites

As mentioned, ensure you have an existing Azure API Management instance. We will use the Azure CLI to configure the container apps in this walkthrough. To run the commands, you need the Azure CLI installed on your local machine and the necessary permissions in your Azure subscription.

Retrieve Gateway Deployment Settings from APIM

First, we need to get the details for our gateway from APIM. Head over to the Azure portal and navigate to your API Management instance.

- In the left menu, under Deployment and infrastructure, select Gateways.
- Here, you'll find the gateway resource you provisioned. Click on it and go to Deployment.
- Copy the Gateway Token and Configuration endpoint values. (These tell the self-hosted gateway which APIM instance and gateway to register under.)

Create a Container Apps Environment

Next, we need to create a Container Apps environment. This is where we will create the container app that hosts our self-hosted gateway.

Create the VNet and Subnet for our ACA Environment

Since we want access to our internal APIs, the VNet and a subnet need to exist before we create the container apps environment. Note: if we're using workload profiles (we are in this walkthrough), the subnet must be delegated to Microsoft.App/environments.

```bash
# Create the vnet
az network vnet create --resource-group rgContosoDemo \
  --name vnet-contoso-demo \
  --location centralus \
  --address-prefixes 10.0.0.0/16

# Create the subnet
az network vnet subnet create --resource-group rgContosoDemo \
  --vnet-name vnet-contoso-demo \
  --name infrastructure-subnet \
  --address-prefixes 10.0.0.0/23

# If you are using a workload profile (we are for this walkthrough), delegate the subnet
az network vnet subnet update --resource-group rgContosoDemo \
  --vnet-name vnet-contoso-demo \
  --name infrastructure-subnet \
  --delegations Microsoft.App/environments
```

Create the Container App Environment in our VNet

Pass the subnet's resource ID so the environment is actually placed in the VNet:

```bash
az containerapp env create --name aca-contoso-env \
  --resource-group rgContosoDemo \
  --location centralus \
  --enable-workload-profiles \
  --infrastructure-subnet-resource-id $(az network vnet subnet show \
      --resource-group rgContosoDemo \
      --vnet-name vnet-contoso-demo \
      --name infrastructure-subnet \
      --query id -o tsv)
```

Deploy the Self-Hosted Gateway to a Container App

Creating the environment takes about 10 minutes, and once it completes comes the fun part: deploying the self-hosted gateway container image to a container app.
Using Azure CLI:

Create the Container App:

```bash
az containerapp create --name aca-apim-demo-gateway \
  --resource-group rgContosoDemo \
  --environment aca-contoso-env \
  --workload-profile-name "Consumption" \
  --image "mcr.microsoft.com/azure-api-management/gateway:2.5.0" \
  --target-port 8080 \
  --ingress 'external' \
  --env-vars "config.service.endpoint"="<YOUR_ENDPOINT>" "config.service.auth"="<YOUR_TOKEN>" "net.server.http.forwarded.proto.enabled"="true"
```

Here, replace <YOUR_ENDPOINT> and <YOUR_TOKEN> with the values you copied earlier.

Configure Ingress for the Container App:

```bash
az containerapp ingress enable --name aca-apim-demo-gateway \
  --resource-group rgContosoDemo \
  --type external \
  --target-port 8080
```

This command ensures that your container app is accessible externally.

Verify the Deployment

Finally, let's make sure everything is running smoothly. Navigate to the Azure portal and go to your Container Apps environment. Select the container app you created (aca-apim-demo-gateway) and navigate to Replicas to verify that it's running. You can also use the self-hosted gateway's status endpoint to check that the gateway is up:

```bash
curl -i https://aca-apim-demo-gateway.sillytreats-abcd1234.centralus.azurecontainerapps.io/status-012345678990abcdef
```

Verify Gateway Health in APIM

You can also confirm in the Azure portal that APIM sees the gateway as healthy. Navigate to Deployment and infrastructure, select Gateways, then choose your gateway. The Overview page shows the status of your gateway deployment.

And that's it! You've successfully deployed an Azure APIM self-hosted gateway in Azure Container Apps with VNet integration, giving you access to your internal APIs with easy management from the APIM portal in Azure. This setup lets you manage your APIs efficiently while leveraging the scalability and flexibility of Azure Container Apps.
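If you prefer scripted verification over curl, here is a minimal Python sketch that polls the same status endpoint. The URL is the example FQDN from this walkthrough, so substitute your own gateway's hostname and status path:

```python
# Minimal smoke test for the self-hosted gateway's health endpoint.
# The URL below is the example FQDN used in this walkthrough; replace it
# with your container app's actual hostname and status path.
import requests

GATEWAY_STATUS_URL = (
    "https://aca-apim-demo-gateway.sillytreats-abcd1234"
    ".centralus.azurecontainerapps.io/status-012345678990abcdef"
)

def gateway_is_healthy(url: str = GATEWAY_STATUS_URL, timeout: float = 10.0) -> bool:
    """Return True when the gateway's status endpoint responds with HTTP 200."""
    try:
        response = requests.get(url, timeout=timeout)
        return response.status_code == 200
    except requests.RequestException:
        return False

if __name__ == "__main__":
    print("healthy" if gateway_is_healthy() else "unhealthy")
```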
Scaling Azure Functions Python with orjson

Azure Functions now supports orjson in the Python worker, giving developers an easy way to boost performance by simply adding the library to their environment. Benchmarks show that orjson delivers measurable gains in throughput and latency, with the biggest improvements on the small-to-medium payloads common in real-world workloads. In tests, orjson improved throughput by up to 6% on 35 KB payloads and significantly reduced response times under load, while also eliminating dropped requests in high-throughput scenarios. With its Rust-based speed, standards compliance, and drop-in adoption, orjson offers a straightforward path to faster, more scalable Python Functions without any code changes, as the sketch below illustrates.
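To show how drop-in the change is, here is a minimal sketch of an HTTP-triggered function using the v2 Python programming model; the route and payload are illustrative. The function code never imports orjson - you only add the package to requirements.txt, and the worker picks it up automatically for its internal JSON handling:

```python
# requirements.txt (add one line; no code changes needed):
#   azure-functions
#   orjson

import json
import azure.functions as func

app = func.FunctionApp(http_auth_level=func.AuthLevel.ANONYMOUS)

@app.route(route="orders")
def get_orders(req: func.HttpRequest) -> func.HttpResponse:
    # Ordinary JSON handling; nothing here references orjson directly.
    payload = {"orders": [{"id": i, "status": "shipped"} for i in range(100)]}
    return func.HttpResponse(
        json.dumps(payload), mimetype="application/json", status_code=200
    )
```

Because the speedup comes from the worker's own serialization path, existing functions benefit as soon as orjson is present in the environment.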
Building Agents on Azure Container Apps with Goose AI Agent, Ollama and gpt-oss

Azure Container Apps (ACA) is redefining how developers build and deploy intelligent agents. With serverless scale, GPU-on-demand, and enterprise-grade isolation, ACA provides the ideal foundation for hosting AI agents securely and cost-effectively. Last month we highlighted how you can deploy n8n on Azure Container Apps to go from click-to-build to a running AI-based automation platform in minutes, with no complex setup or infrastructure management overhead. In this post, we're extending that same simplicity to AI agents, and we'll show why Azure Container Apps is the best platform for running open-source agentic frameworks like Goose. Whether you're experimenting with open-source models or building enterprise-grade automation, ACA gives you the flexibility and security you need.

Challenges when building and hosting AI agents

Building and running AI agents in production presents its own set of challenges. These systems often need access to proprietary data and internal APIs, making security and data governance critical, especially when agents interact dynamically with multiple tools and models. At the same time, developers need the flexibility to experiment with different frameworks without introducing operational overhead or losing isolation.

Simplicity and performance are also key. Managing scale, networking, and infrastructure can slow down iteration, while separating the agent's reasoning layer from its inference backend can introduce latency and the added complexity of managing multiple services. In short, AI agent development requires security, simplicity, and flexibility to ensure reliability and speed at scale.

Why ACA and serverless GPUs for hosting AI agents

Azure Container Apps provides a secure, flexible, and developer-friendly platform for hosting AI agents and inference workloads side by side within the same ACA environment. This unified setup gives you centralized control over network policies, RBAC, observability, and more, while ensuring that both your agentic logic and model inference run securely within one managed boundary. ACA also provides the following key benefits:

- Security and data governance: Your agent runs in your private, fully isolated environment, with complete control over identity, networking, and compliance. Your data never leaves the boundaries of your container.
- Serverless economics: Scale automatically to zero when idle and pay only for what you use - no overprovisioning, no wasted resources.
- Developer simplicity: One-command deployment, integrated with Azure identity and networking. No extra keys, infrastructure management, or manual setup required.
- Inferencing flexibility with serverless GPUs: Bring any open-source, community, or custom model. Run your inferencing apps on serverless GPUs alongside your agentic applications within the same environment. For example, running gpt-oss models via Ollama inside ACA containers avoids costly hosted inference APIs and keeps sensitive data private.

These capabilities let teams focus on innovation, not infrastructure, making ACA a natural choice for building intelligent agents.

Deploy the Goose AI Agent to ACA

The Goose AI Agent, developed by Block, is an open-source, general-purpose agent framework designed for quick deployment and easy customization. Out of the box, it supports features like email integration, GitHub interactions, and local CLI and system tool access.
It's great for building ready-to-run AI assistants that can connect to other systems, with a modular design that makes customization simple on top of solid defaults out of the box. By deploying Goose on ACA, you gain all the benefits of serverless scale, secure isolation, and GPU-on-demand, while maintaining the ability to customize and iterate quickly.

Get started: Deploy Goose on Azure Container Apps using this open-source starter template. In just a few minutes, you'll have a private, self-contained AI agent running securely on Azure Container Apps, ready to handle real-world workloads without compromise.

Goose running on Azure Container Apps: adding content to a README, submitting a PR, and sending a summary email to the team.

Additional Benefits of running Goose on ACA

Running the Goose AI Agent on Azure Container Apps (ACA) showcases how simple and powerful hosting AI agents can be.

- Always available: Goose can run continuously - handling long-lived or asynchronous workloads for hours or days - without tying up your local machine.
- Cost efficiency: ACA's pay-per-use, serverless GPU model eliminates high per-call inference costs, making it ideal for sustained or compute-intensive workloads.
- Seamless developer experience: The Goose-on-ACA starter template sets everything up for you - model server, web UI, and CLI endpoints - with no manual configuration required.

With ACA, you can go from concept to a fully running agent in minutes, without compromising on security, scalability, or cost efficiency.

Part of a Growing Ecosystem of Agentic frameworks on ACA

ACA is quickly becoming the go-to platform for containerized AI and agentic workloads. From n8n and Goose to other emerging open-source and commercial agent frameworks, developers can use ACA to experiment, scale, and secure their agents - all while taking advantage of serverless scale, GPU-on-demand, and complete network isolation. It's the same developer-first workflow that powers modern applications, now extended to intelligent agents. Whether you're building a single agent or an entire automation ecosystem, ACA provides the flexibility and reliability you need to innovate faster.
OpenAI Agent SDK Integration with Azure Durable Functions

Picture this: your agent, authored with the OpenAI Agent SDK, is halfway through analyzing 10,000 customer reviews when it hits a rate limit and dies. All that progress? Gone. Your multi-agent workflow that took 30 minutes to orchestrate? Back to square one because of a rate-limit throttle. If you've deployed AI agents in production, you probably know this frustration first-hand.

Today, we're announcing a solution that makes your agents reliable: OpenAI Agent SDK integration with Azure Durable Functions. This integration provides automatic state persistence, enabling your agents to survive any failure and continue exactly where they stopped. No more lost progress, no more starting over - just reliable agents that work.

The Challenge with AI Agents

Building AI agents that work reliably in production environments has proven to be one of the most significant challenges in modern AI development. As agent sophistication increases, with complex workflows involving multiple LLM calls, tool executions, and agent hand-offs, the likelihood of encountering failures increases too. This creates a fundamental problem for production AI systems where reliability is essential. Common failure scenarios include:

- Rate Limiting: Agents halt mid-process when hitting API rate limits during LLM calls
- Network Timeouts: Workflows terminate due to connectivity issues
- System Crashes: Multi-agent systems fail when individual components encounter errors
- State Loss: Complex workflows restart from the beginning after any interruption

Traditional approaches force developers to choose between building complex retry logic with significant code changes or accepting unreliable agent behavior. Neither option is suitable for production-grade AI systems that businesses depend on, and that's why we're introducing this integration.

Key Benefits of the OpenAI Agent SDK Integration with Azure Durable Functions

Our solution leverages the value propositions of durable execution to address these reliability challenges while preserving the familiar OpenAI Agents Python SDK developer experience. The integration enables agent invocations hosted on Azure Functions to run within durable orchestration contexts, where both agent LLM calls and tool calls are executed as durable operations. This integration delivers significant advantages for production AI systems, such as:

- Enhanced Agent Resilience: Built-in retry mechanisms for LLM calls and tool executions enable agents to automatically recover from failures and continue from their last successful step
- Multi-Agent Orchestration Reliability: Individual agent failures don't crash entire multi-agent workflows, and complex orchestrations maintain state across system restarts
- Built-in Observability: Monitor agent progress through the Durable Task Scheduler dashboard with enhanced debugging and detailed execution tracking (applicable when using the Durable Task Scheduler as the Durable Functions backend)
- Seamless Developer Experience: Keep using the OpenAI Agents SDK interface you already know, with minimal code changes required to add reliability
- Distributed Compute and Scalability: Agent workflows automatically scale across multiple compute instances
Core Integration Components

These powerful capabilities are enabled through just a few simple additions to your AI application:

- durable_openai_agent_orchestrator: Decorator that enables durable execution for agent invocations
- run_sync: Uses an existing OpenAI Agents SDK API that executes your agent with built-in durability
- create_activity_tool: Wraps tool calls as durable activities with automatic retry capabilities
- State Persistence: Maintains agentic workflow state across failures and restarts

Hello World Example

Let's see how this works in practice. Here's what code written using the OpenAI Agent SDK looks like:

```python
import asyncio
from agents import Agent, Runner

async def main():
    agent = Agent(
        name="Assistant",
        instructions="You only respond in haikus.",
    )
    result = await Runner.run(agent, "Tell me about recursion in programming.")
    print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())
```

With our added durable integration, it becomes:

```python
from agents import Agent, Runner

@app.orchestration_trigger(context_name="context")
@app.durable_openai_agent_orchestrator  # Runs the agent invocation in the context of a durable orchestration
def hello_world(context):
    agent = Agent(
        name="Assistant",
        instructions="You only respond in haikus.",
    )
    # run_sync provides synchronous execution with built-in durability
    result = Runner.run_sync(agent, "Tell me about recursion in programming.")
    return result.final_output
```

Durable Task Scheduler dashboard showcasing the agent LLM call as a durable operation.

Notice how little actually changed. We added the app.durable_openai_agent_orchestrator decorator, but your core agent logic stays the same. The run_sync method provides execution with built-in durability, enabling your agents to automatically recover from failures with minimal code changes.

When using the Durable Task Scheduler as your Durable Functions backend, you gain access to a detailed monitoring dashboard that provides visibility into your agent executions. The dashboard displays detailed inputs and outputs for both LLM calls and tool invocations, along with clear success/failure indicators, making it straightforward to diagnose and troubleshoot any unexpected behavior in your agent processes.

A note about 'run_sync'

In Durable Functions, orchestrators don't usually benefit from invoking code asynchronously because their role is to define the workflow - tracking state, scheduling activities, and so on - not to perform actual work. When you call an activity, the framework records the decision and suspends the orchestrator until the result is ready. For example, when you call run_sync, the deterministic part of the call completes almost instantly, and the LLM call activity is scheduled for asynchronous execution. Adding extra asynchronous code inside the orchestrator doesn't improve performance; it only breaks determinism and complicates replay.

Reliable Tool Invocation Example

For agents requiring tool interactions, there are two implementation approaches. The first option uses the @function_tool decorator from the OpenAI Agent SDK, which executes directly within the context of the durable orchestration. When using this approach, your tool functions must follow the deterministic-code constraints of Durable Functions orchestrations. Additionally, since these functions run within the orchestration itself, they may be replayed as part of normal operations, making cost-conscious implementation necessary.
```python
from pydantic import BaseModel
from agents import Agent, Runner, function_tool

class Weather(BaseModel):
    city: str
    temperature_range: str
    conditions: str

@function_tool
def get_weather(city: str) -> Weather:
    """Get the current weather information for a specified city."""
    print("[debug] get_weather called")
    return Weather(
        city=city, temperature_range="14-20C", conditions="Sunny with wind."
    )

@app.orchestration_trigger(context_name="context")
@app.durable_openai_agent_orchestrator
def tools(context):
    agent = Agent(
        name="Hello world",
        instructions="You are a helpful agent.",
        tools=[get_weather],
    )
    result = Runner.run_sync(agent, input="What's the weather in Tokyo?")
    return result.final_output
```

The second approach uses the create_activity_tool function, which is designed for non-deterministic code or scenarios where rerunning the tool is expensive (in terms of performance or cost). This approach executes the tool within the context of a durable orchestration activity, providing enhanced monitoring through the Durable Task Scheduler dashboard and ensuring that expensive operations are not unnecessarily repeated during orchestration replays.

```python
from pydantic import BaseModel
from agents import Agent, Runner

class Weather(BaseModel):
    city: str
    temperature_range: str
    conditions: str

@app.orchestration_trigger(context_name="context")
@app.durable_openai_agent_orchestrator
def weather_expert(context):
    agent = Agent(
        name="Hello world",
        instructions="You are a helpful agent.",
        tools=[
            context.create_activity_tool(get_weather)
        ],
    )
    result = Runner.run_sync(agent, "What is the weather in Tokyo?")
    return result.final_output

@app.activity_trigger(input_name="city")
async def get_weather(city: str) -> Weather:
    weather = Weather(
        city=city, temperature_range="14-20C", conditions="Sunny with wind."
    )
    return weather
```

Leveraging Durable Functions Stateful App Patterns

Beyond basic agent durability, this integration provides access to the full Durable Functions orchestration context, enabling developers to implement sophisticated stateful application patterns when needed, such as:

- External Event Handling: Use context.wait_for_external_event() for human approvals, external system callbacks, or time-based triggers
- Fan-out/Fan-in: Coordinate multiple tasks (including sub-orchestrations invoking agents) in parallel; see the sketch after the approval example below
- Long-running Workflows: Implement workflows that span hours, days, or weeks with persistent state
- Conditional Logic: Build dynamic agent workflows based on runtime decisions and external inputs

Human Interaction and Approval Workflows Example

For scenarios requiring human oversight, you can leverage the orchestration context to implement approval workflows:

```python
@app.orchestration_trigger(context_name="context")
@app.durable_openai_agent_orchestrator
def agent_with_approval(context):
    # Run initial agent analysis
    agent = Agent(name="DataAnalyzer", instructions="Analyze the provided dataset")
    initial_result = Runner.run_sync(agent, context.get_input())

    # Wait for human approval before proceeding
    approval_event = context.wait_for_external_event("approval_received")

    if approval_event.get("approved"):
        # Continue with next phase
        final_agent = Agent(name="Reporter", instructions="Generate final report")
        final_result = Runner.run_sync(final_agent, initial_result.final_output)
        return final_result.final_output
    else:
        return "Workflow cancelled by user"
```

This flexibility allows you to build sophisticated agentic applications that combine the power of AI agents with enterprise-grade workflow orchestration patterns, all while maintaining the familiar OpenAI Agents SDK experience.
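As referenced in the patterns list above, here is a minimal fan-out/fan-in sketch. It uses the standard Durable Functions Python APIs (call_activity, task_all) that these agent orchestrations build on; the analyze_review activity and the review list are hypothetical placeholders, and in practice each parallel branch could be a sub-orchestration invoking an agent:

```python
import azure.durable_functions as df

@app.orchestration_trigger(context_name="context")
def fan_out_fan_in(context: df.DurableOrchestrationContext):
    reviews = context.get_input() or []

    # Fan out: schedule one activity per review; each runs as a durable,
    # independently retryable unit of work.
    tasks = [context.call_activity("analyze_review", review) for review in reviews]

    # Fan in: the orchestration resumes only when every parallel task completes,
    # and already-completed results survive restarts thanks to durable state.
    results = yield context.task_all(tasks)
    return results

@app.activity_trigger(input_name="review")
def analyze_review(review: str) -> str:
    # Placeholder analysis; this could invoke an agent or an LLM call instead.
    return f"analyzed: {review[:40]}"
```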
Get Started Today

This article only scratches the surface of what's possible with the OpenAI Agent SDK integration for Durable Functions. The combination of familiar OpenAI Agents SDK patterns with added reliability opens new possibilities for building sophisticated AI systems that can handle real-world production workloads. The integration is designed for a smooth onboarding experience: begin by selecting one of your existing agents and applying the transformation patterns demonstrated above (often requiring just a few lines of code changes).

Documentation: https://aka.ms/openai-agents-with-reliability-docs
Sample Applications: https://aka.ms/openai-agents-with-reliability-samples