serverless

232 Topics

Search Less, Build More: Inner Sourcing with GitHub Copilot and ADO MCP Server
Developers burn cycles context‑switching: opening five repos to find a logging example, searching a wiki for a data masking rule, scrolling chat history for the latest pipeline pattern. Organisations that I speak to are often on the path of transformational platform engineering projects but always have the fear or doubt of "what if my engineers don't use these resources". While projects like Backstage still play a pivotal role in inner sourcing and discoverability I also empathise with developers who would argue "How would I even know in the first place, which modules have or haven't been created for reuse". In this blog we explore how we can ensure organisational standards and developer satisfaction without any heavy lifting on either side, no custom model training, no rewriting or relocating of repositories and no stagnant local data. Using GitHub Copilot + Azure DevOps MCP server (with the free `code_search` extension) we turn the IDE into an organizational knowledge interface. Instead of guessing or re‑implementing, engineers can start scaffolding projects or solving issues as they would normally (hopefully using Copilot) and without extra prompting. GitHub Copilot can lean into organisational standards and ensure recommendations are made with code snippets directly generated from existing examples. What Is the Azure DevOps MCP Server + code_search Extension? MCP (Model Context Protocol) is an open standard that lets agents (like GitHub Copilot) pull in structured, on-demand context from external systems. MCP servers contain natural language explanations of the tools that the agent can utilise allowing dynamic decision making of when to implement certain toolsets over others. The Azure DevOps MCP Server is the ADO Product Team's implementation of that standard. It exposes your ADO environment in a way Copilot can consume. Out of the box it gives you access to: Projects – list and navigate across projects in your organization. Repositories – browse repos, branches, and files. Work items – surface user stories, bugs, or acceptance criteria. Wiki's – pull policies, standards, and documentation. This means Copilot can ground its answers in live ADO content, instead of hallucinating or relying only on what’s in the current editor window. The ADO server runs locally from your own machine to ensure that all sensitive project information remains within your secure network boundary. This also means that existing permissions on ADO objects such as Projects or Repositories are respected. Wiki search tooling available out of the box with ADO MCP server is very useful however if I am honest I have seen these wiki's go unused with documentation being stored elsewhere either inside the repository or in a project management tool. This means any tool that needs to implement code requires the ability to accurately search the code stored in the repositories themself. That is where the code_search extension enablement in ADO is so important. Most organisations have this enabled already however it is worth noting that this pre-requisite is the real unlock of cross-repo search. This allows for Copilot to: Query for symbols, snippets, or keywords across all repos. Retrieve usage examples from code, not just docs. Locate standards (like logging wrappers or retry policies) wherever they live. Back every recommendation with specific source lines. In short: MCP connects Copilot to Azure DevOps. code_search makes that connection powerful by turning it into a discovery engine. What is the relevance of Copilot Instructions? One of the less obvious but most powerful features of GitHub Copilot is its ability to follow instructions files. Copilot automatically looks for these files and uses them as a “playbook” for how it should behave. There are different types of instructions you can provide: Organisational instructions – apply across your entire workspace, regardless of which repo you’re in. Repo-specific instructions – scoped to a particular repository, useful when one project has unique standards or patterns. Personal instructions – smaller overrides layered on top of global rules when a local exception applies. (Stored in .github/copilot-instructions.md) In this solution, I’m using a single personal instructions file. It tells Copilot: When to search (e.g., always query repos and wikis before answering a standards question). Where to look (Azure DevOps repos, wikis, and with code_search, the code itself). How to answer (responses must cite the repo/file/line or wiki page; if no source is found, say so). How to resolve conflicts (prefer dated wiki entries over older README fragments). As a small example, a section of a Copilot instruction file could look like this: # GitHub Copilot Instructions for Azure DevOps MCP Integration This project uses Azure DevOps with MCP server integration to provide organizational context awareness. Always check to see if the Azure DevOps MCP server has a tool relevant to the user's request. ## Core Principles ### 1. Azure DevOps Integration - **Always prioritize Azure DevOps MCP tools** when users ask about: - Work items, stories, bugs, tasks - Pull requests and code reviews - Build pipelines and deployments - Repository operations and branch management - Wiki pages and documentation - Test plans and test cases - Project and team information ### 2. Organizational Context Awareness - Before suggesting solutions, **check existing organizational patterns** by: - Searching code across repositories for similar implementations - Referencing established coding standards and frameworks - Looking for existing shared libraries and utilities - Checking architectural decision records (ADRs) in wikis ### 3. Cross-Repository Intelligence - When providing code suggestions: - **Search for existing patterns** in other repositories first - **Reference shared libraries** and common utilities - **Maintain consistency** with organizational standards - **Suggest reusable components** when appropriate ## Tool Usage Guidelines ### Work Items and Project Management When users mention bugs, features, tasks, or project planning: ``` ✅ Use: wit_my_work_items, wit_create_work_item, wit_update_work_item ✅ Use: wit_list_backlogs, wit_get_work_items_for_iteration ✅ Use: work_list_team_iterations, core_list_projects The result... To test this I created 3 ADO Projects each with between 1-2 repositories. The repositories were light with only ReadMe's inside containing descriptions of the "repo" and some code snippets examples for usage. I have then created a brand-new workspace with no context apart from a Copilot instructions document (which could be part of a repo scaffold or organisation wide) which tells Copilot to search code and the wikis across all ADO projects in my demo environment. It returns guidance and standards from all available repo's and starts to use it to formulate its response. In the screenshot I have highlighted some key parts with red boxes. The first being a section of the readme that Copilot has identified in its response, that part also highlighted within CoPilot chat response. I have highlighted the rather generic prompt I used to get this response at the bottom of that window too. Above I have highlighted Copilot using the MCP server tooling searching through projects, repo's and code. Finally the largest box highlights the instructions given to Copilot on how to search and how easily these could be optimised or changed depending on the requirements and organisational coding standards. How did I implement this? Implementation is actually incredibly simple. As mentioned I created multiple projects and repositories within my ADO Organisation in order to test cross-project & cross-repo discovery. I then did the following: Enable code_search - in your Azure DevOps organization (Marketplace → install extension). Login to Azure - Use the AZ CLI to authenticate to Azure with "az login". Create vscode/mcp.json file - Snippet is provided below, the organisation name should be changed to your organisations name. Start and enable your MCP server - In the mcp.json file you should see a "Start" button. Using the snippet below you will be prompted to add your organisation name. Ensure your Copilot agent has access to the server under "tools" too. View this setup guide for full setup instructions (azure-devops-mcp/docs/GETTINGSTARTED.md at main · microsoft/azure-devops-mcp) Create a Copilot Instructions file - with a search-first directive. I have inserted the full version used in this demo at the bottom of the article. Experiment with Prompts – Start generic (“How do we secure APIs?”). Review the output and tools used and then tailor your instructions. Considerations While this is a great approach I do still have some considerations when going to production. Latency - Using MCP tooling on every request will add some latency to developer requests. We can look at optimizing usage through copilot instructions to better identify when Copilot should or shouldn't use the ADO MCP server. Complex Projects and Repositories - While I have demonstrated cross project and cross repository retrieval my demo environment does not truly simulate an enterprise ADO environment. Performance should be tested and closely monitored as organisational complexity increases. Public Preview - The ADO MCP server is moving quickly but is currently still public preview. We have demonstrated in this article how quickly we can make our Azure DevOps content discoverable. While their are considerations moving forward this showcases a direction towards agentic inner sourcing. Feel free to comment below how you think this approach could be extended or augmented for other use cases! Resources MCP Server Config (/.vscode/mcp.json) { "inputs": [ { "id": "ado_org", "type": "promptString", "description": "Azure DevOps organization name (e.g. 'contoso')" } ], "servers": { "ado": { "type": "stdio", "command": "npx", "args": ["-y", "@azure-devops/mcp", "${input:ado_org}"] } } } Copilot Instructions (/.github/copilot-instructions.md) # GitHub Copilot Instructions for Azure DevOps MCP Integration This project uses Azure DevOps with MCP server integration to provide organizational context awareness. Always check to see if the Azure DevOps MCP server has a tool relevant to the user's request. ## Core Principles ### 1. Azure DevOps Integration - **Always prioritize Azure DevOps MCP tools** when users ask about: - Work items, stories, bugs, tasks - Pull requests and code reviews - Build pipelines and deployments - Repository operations and branch management - Wiki pages and documentation - Test plans and test cases - Project and team information ### 2. Organizational Context Awareness - Before suggesting solutions, **check existing organizational patterns** by: - Searching code across repositories for similar implementations - Referencing established coding standards and frameworks - Looking for existing shared libraries and utilities - Checking architectural decision records (ADRs) in wikis ### 3. Cross-Repository Intelligence - When providing code suggestions: - **Search for existing patterns** in other repositories first - **Reference shared libraries** and common utilities - **Maintain consistency** with organizational standards - **Suggest reusable components** when appropriate ## Tool Usage Guidelines ### Work Items and Project Management When users mention bugs, features, tasks, or project planning: ``` ✅ Use: wit_my_work_items, wit_create_work_item, wit_update_work_item ✅ Use: wit_list_backlogs, wit_get_work_items_for_iteration ✅ Use: work_list_team_iterations, core_list_projects ``` ### Code and Repository Operations When users ask about code, branches, or pull requests: ``` ✅ Use: repo_list_repos_by_project, repo_list_pull_requests_by_repo ✅ Use: repo_list_branches_by_repo, repo_search_commits ✅ Use: search_code for finding patterns across repositories ``` ### Documentation and Knowledge Sharing When users need documentation or want to create/update docs: ``` ✅ Use: wiki_list_wikis, wiki_get_page_content, wiki_create_or_update_page ✅ Use: search_wiki for finding existing documentation ``` ### Build and Deployment When users ask about builds, deployments, or CI/CD: ``` ✅ Use: pipelines_get_builds, pipelines_get_build_definitions ✅ Use: pipelines_run_pipeline, pipelines_get_build_status ``` ## Response Patterns ### 1. Discovery First Before providing solutions, always discover organizational context: ``` "Let me first check what patterns exist in your organization..." → Search code, check repositories, review existing work items ``` ### 2. Reference Organizational Standards When suggesting code or approaches: ``` "Based on patterns I found in your [RepositoryName] repository..." "Following your organization's standard approach seen in..." "This aligns with the pattern established in [TeamName]'s implementation..." ``` ### 3. Actionable Integration Always offer to create or update Azure DevOps artifacts: ``` "I can create a work item for this enhancement..." "Should I update the wiki page with this new pattern?" "Let me link this to the current iteration..." ``` ## Specific Scenarios ### New Feature Development 1. **Search existing repositories** for similar features 2. **Check architectural patterns** and shared libraries 3. **Review related work items** and planning documents 4. **Suggest implementation** based on organizational standards 5. **Offer to create work items** and documentation ### Bug Investigation 1. **Search for similar issues** across repositories and work items 2. **Check related builds** and recent changes 3. **Review test results** and failure patterns 4. **Provide solution** based on organizational practices 5. **Offer to create/update** bug work items and documentation ### Code Review and Standards 1. **Compare against organizational patterns** found in other repositories 2. **Reference coding standards** from wiki documentation 3. **Suggest improvements** based on established practices 4. **Check for reusable components** that could be leveraged ### Documentation Requests 1. **Search existing wikis** for related content 2. **Check for ADRs** and technical documentation 3. **Reference patterns** from similar projects 4. **Offer to create/update** wiki pages with findings ## Error Handling If Azure DevOps MCP tools are not available or fail: 1. **Inform the user** about the limitation 2. **Provide alternative approaches** using available information 3. **Suggest manual steps** for Azure DevOps integration 4. **Offer to help** with configuration if needed ## Best Practices ### Always Do: - ✅ Search organizational context before suggesting solutions - ✅ Reference existing patterns and standards - ✅ Offer to create/update Azure DevOps artifacts - ✅ Maintain consistency with organizational practices - ✅ Provide actionable next steps ### Never Do: - ❌ Suggest solutions without checking organizational context - ❌ Ignore existing patterns and implementations - ❌ Provide generic advice when specific organizational context is available - ❌ Forget to offer Azure DevOps integration opportunities --- **Remember: The goal is to provide intelligent, context-aware assistance that leverages the full organizational knowledge base available through Azure DevOps while maintaining development efficiency and consistency.**
owaino
Dec 16, 2025 Place Apps on Azure Blog
1.2KViews
1like
3Comments
[Design Pattern] Handling race conditions and state in serverless data pipelines
Hello community, I recently faced a tricky data engineering challenge involving a lot of Parquet files (about 2 million records) that needed to be ingested, transformed, and split into different entities. The hard part wasn't the volume, but the logic. We needed to generate globally unique, sequential IDs for specific columns while keeping the execution time under two hours. We were restricted to using only Azure Functions, ADF, and Storage. This created a conflict: we needed parallel processing to meet the time limit, but parallel processing usually breaks sequential ID generation due to race conditions on the counters. I documented the three architecture patterns we tested to solve this: Sequential processing with ADF (Safe, but failed the 2-hour time limit). 2. Parallel processing with external locking/e-tags on Table Storage (Too complex and we still hit issues with inserts). 3. A "Fan-Out/Fan-In" pattern using Azure Durable Functions and Durable Entities. We ended up going with Durable Entities. Since they act as stateful actors, they allowed us to handle the ID counter state sequentially in memory while the heavy lifting (transformation) ran in parallel. It solved the race condition issue without killing performance. I wrote a detailed breakdown of the logic and trade-offs here if anyone is interested in the implementation details: https://medium.com/@yahiachames/data-ingestion-pipeline-a-data-engineers-dilemma-and-azure-solutions-7c4b36f11351 I am curious if others have used Durable Entities for this kind of ETL work, or if you usually rely on an external database sequence to handle ID generation in serverless setups? Thanks, Chameseddine
Chameseddine
Dec 15, 2025 Place Azure Architecture
16Views
0likes
1Comment
Host remote MCP servers on Azure Functions
Model Context Protocol (MCP) servers allow AI agents to access external tools, data, and systems, greatly extending the capability and power of agents. When you’re ready to expose your MCP servers externally, within your organization or to the world, it’s important that the servers are run in a secure, scalable, and reliable environment. Azure Functions provides such a robust platform for hosting your remote MCP servers, offering high scalability with the Flex Consumption plan, built‑in authentication feature for Microsoft Entra and OAuth, and a serverless billing model. The platform also offers two hosting options for added flexibility and convenience. The options allow for hosting of MCP servers built with Azure Functions MCP extension or the official MCP SDKs. Azure Functions MCP Extension (GA) The MCP extension allows you to build and host servers using Azure Functions programming model, i.e. using triggers and bindings. The MCP tool trigger allows you to focus on implementing tools you want to expose, instead of worrying about handling protocol and server logistics. The MCP extension launched as public preview back in April and is now generally available, with support for .NET, Java, JavaScript, Python, and Typescript. New features in the extension Support for streamable-http transport Support for the newer streamable-http transport is added to the extension. Unless your client specifically requires the older Server-Sent Events (SSE) transport, you should use the streamable-http. The two transports have different endpoints in the extension: Transport Endpoint Streamable HTTP /runtime/webhooks/mcp Server-Sent Events (SSE) /runtime/webhooks/mcp/sse Defining server information You can use the extensions.mcp section in host.json to define MCP server information. { "version": "2.0", "extensions": { "mcp": { "instructions": "Some test instructions on how to use the server", "serverName": "TestServer", "serverVersion": "2.0.0", "encryptClientState": true, "messageOptions": { "useAbsoluteUriForEndpoint": false }, "system": { "webhookAuthorizationLevel": "System" } } } } Built-in server authentication and authorization The built-in feature implements the requirements of the MCP authorization protocol, such as issuing 401 challenge and hosting the Protected Resource Metadata document. You can configure it to use identity providers like Microsoft Entra for server authentication. In addition to server authenticating, you can also leverage this feature to implement on-behalf-of (OBO) auth flows where the client invokes a tool that accesses some downstream services on-behalf-of the user. Learn more about the built-in authentication and authorization feature. Mavin Build Plugin for Java For Java applications, the Maven Build Plugin (version 1.40.0) parses and verifies MCP tool annotations during build time. This process automatically generates the correct MCP extension configuration, ensuring that the MCP tool defined by the user is properly set up. The build-time analysis is especially beneficial for Java apps, as it allows developers to utilize the MCP extension without concerns about increased cold start times. We'll continuously enhance the plugin’s capabilities. Upcoming improvements, such as property type inference, will reduce manual configuration and make it even easier to use the McpToolTrigger. Get started Checkout the quickstarts to get an MCP extension server deployed in minutes: C# (.NET) remote-mcp-functions-dotnet Python remote-mcp-functions-python TypeScript (Node.js) remote-mcp-functions-typescript Java remote-mcp-functions-java References Learn more about the MCP extension and tool trigger in official documentations. Self‑hosted MCP server (public preview) In addition to the MCP extension, Azure Functions also supports hosting MCP servers implemented with the official SDKs. This is a suitable option for teams that have existing SDK‑based servers or who favor the SDK experience over the Functions programming model. There is no need to modify your server code; you can lift and shift these MCP servers to Azure Functions— which is why they are termed self‑hosted. The hosting capability supports the following features: Stateless servers that use the streamable-http transport. If you need your server to be stateful, consider using the Functions MCP extension for now. Servers implemented with Python, TypeScript, C#, or Java MCP SDK. Built-in server authentication and authorization like the MCP extension Hosting requirement Self-hosted MCP servers are deployed to the Azure Functions platform as custom handlers. You can think of custom handlers as lightweight web servers that receive events from the Functions host. The only requirement for hosting the MCP server is a file called host.json. Add this file to your project root to tell Functions how to run the server. An example host.json for a Python server looks like: { "version": "2.0", "configurationProfile": "mcp-custom-handler", "customHandler": { "description": { "defaultExecutablePath": "python", "arguments": ["path to main python script, e.g. hello.py"] }, "port": "8000" } } Get started Check out quickstarts to get your self-hosted MCP server deployed in minutes: C# (.NET) mcp-sdk-functions-hosting-dotnet Python mcp-sdk-functions-hosting-python TypeScript (Node.js) mcp-sdk-functions-hosting-node Java Coming soon! References Read the official documentation of self-hosted MCP servers and learn about integrations with Azure services like Foundry and API Center. For .NET developers - check out the overview of self-hosted MCP servers from the recent .NET Conference! We’d love to hear from you! Let us know your thoughts about hosting remote MCP server on Azure Functions. Does either of the options meet your needs? What other MCP features are you looking for? Let us know what you’d like us to prioritize next!
lily-ma
Dec 02, 2025 Place Apps on Azure Blog
744Views
3likes
1Comment
Build AI agent tools using remote MCP with Azure Functions
Today, we’re pleased to share an early experimental preview of triggers and bindings that allow you to build remote tools using Model Context Protocol (MCP) with Azure Functions.
MatthewHenderson
Dec 02, 2025 Place Apps on Azure Blog
19KViews
6likes
4Comments
Build full-stack Next.js apps with Azure Static Web Apps
Next.js is the most popular hybrid rendering frontend framework, and is the most popular React framework as well. With Azure Static Web Apps’ recently announced improved support for Next.js, we can easily deploy and host our Next.js applications on Azure, while leveraging Next.js’ recent features such as React Server Components and Server Actions. In this article, we’ll build and deploy a Next.js application to Azure Static Web Apps (and you can follow along with the free plan!).
thomasgauvinmsft
Dec 01, 2025 Place Apps on Azure Blog
8.8KViews
3likes
2Comments
Durable Task Extension for Microsoft Agent Framework で、堅牢なエージェントを構築する
（これは 2025/11/13 に出された製品チームの記事『Bulletproof agents with the durable task extension for Microsoft Agent Framework』を日本語に翻訳したものです。）本日 (2025/11/13)、Durable Task Extension for Microsoft Agent Framework のパブリックプレビューを発表できることを大変うれしく思います。この拡張機能は、Azure Durable Functions の実績ある耐久性のある実行 (durable execution) (クラッシュや再起動に耐える) と分散実行 (複数インスタンスで動作する) 機能を、Microsoft Agent Framework に直接組み込むことで、本番環境対応の、堅牢でスケーラブルな AI エージェントの構築方法を一新します。これにより、セッション管理、障害復旧、スケーリングを自動的に処理する、ステートフルで堅牢な AI エージェントを Azure にデプロイでき、開発者はエージェントのロジックに完全に集中できるようになります。たとえば、複数日にわたる会話でコンテキストを維持するカスタマーサービスエージェント、人間による承認 (human-in-the-loop approval workflow) を含むコンテンツパイプライン、または専門的な AI モデルを連携させる完全自動化されたマルチエージェントシステムを構築する場合でも、この Durable Task Extension for Microsoft Agent Framework は、サーバーレスのシンプルさで本番レベルの信頼性、スケーラビリティ、そして調整機能を提供します。 Durable Task Extension の主な機能：サーバーレスホスティング (Serverless Hosting)：Azure Functions 上にエージェントをデプロイし、数千のインスタンスからゼロまで自動スケーリングを実現しながら、サーバーレスアーキテクチャの利点を維持したまま完全な制御を保持します。自動セッション管理 (Automatic Session Management)：エージェントは、プロセスのクラッシュや再起動、インスタンス間の分散実行に耐える、完全な会話コンテキストを保持した永続的なセッションを維持します。決定的なマルチエージェントオーケストレーション (Deterministic Multi-Agent Orchestrations)：コードで制御された、予測可能かつ再現性のある実行パターンで、特化した (specialized) durable agents を組み合わせて動作させる。 (訳註１：「決定的な (deterministic)」とは、同じ入力に対しては常に同じ結果を返すもので、その動作が予測可能なものを指します) (訳註２：「durable agent」とは、このフレームワークのエージェントをそう呼んでおり、普通のエージェントと違ってDurable な性質を持っているエージェントのことを指します) サーバーレスによるコスト削減を伴う Human-in-the-Loop (Human-in-the-Loop with Serverless Cost Savings)：人間の入力を待つ間、コンピュートリソースを消費せず、コストも発生しません。 Durable Task Scheduler による組み込みの可観測性 (Built-in Observability with Durable Task Scheduler)：Durable Task Scheduler の UI ダッシュボードを通じて、エージェントの操作やオーケストレーションを深く可視化できます。 Durable Agent を作成して実行してみる公式ドキュメント https://aka.ms/create-and-run-durable-agent コードサンプル (Python/C#) # Python endpoint = os.getenv("AZURE_OPENAI_ENDPOINT") deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini") # 標準的な Microsoft Agent Framework パターンに従って AI エージェントを作成します agent = AzureOpenAIChatClient( endpoint=endpoint, deployment_name=deployment_name, credential=AzureCliCredential() ).create_agent( instructions="""あなたは、どんなテーマに対しても読みやすく構造化された、魅力的なドキュメントを作成するプロフェッショナルなコンテンツライターです。テーマが与えられたら、次の手順で進めてください。 1. Web 検索ツールを使ってテーマをリサーチする 2. ドキュメントのアウトラインを生成する 3. 適切な書式で説得力のあるドキュメントを書く 4. 関連する例と出典（引用）を含める""", name="DocumentPublisher", tools=[ AIFunctionFactory.Create(search_web), AIFunctionFactory.Create(generate_outline) ] ) # Durable なセッション管理でエージェントをホストするように Function アプリを構成します app = AgentFunctionApp(agents=[agent]) app.run() // C# var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT"); var deploymentName = Environment.GetEnvironmentVariable("AZURE_OPENAI_DEPLOYMENT") ?? "gpt-4o-mini"; // 標準的な Microsoft Agent Framework パターンに従って AI エージェントを作成します AIAgent agent = new AzureOpenAIClient(new Uri(endpoint), new DefaultAzureCredential()) .GetChatClient(deploymentName) .CreateAIAgent( instructions: """ あなたは、どんなテーマに対しても読みやすく構造化された、魅力的なドキュメントを作成するプロフェッショナルなコンテンツライターです。テーマが与えられたら、次の手順で進めてください。 1.Web 検索ツールを使ってテーマをリサーチする 2.ドキュメントのアウトラインを生成する 3.適切な書式で説得力のあるドキュメントを書く 4.関連する例と出典（引用）を含める """, name: "DocumentPublisher", tools: [ AIFunctionFactory.Create(SearchWeb), AIFunctionFactory.Create(GenerateOutline) ]); // Durable なスレッド管理でエージェントをホストするように Functions アプリを構成します // これにより、HTTP エンドポイントが自動で作成され、状態の永続化が管理されます using IHost app = FunctionsApplication .CreateBuilder(args) .ConfigureFunctionsWebApplication() .ConfigureDurableAgents(options => options.AddAIAgent(agent) ) .Build(); app.Run(); なぜ Durable Task Extension が必要なのか AI エージェントが、単純なチャットボットから、複雑で長時間実行されるタスクを処理する高度なシステムへと進化するにつれて、新たな課題が浮上します。会話が数日から数週間にわたるため、プロセスの再起動やクラッシュ、障害を超えて状態を保持する必要があります。ツール呼び出しが通常のタイムアウトを超える時間を要する場合があり、自動チェックポイントと復旧が必要です。大量のワークロードに対応するため、数千のエージェント会話を同時に処理できるよう、分散インスタンス間での弾力的なスケーリングが求められます。複数の専門エージェントを、信頼性の高いビジネスプロセスのために、予測可能で再現可能な実行パターンで調整する必要があります。エージェントは、処理を進める前に人間の承認を待つ必要がある場合があり、その間は理想的にはリソースを消費しない (課金されない) ことが望まれます。 Durable Extension は、Azure Durable Functions の機能を Microsoft Agent Framework に拡張することで、これらの課題に対応します。これにより、障害に耐え、弾力的にスケールし、耐久性と分散実行によって予測可能に動作する AI エージェントを構築できます。 4 つの柱 : 4D この拡張機能は、4 つの基本的な価値の柱、通称「4D」に基づいて構築されています。 Durability (耐久性) すべてのエージェントの状態変更（メッセージ、ツール呼び出し、意思決定）は、自動的に耐久性のあるチェックポイントとして保存されます。エージェントは、インフラ更新やクラッシュから復旧し、長時間の待機中にメモリからアンロードされてもコンテキストを失わずに再開できます。これは、長時間実行される処理や外部イベントを待機するエージェントに不可欠です。 Distributed (分散型の) エージェントの実行はすべてのインスタンスで利用可能であり、弾力的なスケーリングと自動フェイルオーバーを実現します。正常なノードは、障害が発生したインスタンスの作業をシームレスに引き継ぎ、継続的な運用を保証します。この分散実行モデルにより、数千のステートフルエージェントがスケールアップし、並列で動作できます。 Deterministic (決定性) エージェントのオーケストレーションは、通常のコードとして記述された命令型ロジックを使用して予測可能に実行されます。実行パスを定義することで、自動テスト、検証可能なガードレール、ステークホルダーが信頼できるビジネスクリティカルなワークフローを実現します。必要に応じて明示的な制御フローを提供し、エージェント主導のワークフローを補完します。 Debuggability (デバッグしやすさ) IDE、デバッガー、ブレークポイント、スタックトレース、単体テストなどの馴染みのある開発ツールやプログラミング言語を使用して開発・デバッグできます。エージェントとそのオーケストレーションはコードとして表現されるため、テスト、デバッグ、保守が容易です。実際の機能の動作サーバーレスホスティング (Serverless hosting) エージェントを Azure Functions （近日中に他の Azure サービスにも拡張予定）にデプロイし、使用していないときはゼロまで、使用時は数千インスタンスまで自動スケーリングします。消費したコンピューティングリソースに対してのみ料金を支払います。このコードファーストのデプロイ手法により、サーバーレスアーキテクチャの利点を維持しながら、コンピュート環境 (compute environment) を完全に制御できます。 # Python endpoint = os.getenv("AZURE_OPENAI_ENDPOINT") deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini") # 標準的な Microsoft Agent Framework パターンに従って AI エージェントを作成します agent = AzureOpenAIChatClient( endpoint=endpoint, deployment_name=deployment_name, credential=AzureCliCredential() ).create_agent( instructions="""あなたは、どんなテーマに対しても読みやすく構造化された、魅力的なドキュメントを作成するプロフェッショナルなコンテンツライターです。テーマが与えられたら、次の手順で進めてください。 1. Web 検索ツールを使ってテーマをリサーチする 2. ドキュメントのアウトラインを生成する 3. 適切な書式で説得力のあるドキュメントを書く 4. 関連する例と出典（引用）を含める""", name="DocumentPublisher", tools=[ AIFunctionFactory.Create(search_web), AIFunctionFactory.Create(generate_outline) ] ) # Durable なセッション管理でエージェントをホストするように Function アプリを構成します app = AgentFunctionApp(agents=[agent]) app.run() Automatic session management（自動セッション管理）エージェントのセッションは、Function アプリで構成した耐久性のあるストレージに自動的にチェックポイントされ、複数インスタンス間での耐久性と分散実行を可能にします。中断やプロセス障害の後でも、どのインスタンスからでもエージェントの実行を再開でき、継続的な運用が保証されます。内部的には、エージェントは Durable Entities として実装されています。これらは、実行間で状態を保持するステートフルなオブジェクトです。このアーキテクチャにより、各エージェントセッションは、会話履歴とコンテキストを保持した信頼性の高い長寿命のエンティティとして機能します。シナリオ例: 複数日から数週間にわたる複雑なサポート案件を処理するカスタマーサービスエージェント。エージェントが再デプロイされたり、別のインスタンスに移動した場合でも、会話履歴、コンテキスト、進捗は保持されます。 # 最初の対話 - ドキュメント作成用の新しいスレッドを開始 curl -X POST https://your-function-app.azurewebsites.net/api/agents/DocumentPublisher/threads \ -H "Content-Type: application/json" \ -d '{"message": "Azure Functions の利点についてのドキュメントを作成してください"}' # レスポンスにはスレッド ID と初期のドキュメントのアウトライン／下書きが含まれます # {"threadId": "doc789", "response": "Azure Functions の利点に関する網羅的なドキュメントを作成します。最新情報を検索します… [ドキュメント下書き] # Azure Functions の利点\n\n## はじめに\nAzure Functions は、インフラ管理なしでイベント駆動のコードを実行できるサーバーレスのコンピュートサービスです…\n\n## コスト効率\n- 実行時間に対してのみ支払う\n- アイドル状態のリソースには料金がかからない\n- 自動スケーリングにより過剰プロビジョニングを削減…\n\n## 開発者の生産性\n- 複数言語のサポート（C#, Python, JavaScript, Java）\n- 統合開発ツールと CI/CD …\n\n## スケーラビリティ\n- 需要に基づく自動スケーリング\n- 何百万ものリクエストをシームレスに処理…\n\n参考文献: [Azure ドキュメント、サーバーレス計算に関する研究]"} # 2 回目の対話 - 同じスレッドでドキュメントを改善 curl -X POST https://your-function-app.azurewebsites.net/api/agents/DocumentPublisher/threads/doc789 \ -H "Content-Type: application/json" \ -d '{"message": "他の Azure サービスとの統合に関するセクションを追加してもらえますか？"}' # エージェントは Azure Functions ドキュメントのコンテキストを保持し、要求されたセクションを追加します # {"threadId": "doc789", "response": "Azure Functions ドキュメントに、包括的な統合セクションを追加しました:\n\n## Azure サービスとの統合\n\n### Azure Storage\nBlob Storage、Queue Storage、Table Storage へのトリガーとバインディングにより、イベント駆動アーキテクチャをシームレスに実現…\n\n### Azure Event Grid と Event Hubs\nリアルタイムのイベントストリームを処理し、スケール可能な Pub/Sub パターンを実装…\n\n### Azure Cosmos DB\nドキュメントデータベース操作向けの組み込みバインディングと、変更フィードの自動処理…\n\n### Azure Service Bus\nエンタープライズメッセージング機能による信頼性の高いメッセージ処理…\n\n### Azure AI Services\nOpenAI、Cognitive Services、AI Search を容易に統合してインテリジェントなアプリケーションを実現…\n\nこのセクションはスケーラビリティのセクションの後に追加されています。ユースケースやデプロイのベストプラクティスも追加しましょうか？"} (訳者注：11/20 現在、上記のエンドポイント URL やリクエスト、レスポンスの形式は変更されています。この記事ではオリジナル記事の時のままの記載にしていますが、今後も (現在まだ preview 版で) 変わる可能性があるため、最新の情報は公式ドキュメントを参照してください：https://aka.ms/create-and-run-durable-agent ) Deterministic multi-agent orchestrations（決定的なマルチエージェントオーケストレーション）命令型コードを使用して、複数の専門的な durable agents を調整します。この場合、制御フローは開発者が定義します。これは、エージェントが次のステップを決定するエージェント主導のワークフローとは異なります。決定的オーケストレーションは、自動チェックポイントと復旧を備えた予測可能で再現可能な実行パターンを提供します。シナリオ例: メール処理システムで、まずスパム検出エージェントを使用し、その分類に基づいて条件付きで異なる専門エージェントにルーティングします。オーケストレーションは、どのステップで障害が発生しても自動的に復旧し、完了済みのエージェント呼び出しは再実行されません。 # Python app.orchestration_trigger(context_name="context") def document_publishing_orchestration(context: DurableOrchestrationContext): """複数の専門エージェントを協調させる決定的オーケストレーション。""" doc_request = context.get_input() # オーケストレーションのコンテキストから専門エージェントを取得 research_agent = context.get_agent("ResearchAgent") writer_agent = context.get_agent("DocumentPublisherAgent") # ステップ 1：Web 検索でトピックを調査する research_result = yield research_agent.run( messages=f"次のトピックを調査し、主要な情報を収集してください：{doc_request.topic}", response_schema=ResearchResult ) # ステップ 2：調査結果に基づいてアウトラインを生成する outline = yield context.call_activity("generate_outline", { "topic": doc_request.topic, "research_data": research_result.findings }) # ステップ 3：調査結果とアウトラインに基づいてドキュメントを作成する document = yield writer_agent.run( messages=f"""以下のトピックについて、網羅的なドキュメントを作成してください：{doc_request.topic} 調査結果: {research_result.findings} アウトライン: {outline} 適切な書式で、構造化され読みやすく、魅力的なドキュメントにしてください。必要に応じて出典（引用）も含めてください。""", response_schema=DocumentResponse ) # ステップ 4：生成したドキュメントを保存して公開する return yield context.call_activity("publish_document", { "title": doc_request.topic, "content": document.text, "citations": document.citations }) Human-in-the-loop（人間を介在させる仕組み）オーケストレーションやエージェントは、人間の入力、承認、レビューを待つ間、コンピュートリソースを消費せずに一時停止できます。アプリケーションがクラッシュや再起動したとしても、耐久性のある実行 (durable execution) により、数日から数週間にもわたる人間の応答をオーケストレーションが待機することが可能です。サーバーレスホスティングと組み合わせることで、待機期間中はすべてのコンピュートリソースが停止し、人間が入力を提供するまでコンピュートコストが完全に排除されます。シナリオ例: コンテンツ公開エージェントが下書きを生成し、人間のレビュー担当者に送信して、承認を数日間待機するケース。この間、レビュー期間中はコンピュートリソースを実行（または課金）しません。人間の応答が届くと、オーケストレーションは会話コンテキストと実行状態を完全に保持したまま自動的に再開します。 # Python app.orchestration_trigger(context_name="context") def content_approval_workflow(context: DurableOrchestrationContext): """人間を介在させるワークフロー（待機中はコストゼロ）""" topic = context.get_input() # ステップ 1：エージェントを使ってコンテンツを生成 content_agent = context.get_agent("ContentGenerationAgent") draft_content = yield content_agent.run(f"{topic} についての記事を書いてください") # ステップ 2：人間によるレビューを依頼 yield context.call_activity("notify_reviewer", draft_content) # ステップ 3：承認を待機（待機中はコンピュートリソースを消費しない） approval_event = context.wait_for_external_event("ApprovalDecision") timeout_task = context.create_timer(context.current_utc_datetime + timedelta(hours=24)) winner = yield context.task_any([approval_event, timeout_task]) if winner == approval_event: timeout_task.cancel() approved = approval_event.result if approved: result = yield context.call_activity("publish_content", draft_content) return result else: return "コンテンツは却下されました" else: # タイムアウト時：レビューをエスカレーション result = yield context.call_activity("escalate_for_review", draft_content) return result Built-in agent observability（エージェントの組み込み可観測性） Function App を Durable Task Scheduler を耐久バックエンドとして構成します（エージェントとオーケストレーションの状態を永続化する仕組み）。Durable Task Scheduler は、durable agents に推奨されるバックエンドであり、最高のスループット性能、完全に管理されたインフラ、そして UI ダッシュボードによる組み込みの可観測性を提供します。 Durable Task Scheduler ダッシュボードは、エージェントの操作を深く可視化します：会話履歴 (Conversation history): 各エージェントセッションの完全な会話スレッドを表示し、すべてのメッセージ、ツール呼び出し、任意時点のコンテキストを確認可能マルチエージェントの可視化 (Multi-agent visualization): 複数の専門エージェントを呼び出す際の実行フローを、エージェント間のハンドオフ、並列実行、条件分岐を含む視覚的な表現で表示パフォーマンス指標 (Performance metrics): エージェントの応答時間、トークン使用量、オーケストレーションの実行時間を監視実行履歴 (Execution history): デバッグ用に完全なリプレイ機能を備えた詳細な実行ログにアクセス可能 Demo Video Language support The Durable Task Extension は以下の言語をサポートしています: C# (.NET 8.0+) with Azure Functions Python (3.10+) with Azure Functions Support for additional computes coming soon. 今日から始めてみましょう Click here to create and run a durable agent Learn more Overview documentation C# Samples Python Samples 原文 Bulletproof agents with the durable task extension for Microsoft Agent Framework | Microsoft Community Hub
Madoka Chiyoda
Nov 20, 2025 Place Microsoft Developer Community Blog
1.6KViews
1like
0Comments
Bulletproof agents with the durable task extension for Microsoft Agent Framework
Today, we're thrilled to announce the public preview of the durable task extension for Microsoft Agent Framework. This extension transforms how you build production-ready, resilient and scalable AI agents by bringing the proven durable execution (survives crashes and restarts) and distributed execution (runs across multiple instances) capabilities of Azure Durable Functions directly into the Microsoft Agent Framework. Now you can deploy stateful, resilient AI agents to Azure that automatically handle session management, failure recovery, and scaling, freeing you to focus entirely on your agent logic. Whether you're building customer service agents that maintain context across multi-day conversations, content pipelines with human-in-the-loop approval workflows, or fully automated multi-agent systems coordinating specialized AI models, the durable task extension gives you production-grade reliability, scalability and coordination with serverless simplicity. Key features of the durable task extension include: Serverless Hosting: Deploy agents on Azure Functions with auto-scaling from thousands of instances to zero, while retaining full control in a serverless architecture. Automatic Session Management: Agents maintain persistent sessions with full conversation context that survives process crashes, restarts, and distributed execution across instances Deterministic Multi-Agent Orchestrations: Coordinate specialized durable agents with predictable, repeatable, code-driven execution patterns Human-in-the-Loop with Serverless Cost Savings: Pause for human input without consuming compute resources or incurring costs Built-in Observability with Durable Task Scheduler: Deep visibility into agent operations and orchestrations through the Durable Task Scheduler UI dashboard Click here to create and run a durable agent # Python endpoint = os.getenv("AZURE_OPENAI_ENDPOINT") deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini") # Create an AI agent following the standard Microsoft Agent Framework pattern agent = AzureOpenAIChatClient( endpoint=endpoint, deployment_name=deployment_name, credential=AzureCliCredential() ).create_agent( instructions="""You are a professional content writer who creates engaging, well-structured documents for any given topic. When given a topic, you will: 1. Research the topic using the web search tool 2. Generate an outline for the document 3. Write a compelling document with proper formatting 4. Include relevant examples and citations""", name="DocumentPublisher", tools=[ AIFunctionFactory.Create(search_web), AIFunctionFactory.Create(generate_outline) ] ) # Configure the function app to host the agent with durable session management app = AgentFunctionApp(agents=[agent]) app.run() // C# var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT"); var deploymentName = Environment.GetEnvironmentVariable("AZURE_OPENAI_DEPLOYMENT") ?? "gpt-4o-mini"; // Create an AI agent following the standard Microsoft Agent Framework pattern AIAgent agent = new AzureOpenAIClient(new Uri(endpoint), new DefaultAzureCredential()) .GetChatClient(deploymentName) .CreateAIAgent( instructions: """You are a professional content writer who creates engaging, well-structured documents for any given topic. When given a topic, you will: 1. Research the topic using the web search tool 2. Generate an outline for the document 3. Write a compelling document with proper formatting 4. Include relevant examples and citations""", name: "DocumentPublisher", tools: [ AIFunctionFactory.Create(SearchWeb), AIFunctionFactory.Create(GenerateOutline) ]); // Configure the function app to host the agent with durable thread management // This automatically creates HTTP endpoints and manages state persistence using IHost app = FunctionsApplication .CreateBuilder(args) .ConfigureFunctionsWebApplication() .ConfigureDurableAgents(options => options.AddAIAgent(agent) ) .Build(); app.Run(); Why the durable task extension? As AI agents evolve from simple chatbots to sophisticated systems handling complex, long-running tasks, new challenges emerge: Conversations span multiple days and weeks, requiring persistent state across process restarts, crashes, and disruptions. Tool calls might take longer than typical timeouts allow, needing automatic checkpointing and recovery. High-volume workloads require elastic scaling across distributed instances to handle thousands of concurrent agent conversations. Multiple specialized agents need coordination with predictable, repeatable execution for reliable business processes. Agents sometimes must wait for human approval before proceeding, ideally without consuming resources. The Durable Extension addresses these challenges by extending Microsoft Agent Framework with capabilities from Azure Durable Functions, enabling you to build AI agents that survive failures, scale elastically, and execute predictably through durable and distributed execution. The extension is built on four foundational value pillars, which we refer to as the 4D’s: Durability Every agent state change (messages, tool calls, decisions) is durably checkpointed automatically. Agents survive and automatically resume from infrastructure updates, crashes, and can be unloaded from memory during long waiting periods without losing context. This is essential for agents that orchestrate long-running operations or wait for external events. Distributed Agent execution is accessible across all instances, enabling elastic scaling and automatic failover. Healthy nodes seamlessly take over work from failed instances, ensuring continuous operation. This distributed execution model allows thousands of stateful agents to scale up and run in parallel. Deterministic Agent orchestrations execute predictably using imperative logic written as ordinary code. Define the execution path, enabling automated testing, verifiable guardrails, and business-critical workflows that stakeholders can trust. This complements agent-directed workflows by providing explicit control flow when needed. Debuggability Use familiar development tools (IDEs, debuggers, breakpoints, stack traces, and unit tests) and programming languages to develop and debug. Your agent and agent orchestrations are expressed as code, making them easily testable, debuggable, and maintainable. Features in action Serverless hosting Deploy agents to Azure Functions (with expansion to other Azure computes soon) with automatic scaling to thousands of instances or down to zero when not in use. Pay only for the compute resources you consume. This code-first deployment approach gives you full control over the compute environment while maintaining the benefits of a serverless architecture. # Python endpoint = os.getenv("AZURE_OPENAI_ENDPOINT") deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini") # Create an AI agent following the standard Microsoft Agent Framework pattern agent = AzureOpenAIChatClient( endpoint=endpoint, deployment_name=deployment_name, credential=AzureCliCredential() ).create_agent( instructions="""You are a professional content writer who creates engaging, well-structured documents for any given topic. When given a topic, you will: 1. Research the topic using the web search tool 2. Generate an outline for the document 3. Write a compelling document with proper formatting 4. Include relevant examples and citations""", name="DocumentPublisher", tools=[ AIFunctionFactory.Create(search_web), AIFunctionFactory.Create(generate_outline) ] ) # Configure the function app to host the agent with durable session management app = AgentFunctionApp(agents=[agent]) app.run() Automatic session management Agent sessions are automatically checkpointed in durable storage that you configure in your function app, enabling durable and distributed execution across multiple instances. Any instance can resume an agent's execution after interruptions or process failures, ensuring continuous operation. Under the hood, agents are implemented as durable entities. These are stateful objects that maintain their state across executions. This architecture enables each agent session to function as a reliable, long-lived entity with preserved conversation history and context. Example scenario: A customer service agent handling a complex support case over multiple days and weeks. The conversation history, context, and progress are preserved even if the agent is redeployed or moves to a different instance. # First interaction - start a new thread to create a document curl -X POST https://your-function-app.azurewebsites.net/api/agents/DocumentPublisher/threads \ -H "Content-Type: application/json" \ -d '{"message": "Create a document about the benefits of Azure Functions"}' # Response includes thread ID and initial document outline/draft # {"threadId": "doc789", "response": "I'll create a comprehensive document about Azure Functions benefits. Let me search for the latest information... [Document Draft] # Benefits of Azure Functions\n\n## Introduction\nAzure Functions is a serverless compute service that enables you to run event-driven code without managing infrastructure...\n\n## Cost Efficiency\n- Pay only for execution time\n- No charges for idle resources\n- Automatic scaling reduces over-provisioning...\n\n## Developer Productivity\n- Multiple language support (C#, Python, JavaScript, Java)\n- Integrated development tools and CI/CD...\n\n## Scalability\n- Automatic scaling based on demand\n- Handle millions of requests seamlessly...\n\nReferences: [Azure documentation, serverless computing studies]"} # Second interaction - continue the same thread to refine the document curl -X POST https://your-function-app.azurewebsites.net/api/agents/DocumentPublisher/threads/doc789 \ -H "Content-Type: application/json" \ -d '{"message": "Can you add a section about integration with other Azure services?"}' # Agent remembers the Azure Functions document context and adds the requested section # {"threadId": "doc789", "response": "I've added a comprehensive integration section to your Azure Functions document:\n\n## Integration with Azure Services\n\n### Azure Storage\nSeamless triggers and bindings for Blob Storage, Queue Storage, and Table Storage enable event-driven architectures...\n\n### Azure Event Grid & Event Hubs\nProcess real-time event streams and implement publish-subscribe patterns at scale...\n\n### Azure Cosmos DB\nBuilt-in bindings for document database operations with automatic change feed processing...\n\n### Azure Service Bus\nReliable message processing with enterprise messaging features...\n\n### Azure AI Services\nEasily integrate OpenAI, Cognitive Services, and AI Search for intelligent applications...\n\nThe document now includes this section after the Scalability section. Would you like me to add use cases or deployment best practices?"} Deterministic multi-agent orchestrations Coordinate multiple specialized durable agents using imperative code where you define the control flow. This differs from agent-directed workflows where the agent decides the next steps. Deterministic Orchestrations provide predictable, repeatable execution patterns with automatic checkpointing and recovery. Example scenario: An email processing system that uses a spam detection agent, then conditionally routes to different specialized agents based on the classification. The orchestration automatically recovers if any step fails and completed agent calls are not re-executed. # Python app.orchestration_trigger(context_name="context") def document_publishing_orchestration(context: DurableOrchestrationContext): """Deterministic orchestration coordinating multiple specialized agents.""" doc_request = context.get_input() # Get specialized agents from the orchestration context research_agent = context.get_agent("ResearchAgent") writer_agent = context.get_agent("DocumentPublisherAgent") # Step 1: Research the topic using web search research_result = yield research_agent.run( messages=f"Research the following topic and gather key information: {doc_request.topic}", response_schema=ResearchResult ) # Step 2: Generate outline based on research findings outline = yield context.call_activity("generate_outline", { "topic": doc_request.topic, "research_data": research_result.findings }) # Step 3: Write the document with the research and outline document = yield writer_agent.run( messages=f"""Create a comprehensive document about {doc_request.topic}. Research findings: {research_result.findings} Outline: {outline} Write a well-structured, engaging document with proper formatting and citations.""", response_schema=DocumentResponse ) # Step 4: Save and publish the generated document return yield context.call_activity("publish_document", { "title": doc_request.topic, "content": document.text, "citations": document.citations }) Human-in-the-loop Orchestrations and agents can pause for human input, approval, or review without consuming compute resources. Durable execution enables orchestrations to wait for days or even weeks while waiting for human responses, even if the app crashes or restarts. When combined with serverless hosting, all compute resources are spun down during the wait period, eliminating compute costs until the human provides their input. Example scenario: A content publishing agent that generates drafts, sends them to human reviewers, and waits days for approval without running (or paying for) compute resources during the review period. When the human response arrives, the orchestration automatically resumes with full conversation context and execution state intact. # Python app.orchestration_trigger(context_name="context") def content_approval_workflow(context: DurableOrchestrationContext): """Human-in-the-loop workflow with zero-cost waiting.""" topic = context.get_input() # Step 1: Generate content using an agent content_agent = context.get_agent("ContentGenerationAgent") draft_content = yield content_agent.run(f"Write an article about {topic}") # Step 2: Send for human review yield context.call_activity("notify_reviewer", draft_content) # Step 3: Wait for approval - no compute resources consumed while waiting approval_event = context.wait_for_external_event("ApprovalDecision") timeout_task = context.create_timer(context.current_utc_datetime + timedelta(hours=24)) winner = yield context.task_any([approval_event, timeout_task]) if winner == approval_event: timeout_task.cancel() approved = approval_event.result if approved: result = yield context.call_activity("publish_content", draft_content) return result else: return "Content rejected" else: # Timeout - escalate for review result = yield context.call_activity("escalate_for_review", draft_content) return result Built-in agent observability Configure your Function App with the Durable Task Scheduler as the durable backend (what persists agents and orchestration state). The Durable Task Scheduler is the recommended durable backend for your durable agents, offering the best throughput performance, fully managed infrastructure, and built-in observability through a UI dashboard. The Durable Task Scheduler dashboard provides deep visibility into your agent operations: Conversation history: View complete conversation threads for each agent session, including all messages, tool calls, and conversation context at any point in time Multi-agent visualization: See the execution flow when calling multiple specialized agents with visual representation of agent handoffs, parallel executions, and conditional branching Performance metrics: Monitor agent response times, token usage, and orchestration duration Execution history: Access detailed execution logs with full replay capability for debugging Demo Video Language support The Durable Extension supports: C# (.NET 8.0+) with Azure Functions Python (3.10+) with Azure Functions Support for additional computes coming soon. Get started today Click here to create and run a durable agent Learn more Overview documentation C# Samples Python Samples
greenie-msft
Nov 19, 2025 Place Apps on Azure Blog
6.5KViews
3likes
7Comments
Agentic Applications on Azure Container Apps with Microsoft Foundry
Agents have exploded in popularity over the last year, reshaping not only the kinds of applications developers build but also the underlying architectures required to run them. As agentic applications grow more complex by invoking tools, collaborating with other services, and orchestrating multi-step workflows, architectures are naturally shifting toward microservice patterns. Azure Container Apps is purpose-built for this world: a fully managed, serverless platform designed to run independent, composable services with autoscaling, pay-per-second pricing, and seamless app-to-app communication. By combining Azure Container Apps with the Microsoft Agent Framework (MAF) and Microsoft Foundry, developers can run containerized agents on ACA while using Foundry to visualize and monitor how those agents behave. Azure Container Apps handles scalable, high-performance execution of agent logic, and Microsoft Foundry lights up rich observability for reasoning, planning, tool calls, and errors through its integrated monitoring experience. Together, they form a powerful foundation for building and operating modern, production-grade agentic applications. In this blog, we’ll walk through how to build an agent running on Azure Container Apps using Microsoft Agent Framework and OpenTelemetry, and then connect its telemetry to Microsoft Foundry so you can see your ACA-hosted agent directly in the Foundry monitoring experience. Prerequisites An Azure account with an active subscription. If you don't have one, you can create one for free. Ensure you have a Microsoft Foundry project setup. If you don’t already have one, you can create a project from the Azure AI Foundry portal. Azure Developer CLI (azd) installed Git installed The Sample Agent The complete sample code is available in this repo and can be deployed end-to-end with a single command. It's a basic currency agent. This sample deploys: An Azure Container Apps environment An agent built with Microsoft Agent Framework (MAF) OpenTelemetry instrumentation using the Azure Monitor exporter Application Insights to collect agent telemetry A Microsoft Foundry resource Environment wiring to integrate the agent with Microsoft Foundry Deployment Steps Clone the repository: git clone https://github.com/cachai2/foundry-3p-agents-samples.git cd foundry-3p-agents-samples/azure Authenticate with Azure: azd auth login Set the following azd environment variable azd env set AZURE_AI_MODEL_DEPLOYMENT_NAME gpt-4.1-mini Deploy to Azure azd up This provisions your Azure Container App, Application Insights, logs pipeline and required environment variables. While deployment runs, let’s break down how the code becomes compatible with Microsoft Foundry. 1. Setting up the agent in Azure Container Apps To integrate with Microsoft Foundry, the agent needs two essential capabilities: Microsoft Agent Framework (MAF): Handles agent logic, tools, schema-driven execution, and emits standardized gen_ai.* spans. OpenTelemetry: Sends the required agent/model/tool spans to Application Insights, which Microsoft Foundry consumes for visualization and monitoring. Although this sample uses MAF, the same pattern works with any agent framework. MAF and LangChain currently provide the richest telemetry support out-of-the-box. 1.1 Configure Microsoft Agent Framework (MAF) The agent includes: A tool (get_exchange_rate) An agent created by ChatAgent A runtime manager (AgentRuntime) A FastAPI app exposing /invoke Telemetry is enabled using two components already present in the repo: configure_azure_monitor: Configures OpenTelemetry + Azure Monitor exporter + auto-instrumentation. setup_observability(): Enables MAF’s additional spans (gen_ai.*, tool spans, agent lifecycle spans). From the repo (_configure_observability()): from azure.monitor.opentelemetry import configure_azure_monitor from agent_framework.observability import setup_observability from opentelemetry.sdk.resources import Resource def _configure_observability() -> None: configure_azure_monitor( resource=Resource.create({"service.name": SERVICE_NAME}), connection_string=APPLICATION_INSIGHTS_CONNECTION_STRING, ) setup_observability(enable_sensitive_data=False) This gives you: gen_ai.model.* spans (model usage + token counts) tool call spans agent lifecycle & execution spans HTTP + FastAPI instrumentation Standardized telemetry required by Microsoft Foundry No manual TracerProvider wiring or OTLP exporter setup is needed. 1.2 OpenTelemetry Setup (Azure Monitor Exporter) In this sample, OpenTelemetry is fully configured by Azure Monitor’s helper: import os from azure.monitor.opentelemetry import configure_azure_monitor from opentelemetry.sdk.resources import Resource from agent_framework.observability import setup_observability SERVICE_NAME = os.getenv("ACA_SERVICE_NAME", "aca-currency-exchange-agent") configure_azure_monitor( resource=Resource.create({"service.name": SERVICE_NAME}), connection_string=os.getenv("APPLICATION_INSIGHTS_CONNECTION_STRING"), ) # Enable Microsoft Agent Framework gen_ai/tool spans on top of OTEL setup_observability(enable_sensitive_data=False) This automatically: Installs and configures the OTEL tracer provider Enables batching + exporting of spans Adds HTTP/FastAPI/Requests auto-instrumentation Sends telemetry to Application Insights Adds MAF’s agent + tool spans All required environment variables (such as APPLICATION_INSIGHTS_CONNECTION_STRING) are injected automatically by azd up. 2. Deploy the Model and Test Your Agent Once azd up completes, you're ready to deploy a model to the Microsoft Foundry instance and test it. Find the resource name of your deployed Azure AI Services from azd up and navigate to it. From there, open it in Microsoft Foundry, navigate to the Model Catalog and add the gpt-4.1-mini model. Find the resource name of your deployed Azure Container App and navigate to it. Copy the application URL Set your container app URL environment variable in your terminal. (The below commands are for WSL.) export APP_URL="Your container app URL" Now, go back to your terminal and run the following curl command to invoke the agent curl -X POST "$APP_URL/invoke" \ -H "Content-Type: application/json" \ -d '{ "prompt": "How do I convert 100 USD to EUR?" }' 3. Verifying Telemetry to Application Insights Once your Container App starts, you can validate telemetry: Open the Application Insights resource created by azd up Go to Logs Run these queries (make sure you're in KQL mode not simple mode) Check MAF-genAI spans: dependencies | where timestamp > ago(3h) | extend genOp = tostring(customDimensions["gen_ai.operation.name"]), genSys = tostring(customDimensions["gen_ai.system"]), reqModel = tostring(customDimensions["gen_ai.request.model"]), resModel = tostring(customDimensions["gen_ai.response.model"]) | summarize count() by genOp, genSys, reqModel, resModel | order by count_ desc Check agent + tools: dependencies | where timestamp > ago(1h) | extend genOp = tostring(customDimensions["gen_ai.operation.name"]), agent = tostring(customDimensions["gen_ai.agent.name"]), tool = tostring(customDimensions["gen_ai.tool.name"]) | where genOp in ("agent.run", "invoke_agent", "execute_tool") | project timestamp, genOp, agent, tool, name, target, customDimensions | order by timestamp desc If telemetry is flowing, you’re ready to plug your agent into Microsoft Foundry. 4. Connect Application Insights to Microsoft Foundry Microsoft Foundry uses your Application Insights resource to power: Agent monitoring Tool call traces Reasoning graphs Multi-agent orchestration views Error analysis To connect: Navigate to Monitoring in the left navigation pane of the Microsoft Foundry portal. Select the Application analytics tab. Select your application insights resource created from azd up Connect the resource to your AI Foundry project. Note: If you are unable to add your application insights connection this way, you may need to follow the following: Navigate to the Overview of your Foundry project -> Open in management center -> Connected resources -> New Connection -> Application Insights Foundry will automatically start ingesting: gen_ai.* spans tool spans agent lifecycle spans workflow traces No additional configuration is required. 5. Viewing Dashboards & Traces in Microsoft Foundry Once your Application Insights connection is added, you can view your agent’s telemetry directly in Microsoft Foundry’s Monitoring experience. 5.1 Monitoring The Monitoring tab shows high-level operational metrics for your application, including: Total inference calls Average call duration Overall success/error rate Token usage (when available) Traffic trends over time This view is useful for spotting latency spikes, increased load, or changes in usage patterns, and these visualizations are powered by the telemetry emitting from your agents in Azure Container Apps. 5.2 Traces Timeline The Tracing tab shows the full distributed trace of each agent request, including all spans emitted by Microsoft Foundry and your Azure Container App with Microsoft Agent Framework. You can see: Top-level operations such as invoke_agent, chat, and process_thread_run Tool calls like execute_tool_get_exchange_rate Internal MAF steps (create_thread, create_message, run tool) Azure credential calls (GET /msi/token) Input/output tokens and duration for each span This view gives you an end-to-end breakdown of how your agent executed, which tools it invoked, and how long each step took — essential for debugging and performance tuning. Conclusion By combining Azure Container Apps, the Microsoft Agent Framework, and OpenTelemetry, you can build agents that are not only scalable and production-ready, but also fully observable and orchestratable inside Microsoft Foundry. Container Apps provides the execution engine and autoscaling foundation, MAF supplies structured agent logic and telemetry, and Microsoft Foundry ties everything together with powerful planning, monitoring, and workflow visualization. This architecture gives you the best of both worlds: the flexibility of running your own containerized agents with the dependencies you choose, and the intelligence of Microsoft Foundry to coordinate multi-step reasoning, tool call, and cross-agent workflows. As the agent ecosystem continues to evolve, Azure Container Apps and Microsoft Foundry provide a strong, extensible foundation for building the next generation of intelligent, microservice-driven applications.
Cary_Chai
Nov 18, 2025 Place Apps on Azure Blog
643Views
1like
0Comments
Azure app platform at Ignite 2025: New innovations for all your apps and agents
Learn more about the latest Azure app platform innovations at Microsoft Ignite 2025 that help you build, modernize, and run all your apps and agents.
NagaSurendran
Nov 18, 2025 Place Apps on Azure Blog
1.1KViews
1like
0Comments
Azure Functions Ignite 2025 Update
Azure Functions is redefining event-driven applications and high-scale APIs in 2025, accelerating innovation for developers building the next generation of intelligent, resilient, and scalable workloads. This year, our focus has been on empowering AI and agentic scenarios: remote MCP server hosting, bulletproofing agents with Durable Functions, and first-class support for critical technologies like OpenTelemetry, .NET 10 and Aspire. With major advances in serverless Flex Consumption, enhanced performance, security, and deployment fundamentals across Elastic Premium and Flex, Azure Functions is the platform of choice for building modern, enterprise-grade solutions. Remote MCP Model Context Protocol (MCP) has taken the world by storm, offering an agent a mechanism to discover and work deeply with the capabilities and context of tools. When you want to expose MCP/tools to your enterprise or the world securely, we recommend you think deeply about building remote MCP servers that are designed to run securely at scale. Azure Functions is uniquely optimized to run your MCP servers at scale, offering serverless and highly scalable features of Flex Consumption plan, plus two flexible programming model options discussed below. All come together using the hardened Functions service plus new authentication modes for Entra and OAuth using Built-in authentication. Remote MCP Triggers and Bindings Extension GA Back in April, we shared a new extension that allows you to author MCP servers using functions with the MCP tool trigger. That MCP extension is now generally available, with support for C#(.NET), Java, JavaScript (Node.js), Python, and Typescript (Node.js). The MCP tool trigger allows you to focus on what matters most: the logic of the tool you want to expose to agents. Functions will take care of all the protocol and server logistics, with the ability to scale out to support as many sessions as you want to throw at it. [Function(nameof(GetSnippet))] public object GetSnippet( [McpToolTrigger(GetSnippetToolName, GetSnippetToolDescription)] ToolInvocationContext context, [BlobInput(BlobPath)] string snippetContent ) { return snippetContent; } New: Self-hosted MCP Server (Preview) If you’ve built servers with official MCP SDKs and want to run them as remote cloud‑scale servers without re‑writing any code, this public preview is for you. You can now self‑host your MCP server on Azure Functions—keep your existing Python, TypeScript, .NET, or Java code and get rapid 0 to N scaling, built-in server authentication and authorization, consumption-based billing, and more from the underlying Azure Functions service. This feature complements the Azure Functions MCP extension for building MCP servers using the Functions programming model (triggers & bindings). Pick the path that fits your scenario—build with the extension or standard MCP SDKs. Either way you benefit from the same scalable, secure, and serverless platform. Use the official MCP SDKs: # MCP.tool() async def get_alerts(state: str) -> str: """Get weather alerts for a US state. Args: state: Two-letter US state code (e.g. CA, NY) """ url = f"{NWS_API_BASE}/alerts/active/area/{state}" data = await make_nws_request(url) if not data or "features" not in data: return "Unable to fetch alerts or no alerts found." if not data["features"]: return "No active alerts for this state." alerts = [format_alert(feature) for feature in data["features"]] return "\n---\n".join(alerts) Use Azure Functions Flex Consumption Plan's serverless compute using Custom Handlers in host.json: { "version": "2.0", "configurationProfile": "mcp-custom-handler", "customHandler": { "description": { "defaultExecutablePath": "python", "arguments": ["weather.py"] }, "http": { "DefaultAuthorizationLevel": "anonymous" }, "port": "8000" } } Learn more about MCPTrigger and self-hosted MCP servers at https://aka.ms/remote-mcp Built-in MCP server authorization (Preview) The built-in authentication and authorization feature can now be used for MCP server authorization, using a new preview option. You can quickly define identity-based access control for your MCP servers with Microsoft Entra ID or other OpenID Connect providers. Learn more at https://aka.ms/functions-mcp-server-authorization. Better together with Foundry agents Microsoft Foundry is the starting point for building intelligent agents, and Azure Functions is the natural next step for extending those agents with remote MCP tools. Running your tools on Functions gives you clean separation of concerns, reuse across multiple agents, and strong security isolation. And with built-in authorization, Functions enables enterprise-ready authentication patterns, from calling downstream services with the agent’s identity to operating on behalf of end users with their delegated permissions. Build your first remote MCP server and connect it to your Foundry agent at https://aka.ms/foundry-functions-mcp-tutorial. Agents Microsoft Agent Framework 2.0 (Public Preview Refresh) We’re excited about the preview refresh 2.0 release of Microsoft Agent Framework that builds on battle hardened work from Semantic Kernel and AutoGen. Agent Framework is an outstanding solution for building multi-agent orchestrations that are both simple and powerful. Azure Functions is a strong fit to host Agent Framework with the service’s extreme scale, serverless billing, and enterprise grade features like VNET networking and built-in auth. Durable Task Extension for Microsoft Agent Framework (Preview) The durable task extension for Microsoft Agent Framework transforms how you build production-ready, resilient and scalable AI agents by bringing the proven durable execution (survives crashes and restarts) and distributed execution (runs across multiple instances) capabilities of Azure Durable Functions directly into the Microsoft Agent Framework. Combined with Azure Functions for hosting and event-driven execution, you can now deploy stateful, resilient AI agents that automatically handle session management, failure recovery, and scaling, freeing you to focus entirely on your agent logic. Key features of the durable task extension include: Serverless Hosting: Deploy agents on Azure Functions with auto-scaling from thousands of instances to zero, while retaining full control in a serverless architecture. Automatic Session Management: Agents maintain persistent sessions with full conversation context that survives process crashes, restarts, and distributed execution across instances Deterministic Multi-Agent Orchestrations: Coordinate specialized durable agents with predictable, repeatable, code-driven execution patterns Human-in-the-Loop with Serverless Cost Savings: Pause for human input without consuming compute resources or incurring costs Built-in Observability with Durable Task Scheduler: Deep visibility into agent operations and orchestrations through the Durable Task Scheduler UI dashboard Create a durable agent: endpoint = os.getenv("AZURE_OPENAI_ENDPOINT") deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini") # Create an AI agent following the standard Microsoft Agent Framework pattern agent = AzureOpenAIChatClient( endpoint=endpoint, deployment_name=deployment_name, credential=AzureCliCredential() ).create_agent( instructions="""You are a professional content writer who creates engaging, well-structured documents for any given topic. When given a topic, you will: 1. Research the topic using the web search tool 2. Generate an outline for the document 3. Write a compelling document with proper formatting 4. Include relevant examples and citations""", name="DocumentPublisher", tools=[ AIFunctionFactory.Create(search_web), AIFunctionFactory.Create(generate_outline) ] ) # Configure the function app to host the agent with durable session management app = AgentFunctionApp(agents=[agent]) app.run() Durable Task Scheduler dashboard for agent and agent workflow observability and debugging For more information on the durable task extension for Agent Framework, see the announcement: https://aka.ms/durable-extension-for-af-blog. Flex Consumption Updates As you know, Flex Consumption means serverless without compromise. It combines elastic scale and pay‑for‑what‑you‑use pricing with the controls you expect: per‑instance concurrency, longer executions, VNet/private networking, and Always Ready instances to minimize cold starts. Since launching GA at Ignite 2024 last year, Flex Consumption has had tremendous growth with over 1.5 billion function executions per day and nearly 40 thousand apps. Here’s what’s new for Ignite 2025: 512 MB instance size (GA). Right‑size lighter workloads, scale farther within default quota. Availability Zones (GA). Distribute instances across zones. Rolling updates (Public Preview). Unlock zero-downtime deployments of code or config by setting a single configuration. See below for more information. Even more improvements including: new diagnostic settingsto route logs/metrics, use Key Vault App Config references, new regions, and Custom Handler support. To get started, review Flex Consumption samples, or dive into the documentation to see how Flex can support your workloads. Migrating to Azure Functions Flex Consumption Migrating to Flex Consumption is simple with our step-by-step guides and agentic tools. Move your Azure Functions apps or AWS Lambda workloads, update your code and configuration, and take advantage of new automation tools. With Linux Consumption retiring, now is the time to switch. For more information, see: Migrate Consumption plan apps to the Flex Consumption plan Migrate AWS Lambda workloads to Azure Functions Durable Functions Durable Functions introduces powerful new features to help you build resilient, production-ready workflows: Distributed Tracing: lets you track requests across components and systems, giving you deep visibility into orchestration and activities with support for App Insights and OpenTelemetry. Extended Sessions support in .NET isolated: improves performance by caching orchestrations in memory, ideal for fast sequential activities and large fan-out/fan-in patterns. Orchestration versioning (public preview): enables zero-downtime deployments and backward compatibility, so you can safely roll out changes without disrupting in-flight workflows Durable Task Scheduler Updates Durable Task Scheduler Dedicated SKU (GA): Now generally available, the Dedicated SKU offers advanced orchestration for complex workflows and intelligent apps. It provides predictable pricing for steady workloads, automatic checkpointing, state protection, and advanced monitoring for resilient, reliable execution. Durable Task Scheduler Consumption SKU (Public Preview): The new Consumption SKU brings serverless, pay-as-you-go orchestration to dynamic and variable workloads. It delivers the same orchestration capabilities with flexible billing, making it easy to scale intelligent applications as needed. For more information see: https://aka.ms/dts-ga-blog OpenTelemetry support in GA Azure Functions OpenTelemetry is now generally available, bringing unified, production-ready observability to serverless applications. Developers can now export logs, traces, and metrics using open standards—enabling consistent monitoring and troubleshooting across every workload. Key capabilities include: Unified observability: Standardize logs, traces, and metrics across all your serverless workloads for consistent monitoring and troubleshooting. Vendor-neutral telemetry: Integrate seamlessly with Azure Monitor or any OpenTelemetry-compliant backend, ensuring flexibility and choice. Broad language support: Works with .NET (isolated), Java, JavaScript, Python, PowerShell, and TypeScript. Start using OpenTelemetry in Azure Functions today to unlock standards-based observability for your apps. For step-by-step guidance on enabling OpenTelemetry and configuring exporters for your preferred backend, see the documentation. Deployment with Rolling Updates (Preview) Achieving zero-downtime deployments has never been easier. The Flex Consumption plan now offers rolling updates as a site update strategy. Set a single property, and all future code deployments and configuration changes will be released with zero-downtime. Instead of restarting all instances at once, the platform now drains existing instances in batches while scaling out the latest version to match real-time demand. This ensures uninterrupted in-flight executions and resilient throughput across your HTTP, non-HTTP, and Durable workloads – even during intensive scale-out scenarios. Rolling updates are now in public preview. Learn more at https://aka.ms/functions/rolling-updates. Secure Identity and Networking Everywhere By Design Security and trust are paramount. Azure Functions incorporates proven best practices by design, with full support for managed identity—eliminating secrets and simplifying secure authentication and authorization. Flex Consumption and other plans offer enterprise-grade networking features like VNETs, private endpoints, and NAT gateways for deep protection. The Azure Portal streamlines secure function creation, and updated scenarios and samples showcase these identity and networking capabilities in action. Built-in authentication (discussed above) enables inbound client traffic to use identity as well. Check out our updated Functions Scenarios page with quickstarts or our secure samples gallery to see these identity and networking best practices in action. .NET 10 Azure Functions now supports .NET 10, bringing in a great suite of new features and performance benefits for your code. .NET 10 is supported on the isolated worker model, and it’s available for all plan types except Linux Consumption. As a reminder, support ends for the legacy in-process model on November 10, 2026, and the in-process model is not being updated with .NET 10. To stay supported and take advantage of the latest features, migrate to the isolated worker model. Aspire Aspire is an opinionated stack that simplifies development of distributed applications in the cloud. The Azure Functions integration for Aspire enables you to develop, debug, and orchestrate an Azure Functions .NET project as part of an Aspire solution. Aspire publish directly deploys to your functions to Azure Functions on Azure Container Apps. Aspire 13 includes an updated preview version of the Functions integration that acts as a release candidate with go-live support. The package will be moved to GA quality with Aspire 13.1. Java 25, Node.js 24 Azure Functions now supports Java 25 and Node.js 24 in preview. You can now develop functions using these versions locally and deploy them to Azure Functions plans. Learn how to upgrade your apps to these versions here In Summary Ready to build what’s next? Update your Azure Functions Core Tools today and explore the latest samples and quickstarts to unlock new capabilities for your scenarios. The guided quickstarts run and deploy in under 5 minutes, and incorporate best practices—from architecture to security to deployment. We’ve made it easier than ever to scaffold, deploy, and scale real-world solutions with confidence. The future of intelligent, scalable, and secure applications starts now—jump in and see what you can create!
paulyuk
Nov 18, 2025 Place Apps on Azure Blog
2.1KViews
0likes
0Comments