# Bulletproof agents with the durable task extension for Microsoft Agent Framework
Today, we're thrilled to announce the public preview of the durable task extension for Microsoft Agent Framework. This extension transforms how you build production-ready, resilient, and scalable AI agents by bringing the proven durable execution (survives crashes and restarts) and distributed execution (runs across multiple instances) capabilities of Azure Durable Functions directly into the Microsoft Agent Framework. Now you can deploy stateful, resilient AI agents to Azure that automatically handle session management, failure recovery, and scaling, freeing you to focus entirely on your agent logic.

Whether you're building customer service agents that maintain context across multi-day conversations, content pipelines with human-in-the-loop approval workflows, or fully automated multi-agent systems coordinating specialized AI models, the durable task extension gives you production-grade reliability, scalability, and coordination with serverless simplicity.

Key features of the durable task extension include:

- **Serverless Hosting**: Deploy agents on Azure Functions with auto-scaling from thousands of instances down to zero, while retaining full control in a serverless architecture.
- **Automatic Session Management**: Agents maintain persistent sessions with full conversation context that survives process crashes, restarts, and distributed execution across instances.
- **Deterministic Multi-Agent Orchestrations**: Coordinate specialized durable agents with predictable, repeatable, code-driven execution patterns.
- **Human-in-the-Loop with Serverless Cost Savings**: Pause for human input without consuming compute resources or incurring costs.
- **Built-in Observability with Durable Task Scheduler**: Deep visibility into agent operations and orchestrations through the Durable Task Scheduler UI dashboard.

Click here to create and run a durable agent.

```python
# Python
endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini")

# Create an AI agent following the standard Microsoft Agent Framework pattern
agent = AzureOpenAIChatClient(
    endpoint=endpoint,
    deployment_name=deployment_name,
    credential=AzureCliCredential()
).create_agent(
    instructions="""You are a professional content writer who creates engaging,
    well-structured documents for any given topic. When given a topic, you will:
    1. Research the topic using the web search tool
    2. Generate an outline for the document
    3. Write a compelling document with proper formatting
    4. Include relevant examples and citations""",
    name="DocumentPublisher",
    tools=[
        AIFunctionFactory.Create(search_web),
        AIFunctionFactory.Create(generate_outline)
    ]
)

# Configure the function app to host the agent with durable session management
app = AgentFunctionApp(agents=[agent])
app.run()
```

```csharp
// C#
var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT");
var deploymentName = Environment.GetEnvironmentVariable("AZURE_OPENAI_DEPLOYMENT") ?? "gpt-4o-mini";

// Create an AI agent following the standard Microsoft Agent Framework pattern
AIAgent agent = new AzureOpenAIClient(new Uri(endpoint), new DefaultAzureCredential())
    .GetChatClient(deploymentName)
    .CreateAIAgent(
        instructions: """You are a professional content writer who creates engaging,
        well-structured documents for any given topic. When given a topic, you will:
        1. Research the topic using the web search tool
        2. Generate an outline for the document
        3. Write a compelling document with proper formatting
        4. Include relevant examples and citations""",
        name: "DocumentPublisher",
        tools: [
            AIFunctionFactory.Create(SearchWeb),
            AIFunctionFactory.Create(GenerateOutline)
        ]);

// Configure the function app to host the agent with durable thread management
// This automatically creates HTTP endpoints and manages state persistence
using IHost app = FunctionsApplication
    .CreateBuilder(args)
    .ConfigureFunctionsWebApplication()
    .ConfigureDurableAgents(options =>
        options.AddAIAgent(agent)
    )
    .Build();
app.Run();
```

## Why the durable task extension?

As AI agents evolve from simple chatbots to sophisticated systems handling complex, long-running tasks, new challenges emerge:

- Conversations span multiple days and weeks, requiring persistent state across process restarts, crashes, and disruptions.
- Tool calls might take longer than typical timeouts allow, needing automatic checkpointing and recovery.
- High-volume workloads require elastic scaling across distributed instances to handle thousands of concurrent agent conversations.
- Multiple specialized agents need coordination with predictable, repeatable execution for reliable business processes.
- Agents sometimes must wait for human approval before proceeding, ideally without consuming resources.

The durable task extension addresses these challenges by extending Microsoft Agent Framework with capabilities from Azure Durable Functions, enabling you to build AI agents that survive failures, scale elastically, and execute predictably through durable and distributed execution. The extension is built on four foundational value pillars, which we refer to as the 4D's:

**Durability**: Every agent state change (messages, tool calls, decisions) is automatically and durably checkpointed. Agents survive and automatically resume from infrastructure updates and crashes, and can be unloaded from memory during long waiting periods without losing context. This is essential for agents that orchestrate long-running operations or wait for external events.

**Distributed**: Agent execution is accessible across all instances, enabling elastic scaling and automatic failover. Healthy nodes seamlessly take over work from failed instances, ensuring continuous operation. This distributed execution model allows thousands of stateful agents to scale up and run in parallel.

**Deterministic**: Agent orchestrations execute predictably using imperative logic written as ordinary code. You define the execution path, enabling automated testing, verifiable guardrails, and business-critical workflows that stakeholders can trust. This complements agent-directed workflows by providing explicit control flow when needed.

**Debuggability**: Use familiar development tools (IDEs, debuggers, breakpoints, stack traces, and unit tests) and programming languages to develop and debug. Your agents and agent orchestrations are expressed as code, making them easily testable, debuggable, and maintainable.

## Features in action

### Serverless hosting

Deploy agents to Azure Functions (with expansion to other Azure computes soon) with automatic scaling to thousands of instances or down to zero when not in use. Pay only for the compute resources you consume. This code-first deployment approach gives you full control over the compute environment while maintaining the benefits of a serverless architecture.
```python
# Python
endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
deployment_name = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini")

# Create an AI agent following the standard Microsoft Agent Framework pattern
agent = AzureOpenAIChatClient(
    endpoint=endpoint,
    deployment_name=deployment_name,
    credential=AzureCliCredential()
).create_agent(
    instructions="""You are a professional content writer who creates engaging,
    well-structured documents for any given topic. When given a topic, you will:
    1. Research the topic using the web search tool
    2. Generate an outline for the document
    3. Write a compelling document with proper formatting
    4. Include relevant examples and citations""",
    name="DocumentPublisher",
    tools=[
        AIFunctionFactory.Create(search_web),
        AIFunctionFactory.Create(generate_outline)
    ]
)

# Configure the function app to host the agent with durable session management
app = AgentFunctionApp(agents=[agent])
app.run()
```

### Automatic session management

Agent sessions are automatically checkpointed in durable storage that you configure in your function app, enabling durable and distributed execution across multiple instances. Any instance can resume an agent's execution after interruptions or process failures, ensuring continuous operation.

Under the hood, agents are implemented as durable entities: stateful objects that maintain their state across executions. This architecture enables each agent session to function as a reliable, long-lived entity with preserved conversation history and context.

Example scenario: A customer service agent handling a complex support case over multiple days and weeks. The conversation history, context, and progress are preserved even if the agent is redeployed or moves to a different instance.

```bash
# First interaction - start a new thread to create a document
curl -X POST https://your-function-app.azurewebsites.net/api/agents/DocumentPublisher/threads \
  -H "Content-Type: application/json" \
  -d '{"message": "Create a document about the benefits of Azure Functions"}'

# Response includes thread ID and initial document outline/draft
# {"threadId": "doc789", "response": "I'll create a comprehensive document about Azure Functions benefits. Let me search for the latest information... [Document Draft] # Benefits of Azure Functions\n\n## Introduction\nAzure Functions is a serverless compute service that enables you to run event-driven code without managing infrastructure...\n\n## Cost Efficiency\n- Pay only for execution time\n- No charges for idle resources\n- Automatic scaling reduces over-provisioning...\n\n## Developer Productivity\n- Multiple language support (C#, Python, JavaScript, Java)\n- Integrated development tools and CI/CD...\n\n## Scalability\n- Automatic scaling based on demand\n- Handle millions of requests seamlessly...\n\nReferences: [Azure documentation, serverless computing studies]"}
```
```bash
# Second interaction - continue the same thread to refine the document
curl -X POST https://your-function-app.azurewebsites.net/api/agents/DocumentPublisher/threads/doc789 \
  -H "Content-Type: application/json" \
  -d '{"message": "Can you add a section about integration with other Azure services?"}'

# Agent remembers the Azure Functions document context and adds the requested section
# {"threadId": "doc789", "response": "I've added a comprehensive integration section to your Azure Functions document:\n\n## Integration with Azure Services\n\n### Azure Storage\nSeamless triggers and bindings for Blob Storage, Queue Storage, and Table Storage enable event-driven architectures...\n\n### Azure Event Grid & Event Hubs\nProcess real-time event streams and implement publish-subscribe patterns at scale...\n\n### Azure Cosmos DB\nBuilt-in bindings for document database operations with automatic change feed processing...\n\n### Azure Service Bus\nReliable message processing with enterprise messaging features...\n\n### Azure AI Services\nEasily integrate OpenAI, Cognitive Services, and AI Search for intelligent applications...\n\nThe document now includes this section after the Scalability section. Would you like me to add use cases or deployment best practices?"}
```

### Deterministic multi-agent orchestrations

Coordinate multiple specialized durable agents using imperative code where you define the control flow. This differs from agent-directed workflows, where the agent decides the next steps. Deterministic orchestrations provide predictable, repeatable execution patterns with automatic checkpointing and recovery.

Example scenario: An email processing system that uses a spam detection agent, then conditionally routes to different specialized agents based on the classification. The orchestration automatically recovers if any step fails, and completed agent calls are not re-executed.

```python
# Python
@app.orchestration_trigger(context_name="context")
def document_publishing_orchestration(context: DurableOrchestrationContext):
    """Deterministic orchestration coordinating multiple specialized agents."""
    doc_request = context.get_input()

    # Get specialized agents from the orchestration context
    research_agent = context.get_agent("ResearchAgent")
    writer_agent = context.get_agent("DocumentPublisherAgent")

    # Step 1: Research the topic using web search
    research_result = yield research_agent.run(
        messages=f"Research the following topic and gather key information: {doc_request.topic}",
        response_schema=ResearchResult
    )

    # Step 2: Generate outline based on research findings
    outline = yield context.call_activity("generate_outline", {
        "topic": doc_request.topic,
        "research_data": research_result.findings
    })

    # Step 3: Write the document with the research and outline
    document = yield writer_agent.run(
        messages=f"""Create a comprehensive document about {doc_request.topic}.

        Research findings: {research_result.findings}
        Outline: {outline}

        Write a well-structured, engaging document with proper formatting and citations.""",
        response_schema=DocumentResponse
    )

    # Step 4: Save and publish the generated document
    return (yield context.call_activity("publish_document", {
        "title": doc_request.topic,
        "content": document.text,
        "citations": document.citations
    }))
```
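The orchestration above is shown in Python; the same deterministic pattern can be expressed as an ordinary Durable Functions orchestrator in C#. The sketch below is illustrative only: `CallActivityAsync` and `[OrchestrationTrigger]` are standard Durable Functions APIs, but the `GetAgent`/`RunAsync` agent helpers are assumptions that mirror the Python sample and may differ from the extension's actual C# surface.

```csharp
using Microsoft.Azure.Functions.Worker;
using Microsoft.DurableTask;

public static class DocumentPublishingOrchestration
{
    [Function(nameof(RunAsync))]
    public static async Task<string> RunAsync(
        [OrchestrationTrigger] TaskOrchestrationContext context)
    {
        var topic = context.GetInput<string>();

        // HYPOTHETICAL: agent accessors assumed to mirror Python's context.get_agent(...)
        var researchAgent = context.GetAgent("ResearchAgent");
        var writerAgent = context.GetAgent("DocumentPublisherAgent");

        // Step 1: Research (durably checkpointed; not re-run on replay)
        var research = await researchAgent.RunAsync(
            $"Research the following topic and gather key information: {topic}");

        // Step 2: Generate an outline via a standard activity function
        var outline = await context.CallActivityAsync<string>("GenerateOutline", research);

        // Step 3: Write the document with the research and outline
        var document = await writerAgent.RunAsync(
            $"Create a comprehensive document about {topic}.\nResearch: {research}\nOutline: {outline}");

        // Step 4: Publish the result
        return await context.CallActivityAsync<string>("PublishDocument", document);
    }
}
```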
### Human-in-the-loop

Orchestrations and agents can pause for human input, approval, or review without consuming compute resources. Durable execution enables orchestrations to wait for days or even weeks for human responses, even if the app crashes or restarts. When combined with serverless hosting, all compute resources are spun down during the wait period, eliminating compute costs until the human provides their input.

Example scenario: A content publishing agent that generates drafts, sends them to human reviewers, and waits days for approval without running (or paying for) compute resources during the review period. When the human response arrives, the orchestration automatically resumes with full conversation context and execution state intact.

```python
# Python
@app.orchestration_trigger(context_name="context")
def content_approval_workflow(context: DurableOrchestrationContext):
    """Human-in-the-loop workflow with zero-cost waiting."""
    topic = context.get_input()

    # Step 1: Generate content using an agent
    content_agent = context.get_agent("ContentGenerationAgent")
    draft_content = yield content_agent.run(f"Write an article about {topic}")

    # Step 2: Send for human review
    yield context.call_activity("notify_reviewer", draft_content)

    # Step 3: Wait for approval - no compute resources consumed while waiting
    approval_event = context.wait_for_external_event("ApprovalDecision")
    timeout_task = context.create_timer(context.current_utc_datetime + timedelta(hours=24))

    winner = yield context.task_any([approval_event, timeout_task])

    if winner == approval_event:
        timeout_task.cancel()
        approved = approval_event.result

        if approved:
            result = yield context.call_activity("publish_content", draft_content)
            return result
        else:
            return "Content rejected"
    else:
        # Timeout - escalate for review
        result = yield context.call_activity("escalate_for_review", draft_content)
        return result
```
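The workflow above pauses on `wait_for_external_event("ApprovalDecision")`, so something outside the orchestration has to deliver that event. In Durable Functions this is done with the client API's `RaiseEventAsync`. A minimal C# sketch of an HTTP endpoint a review tool could call (the route and payload shape are assumptions, not part of the sample):

```csharp
using System.Net;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;
using Microsoft.DurableTask.Client;

public static class ApprovalEndpoint
{
    [Function("SubmitApproval")]
    public static async Task<HttpResponseData> SubmitApproval(
        [HttpTrigger(AuthorizationLevel.Function, "post", Route = "approvals/{instanceId}")]
        HttpRequestData req,
        string instanceId,
        [DurableClient] DurableTaskClient client)
    {
        // Read the reviewer's decision from the request body ("true" = approved)
        bool approved = bool.Parse(await new StreamReader(req.Body).ReadToEndAsync());

        // Wake the paused orchestration; it resumes with its full context intact
        await client.RaiseEventAsync(instanceId, "ApprovalDecision", approved);

        return req.CreateResponse(HttpStatusCode.Accepted);
    }
}
```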
### Built-in agent observability

Configure your function app with the Durable Task Scheduler as the durable backend (what persists agent and orchestration state). The Durable Task Scheduler is the recommended durable backend for your durable agents, offering the best throughput performance, fully managed infrastructure, and built-in observability through a UI dashboard.

The Durable Task Scheduler dashboard provides deep visibility into your agent operations:

- **Conversation history**: View complete conversation threads for each agent session, including all messages, tool calls, and conversation context at any point in time
- **Multi-agent visualization**: See the execution flow when calling multiple specialized agents, with visual representation of agent handoffs, parallel executions, and conditional branching
- **Performance metrics**: Monitor agent response times, token usage, and orchestration duration
- **Execution history**: Access detailed execution logs with full replay capability for debugging

### Demo video

### Language support

The durable task extension supports:

- C# (.NET 8.0+) with Azure Functions
- Python (3.10+) with Azure Functions

Support for additional computes is coming soon.

## Get started today

Click here to create and run a durable agent.

Learn more:

- Overview documentation
- C# Samples
- Python Samples
# Disciplined Guardrail Development in enterprise application with GitHub Copilot

## What Is Disciplined Guardrail-Based Development?

In AI-assisted software development, approaches like Vibe Coding, which prioritize momentum and intuition, often fail to ensure code quality and maintainability. To address this, disciplined guardrail-based development introduces structured rules ("guardrails") that guide AI systems during coding and maintenance tasks, ensuring consistent quality and reliability.

To get AI (LLMs) to generate appropriate code, developers must provide clear and specific instructions. Two key elements are essential:

- What to build – Clarifying requirements and breaking down tasks
- How to build it – Defining the application architecture

The way these two elements are handled depends on the development methodology or process being used; examples follow below.

## How to Set Up Disciplined Guardrails in GitHub Copilot

To implement disciplined guardrail-based development with GitHub Copilot, two key configuration features are used:

### 1. Custom Instructions (.github/copilot-instructions.md)

This file allows you to define persistent instructions that GitHub Copilot will always refer to when generating code.

- Purpose: Establish coding standards, architectural rules, naming conventions, and other quality guidelines.
- Best practice: Instead of placing all instructions in a single file, split them into multiple modular files and reference them accordingly. This improves maintainability and clarity.
- Example use: You might define rules like using camelCase for variables, enforcing error boundaries in React, or requiring TypeScript for all new code.

https://docs.github.com/en/copilot/how-tos/configure-custom-instructions/add-repository-instructions

### 2. Chat Modes (.github/chatmodes/*.chatmode.md)

These files define specialized chat modes tailored to specific tasks or workflows.

- Purpose: Customize Copilot's behavior for different development contexts (e.g., debugging, writing tests, refactoring).
- Structure: Each .chatmode.md file includes metadata and instructions that guide Copilot's responses in that mode.
- Example use: A debug.chatmode.md might instruct Copilot to focus on identifying and resolving runtime errors, while a test.chatmode.md could prioritize generating unit tests with specific frameworks.

https://code.visualstudio.com/docs/copilot/customization/custom-chat-modes

The files to be created and their relationships are as follows. Next, let's look at how to create each of them.

## #1: Custom Instructions

With custom instructions, you can define commands that are always provided to GitHub Copilot. The prepared files are always referenced during chat sessions and passed to the LLM (this can also be confirmed from the chat history). An important note: split the content into several files and include links to those files within the .github/copilot-instructions.md file, because a single file can become too long if everything is written in it.

There are mainly two types of content that should be described in custom instructions:

A: Development process (≒ outcome + creation method)
- What documents or code will be created: requirements specification, design documents, task breakdown tables, implementation code, etc.
- In what order and by whom they will be created: for example, proceed in the order of requirements definition → design → task breakdown → coding.

B: Application architecture
- How will the outcomes defined in A be created? What technology stack and component structure will be used?

A concrete example of copilot-instructions.md is shown below.
```markdown
# Development Rules

## Architecture
- When performing design and coding tasks, always refer to the following architecture documents and strictly follow them as rules.

### Product Overview
- Document the product overview in `.github/architecture/product.md`

### Technology Stack
- Document the technologies used in `.github/architecture/techstack.md`

### Coding Standards
- Document coding standards in `.github/architecture/codingrule.md`

### Project Structure
- Document the project directory structure in `.github/architecture/structure.md`

### Glossary (Japanese-English)
- Document the list of terms used in the project in `.github/architecture/dictionary.md`

## Development Flow
- Follow a disciplined development flow and execute the following four stages in order (proceed to the next stage only after completing the current one):
  1. Requirement Definition
  2. Design
  3. Task Breakdown
  4. Coding

### 1. Requirement Definition
- Document requirements in `docs/[subsystem_name]/[business_name]/requirement.md`
- Use `requirement.chatmode.md` to define requirements
- Focus on clarifying objectives, understanding the current situation, and setting success criteria
- Once requirements are defined, obtain user confirmation before proceeding to the next stage

### 2. Design
- Document design in `docs/[subsystem_name]/[business_name]/design.md`
- Use `design.chatmode.md` to define the design
- Define UI, module structure, and interface design
- Once the design is complete, obtain user confirmation before proceeding to the next stage

### 3. Task Breakdown
- Document tasks in `docs/[subsystem_name]/[business_name]/tasks.md`
- Use `tasks.chatmode.md` to define tasks
- Break down tasks into executable units and set priorities
- Once task breakdown is complete, obtain user confirmation before proceeding to the next stage

### 4. Coding
- Implement code under `src/[subsystem_name]/[business_name]/`
- Perform coding task by task
- Update progress in `docs/[subsystem_name]/[business_name]/tasks.md`
- Report to the user upon completion of each task
```

Note: The only file that is always sent to the LLM is `copilot-instructions.md`. Documents linked from there (such as `product.md` or `techstack.md`) are not guaranteed to be read by the LLM. That said, a reasonably capable LLM will usually review these files before proceeding with the work. If the LLM does not properly reference each file, you can explicitly add these architecture documents to the context. Another approach is to instruct the LLM to review these files in the chat mode settings, described later.

There are various "schools of thought" regarding application architecture, and it is still an ongoing challenge to determine exactly what should be defined and what documents should be created. The choice of architecture depends on factors such as the business context, development scale, and team structure, so it is difficult to prescribe a one-size-fits-all approach. That said, as a general guideline, it is desirable to summarize the following:

- Product overview: Overview of the product, service, or business, including its overall characteristics
- Technology stack: What technologies will be used to develop the application?
- Project structure: How will folders and directories be organized during development?
- Module structure: How will the application be divided into modules?
- Coding rules: Rules for handling exceptions, naming conventions, and other coding practices

Writing all of this from scratch can be challenging.
A practical approach is to create template information with the help of Copilot and then refine it. Specifically, you can:

- Use tools like M365 Copilot Researcher to create content based on general principles
- Analyze a prototype application and have the architecture information summarized (using Ask mode or Edit mode, feed the solution files to a capable LLM for analysis)

However, in most cases the output cannot be used as-is:

- The structure may not be analyzed correctly (hallucinations may occur)
- Project-specific practices and rules may not be captured

Use the generated content as a starting point, and then refine it to create architecture documentation tailored to your own project.

When creating architecture documents for enterprise-scale application development, a useful approach is to distinguish between the foundational parts and the individual application parts. Disciplined guardrail development is particularly effective when building multiple applications in a "cookie-cutter" style on top of a common foundation. A clear example of this is Data-Oriented Architecture (DOA). In DOA, individual business applications are built on top of a shared database that serves as the overall common foundation. In this case, the foundational parts (the database layer) should not be modified arbitrarily by individual developers. Instead, focus on how to standardize the development of the individual application parts while ensuring consistency. Architecture documentation should be organized with this distinction in mind, emphasizing the uniformity of application-level development built upon the stable foundation.

## #2: Chat Mode

By default, GitHub Copilot provides three chat modes: Ask, Edit, and Agent. However, by creating files under .github/chatmodes/*.chatmode.md, you can customize the Agent mode to create chat modes tailored for specific tasks. Specifically, you can configure the following three aspects. Functionally, this allows you to perform a specific task without having to manually change the model or tools, or write detailed instructions each time:

- model: Specify the default LLM to use (note: the user can still manually switch to another LLM if desired)
- tools: Restrict which tools can be used (note: the user can still manually select other tools if desired)
- custom instructions: Provide custom instructions specific to this chat mode

A concrete example of .github/chatmodes/*.chatmode.md is shown below.
```markdown
---
description: This mode is used for requirement definition tasks.
model: Claude Sonnet 4
tools: ['changes', 'codebase', 'editFiles', 'fetch', 'findTestFiles', 'githubRepo', 'new', 'openSimpleBrowser', 'runCommands', 'search', 'searchResults', 'terminalLastCommand', 'terminalSelection', 'usages', 'vscodeAPI', 'mssql_connect', 'mssql_disconnect', 'mssql_list_servers', 'mssql_show_schema']
---

# Requirement Definition Mode

In this mode, requirement definition tasks are performed. Specifically, the project requirements are clarified, and necessary functions and specifications are defined. Based on instructions or interviews with the user, document the requirements according to the format below. If any specifications are ambiguous or unclear, Copilot should ask the user questions to clarify them.

## File Storage Location

Save the requirement definition file in the following location:

- Save as `requirement.md` under the directory `docs/[subsystem_name]/[business_name]/`

## Requirement Definition Format

While interviewing the user, document the following items in the Markdown file:

- **Subsystem Name**: The name of the subsystem to which this business belongs
- **Business Name**: The name of the business
- **Overview**: A summary of the business
- **Use Cases**: Clarify who uses this business, when/under what circumstances, and for what purpose, using the following structure:
  - **Who (Persona)**: User or system roles
  - **When/Under What Circumstances (Scenario)**: Timing when the business is executed
  - **Purpose (Goal)**: Objectives or expected outcomes of the business
- **Importance**: The importance of the business (e.g., High, Medium, Low)
- **Acceptance Criteria**: Conditions that must be satisfied for the requirement to be considered met
- **Status**: Current state of the requirement (e.g., In Progress, Completed)

## After Completion

- Once requirement definition is complete, obtain user confirmation and proceed to the next stage (Design).
```

### Tips for Creating Chat Modes

Here are some tips for creating custom chat modes:

- Align with the development process: Create chat modes based on the workflow and the deliverables.
- Instruct the LLM to ask the user when unsure: Direct the LLM to request clarification from the user if any information is missing.
- Clarify what deliverables to create and where to save them: Make it explicit which outputs are expected and where they are stored.

The second point is particularly important. Many LLMs tend to respond to user prompts in a sycophantic manner (known as sycophancy). As a result, they may fill in unspecified requirements or perform tasks that were not requested, often with the intention of being helpful. The key difference between Ask/Edit modes and Agent mode is that Agent mode allows the LLM to proactively ask questions and engage in dialogue with the user. However, unless the user explicitly includes a prompt such as "ask if you don't know," the AI rarely initiates questions on its own. By creating a custom chat mode and instructing the LLM to "ask the user when unsure," you can fully leverage the benefits of Agent mode.

### About Tools

You can easily check tool names from the list of available tools in the command palette. Alternatively, it can be convenient to open the custom chat mode file and specify the tool configuration from there. You can specify not only MCP server functionality but also built-in tools and Copilot Extensions.

### Example of Actual Operation

An example interaction when using this chat mode looks like the following:

- The LLM behaves according to the custom instructions defined in the chat mode.
- When you answer questions from GitHub Copilot, the LLM uses that information to reason and proceed with the task.
- However, the output is not guaranteed to be correct (hallucinations may occur), so a human should review the output and make any necessary corrections before committing.

The basic approach to disciplined guardrail-based development has been covered above.
In actual business application development, it is also helpful to understand the following two points:

- Referencing the database schema
- Integrated management of design documents and implementation code (important)

## Reading the Database Schema

In business application development, requirements definition and functional design are often based on the schema information of entities. There are two main ways to let the LLM read schema information:

1. Dynamically read the schema from a development/test DB server using MCP or similar tools.
2. Include a file containing schema information within the project and read from it.

For the first approach, a development/test database can be prepared, and schema information can be read via an MCP server or Copilot Extensions. For SQL Server or Azure SQL Database, an MCP server is available, but its setup can be cumbersome, so using Copilot Extensions is often easier. This approach is often shown online, but it is not recommended for the following reasons:

- Setting up an MCP server or Copilot Extensions can be cumbersome (installation, connection string management, etc.)
- It is time-consuming (the LLM needs schema information → reads the schema → writes code based on it)

Connecting to a DB server via MCP or similar tools is useful for scenarios such as "querying a database in natural language" for non-engineers performing data analysis. However, if the goal is simply to obtain the schema information of entities needed for business application development, the method described below is much simpler.

## Storing Schema Information Within the Project

Place a file containing the schema information inside the project, and write custom instructions so that development refers to this file. Any of the following formats is recommended:

- DDL (full CREATE DATABASE scripts)
- O/R mapper files (e.g., Entity Framework context files)
- Text files documenting schema information, etc.

DDL files are difficult for humans to read, but LLMs can read and accurately understand them with ease. In .NET + SQL development, it is recommended to include both the DDL and EF O/R mapper files. Additionally, if you include links to these files in your architecture documents and chat mode instructions, the LLM can generate code while understanding the schema with high accuracy.
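For instance, a repository might carry a DDL script and the matching EF Core context side by side, with the architecture documents linking to both. The snippet below is only an illustration of that layout; the table, entity, and file names are hypothetical.

```csharp
// db/schema/Orders.sql (kept in the repo alongside the code):
//   CREATE TABLE dbo.Orders (
//       OrderId    INT IDENTITY PRIMARY KEY,
//       CustomerId INT NOT NULL,
//       OrderedAt  DATETIME2 NOT NULL,
//       Total      DECIMAL(18, 2) NOT NULL
//   );

// src/Infrastructure/AppDbContext.cs - the EF Core mapping the LLM reads
using Microsoft.EntityFrameworkCore;

public class Order
{
    public int OrderId { get; set; }
    public int CustomerId { get; set; }
    public DateTime OrderedAt { get; set; }
    public decimal Total { get; set; }
}

public class AppDbContext : DbContext
{
    public DbSet<Order> Orders => Set<Order>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Keep the mapping explicit so it stays in sync with the DDL file
        modelBuilder.Entity<Order>().ToTable("Orders", "dbo");
    }
}
```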
## Integrated Management of Design Documents and Implementation Code

Disciplined guardrail-based development with LLMs has made it practical to synchronize and manage design documents and implementation code together, something that was traditionally very difficult. In long-standing systems, it is common for old design documents to become largely useless. During maintenance, code changes are often prioritized; as a result, updating and maintaining design documents tends to be neglected, leading to a significant divergence between design documents and the actual code.

For these reasons, the following have been considered best practices (though often not followed in reality):

- Limit requirements and external design documents to the minimum necessary.
- Do not create internal design documents; instead, document within the code itself.
- Always update design documents before making changes to the implementation code.

When using LLMs, guardrail-based development makes it easier to enforce a "write the documentation first" workflow. Following the flow of defining specifications, updating the documents, and then writing code also helps the LLM generate appropriate code more reliably. Even if code is written first, LLM-assisted code analysis can significantly reduce the effort required to update the documentation afterward. However, the following points should be noted:

- Create and manage design documents as text files, not Word, Excel, or PowerPoint.
- Use text-based technologies like Mermaid for diagrams.
- Clearly define how design documents correspond to the code.

The last point is especially important. It is crucial to align the structure of requirements and design documents with the structure of the implementation code. For example:

- Place design documents directly alongside the implementation code.
- Align folder structures, e.g., /docs and /src.

Information about grouping methods and folder mapping should be explicitly included in the custom instructions.

## Conclusion: Disciplined Guardrail-Based Development with GitHub Copilot

Formalizing and applying guardrails:

- Define the development flow and architecture documents in .github/copilot-instructions.md using split references.
- Prepare .github/chatmodes/* for each development phase, enforcing "ask the AI if anything is unclear."

Synchronizing documents and implementation code:

- Update docs first → use the diff as the basis for implementation (doc-first).
- Keep docs in text format (Markdown/Mermaid). Fix the folder correspondence between /docs and /src.

Handling schemas:

- Store DDL/O-R mapper files (e.g., EF) in the repository and have the LLM reference them.
- Minimize dynamic DB connections, prioritizing speed, reproducibility, and security.

This disciplined guardrail-based development technique is an AI-assisted approach that significantly improves the quality, maintainability, and team efficiency of enterprise business application development. Adapt it appropriately to each project to maximize productivity in application development.
# Real‑Time AI Streaming with Azure OpenAI and SignalR

## TL;DR

We'll build a real-time AI app where Azure OpenAI streams responses and SignalR broadcasts them live to an Angular client. Users see answers appear incrementally, just like ChatGPT, while Azure SignalR Service handles scale. You'll learn the architecture, streaming code, Angular integration, and optional enhancements like typing indicators and multi-agent scenarios.

## Why This Matters

Modern users expect instant feedback. Waiting for a full AI response feels slow and breaks engagement. Streaming responses:

- Reduce perceived latency: Users see content as it's generated.
- Improve UX: Mimics ChatGPT's typing effect.
- Keep users engaged: Especially for long-form answers.
- Scale for enterprise: Azure SignalR Service handles thousands of concurrent connections.

## What you'll build

- A SignalR Hub that calls Azure OpenAI with streaming enabled and forwards partial output to clients as it arrives.
- An Angular client that connects over WebSockets/SSE to the hub and renders partial content with a typing indicator.
- An optional Azure SignalR Service layer for scalable connection management (thousands to millions of long‑lived connections).

References: SignalR hosting & scale; Azure SignalR Service concepts.

## Architecture

The hub calls Azure OpenAI with streaming enabled (`await foreach` over updates) and broadcasts partials to clients. Azure SignalR Service (optional) offloads connection scale and removes sticky‑session complexity in multi‑node deployments.

References: Streaming code pattern; scale/ARR affinity; Azure SignalR integration.

## Prerequisites

- Azure OpenAI resource with a deployed model (e.g., gpt-4o or gpt-4o-mini)
- .NET 8 API + ASP.NET Core SignalR backend
- Angular 16+ frontend (using @microsoft/signalr)

## Step‑by‑Step Implementation

### 1) Backend: ASP.NET Core + SignalR

Install packages:

```bash
dotnet add package Microsoft.AspNetCore.SignalR
dotnet add package Azure.AI.OpenAI --prerelease
dotnet add package Azure.Identity
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.AI.OpenAI --prerelease

# Optional (managed scale): Azure SignalR Service
dotnet add package Microsoft.Azure.SignalR
```

Using DefaultAzureCredential (Entra ID) avoids storing raw keys in code and is the recommended auth model for Azure services.
Program.cs:

```csharp
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddSignalR();
// To offload connection management to Azure SignalR Service, uncomment:
// builder.Services.AddSignalR().AddAzureSignalR();

builder.Services.AddSingleton<AiStreamingService>();

var app = builder.Build();
app.MapHub<ChatHub>("/chat");
app.Run();
```

AiStreamingService.cs (streams content from Azure OpenAI):

```csharp
using Microsoft.Extensions.AI;
using Azure.AI.OpenAI;
using Azure.Identity;

public class AiStreamingService
{
    private readonly IChatClient _chatClient;

    public AiStreamingService(IConfiguration config)
    {
        var endpoint = new Uri(config["AZURE_OPENAI_ENDPOINT"]!);
        var deployment = config["AZURE_OPENAI_DEPLOYMENT"]!; // e.g., "gpt-4o-mini"

        var azureClient = new AzureOpenAIClient(endpoint, new DefaultAzureCredential());
        _chatClient = azureClient.GetChatClient(deployment).AsIChatClient();
    }

    public async IAsyncEnumerable<string> StreamReplyAsync(string userMessage)
    {
        var messages = new List<ChatMessage>
        {
            ChatMessage.CreateSystemMessage("You are a helpful assistant."),
            ChatMessage.CreateUserMessage(userMessage)
        };

        await foreach (var update in _chatClient.CompleteChatStreamingAsync(messages))
        {
            // Only text parts; ignore tool calls/annotations
            var chunk = string.Join("", update.Content
                .Where(p => p.Kind == ChatMessageContentPartKind.Text)
                .Select(p => ((TextContent)p).Text));

            if (!string.IsNullOrEmpty(chunk))
                yield return chunk;
        }
    }
}
```

Modern .NET AI extensions (Microsoft.Extensions.AI) expose a unified streaming pattern via CompleteChatStreamingAsync.

ChatHub.cs (pushes partials to the caller):

```csharp
using Microsoft.AspNetCore.SignalR;

public class ChatHub : Hub
{
    private readonly AiStreamingService _ai;

    public ChatHub(AiStreamingService ai) => _ai = ai;

    // Client calls: connection.invoke("AskAi", prompt)
    public async Task AskAi(string prompt)
    {
        var messageId = Guid.NewGuid().ToString("N");
        await Clients.Caller.SendAsync("typing", messageId, true);

        await foreach (var partial in _ai.StreamReplyAsync(prompt))
        {
            await Clients.Caller.SendAsync("partial", messageId, partial);
        }

        await Clients.Caller.SendAsync("typing", messageId, false);
        await Clients.Caller.SendAsync("completed", messageId);
    }
}
```
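Sending every token as its own SignalR message can flood the connection on fast models. One optional variation (not part of the sample above) is to buffer partials in the hub and flush them in small batches; the threshold of 40 characters here is an arbitrary illustration:

```csharp
using System.Text;

// Variation on ChatHub.AskAi that batches partials before sending
public async Task AskAi(string prompt)
{
    var messageId = Guid.NewGuid().ToString("N");
    await Clients.Caller.SendAsync("typing", messageId, true);

    var buffer = new StringBuilder();
    await foreach (var partial in _ai.StreamReplyAsync(prompt))
    {
        buffer.Append(partial);

        // Flush roughly every 40 characters to cut message volume without hurting UX
        if (buffer.Length >= 40)
        {
            await Clients.Caller.SendAsync("partial", messageId, buffer.ToString());
            buffer.Clear();
        }
    }

    // Flush whatever remains before signaling completion
    if (buffer.Length > 0)
        await Clients.Caller.SendAsync("partial", messageId, buffer.ToString());

    await Clients.Caller.SendAsync("typing", messageId, false);
    await Clients.Caller.SendAsync("completed", messageId);
}
```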
### 2) Frontend: Angular client with @microsoft/signalr

Install the SignalR client:

```bash
npm i @microsoft/signalr
```

Create a SignalR service (Angular):

```typescript
// src/app/services/ai-stream.service.ts
import { Injectable } from '@angular/core';
import * as signalR from '@microsoft/signalr';
import { BehaviorSubject, Observable } from 'rxjs';

@Injectable({ providedIn: 'root' })
export class AiStreamService {
  private connection?: signalR.HubConnection;
  private typing$ = new BehaviorSubject<boolean>(false);
  private partial$ = new BehaviorSubject<string>('');
  private completed$ = new BehaviorSubject<boolean>(false);

  get typing(): Observable<boolean> { return this.typing$.asObservable(); }
  get partial(): Observable<string> { return this.partial$.asObservable(); }
  get completed(): Observable<boolean> { return this.completed$.asObservable(); }

  async start(): Promise<void> {
    this.connection = new signalR.HubConnectionBuilder()
      .withUrl('/chat') // same origin; use absolute URL if CORS
      .withAutomaticReconnect()
      .configureLogging(signalR.LogLevel.Information)
      .build();

    this.connection.on('typing', (_id: string, on: boolean) => this.typing$.next(on));
    this.connection.on('partial', (_id: string, text: string) => {
      // Append incremental content
      this.partial$.next((this.partial$.value || '') + text);
    });
    this.connection.on('completed', (_id: string) => this.completed$.next(true));

    await this.connection.start();
  }

  async ask(prompt: string): Promise<void> {
    // Reset state per request
    this.partial$.next('');
    this.completed$.next(false);
    await this.connection?.invoke('AskAi', prompt);
  }
}
```

Angular component:

```typescript
// src/app/components/ai-chat/ai-chat.component.ts
import { Component, OnInit } from '@angular/core';
import { AiStreamService } from '../../services/ai-stream.service';

@Component({
  selector: 'app-ai-chat',
  templateUrl: './ai-chat.component.html',
  styleUrls: ['./ai-chat.component.css']
})
export class AiChatComponent implements OnInit {
  prompt = '';
  output = '';
  typing = false;
  done = false;

  constructor(private ai: AiStreamService) {}

  async ngOnInit() {
    await this.ai.start();
    this.ai.typing.subscribe(on => this.typing = on);
    this.ai.partial.subscribe(text => this.output = text);
    this.ai.completed.subscribe(done => this.done = done);
  }

  async send() {
    this.output = '';
    this.done = false;
    await this.ai.ask(this.prompt);
  }
}
```

HTML template:

```html
<!-- src/app/components/ai-chat/ai-chat.component.html -->
<div class="chat">
  <div class="prompt">
    <input [(ngModel)]="prompt" placeholder="Ask me anything…" />
    <button (click)="send()">Send</button>
  </div>
  <div class="response">
    <pre>{{ output }}</pre>
    <div class="typing" *ngIf="typing">Assistant is typing…</div>
    <div class="done" *ngIf="done">✓ Completed</div>
  </div>
</div>
```

## Streaming modes, content filters, and UX

Azure OpenAI streaming interacts with content filtering in two ways:

- Default streaming: The service buffers output into content chunks and runs content filters before each chunk is emitted; you still stream, but not necessarily token‑by‑token.
- Asynchronous Filter (optional): The service returns token‑level updates immediately and runs filters asynchronously. You get ultra‑smooth streaming but must handle delayed moderation signals (e.g., redaction or halting the stream).

Best practices:

- Append partials in small batches client‑side to avoid DOM thrash; finalize formatting on "completed".
- Log full messages server‑side only after completion to keep histories consistent (mirrors agent frameworks).

## Security & compliance

- Auth: Prefer Microsoft Entra ID (DefaultAzureCredential) to avoid key sprawl; use RBAC and managed identities where possible.
- Secrets: Store Azure SignalR connection strings in Key Vault and rotate them periodically; never hardcode.
- CORS & cross‑domain: When hosting the frontend and hub on different origins, configure CORS and use absolute URLs in withUrl(...).

## Connection management & scaling tips

- Persistent connection load: SignalR consumes TCP resources; separate heavy real‑time workloads or use Azure SignalR to protect other apps.
- Sticky sessions (self‑hosted): Required in most multi‑server scenarios unless WebSockets‑only + SkipNegotiation applies; Azure SignalR removes this requirement.

## Learn more

- AI‑Powered Group Chat sample (ASP.NET Core)
- Azure OpenAI .NET client (auth & streaming)
- SignalR JavaScript Client
# Simplifying Microservice Reliability with Dapr

## What is Dapr?

Dapr is an open-source runtime, originally developed by Microsoft, for building resilient, event-driven, and portable applications. It works using the sidecar pattern: every microservice gets a small companion container, the Dapr sidecar, which handles communication, retries, secrets, state, and more.

What is a sidecar? A sidecar is a helper process that runs beside your app, handling system tasks so your code can focus on business logic.

Let's look at some of Dapr's offerings, with examples.

## #1. Bindings

Connects your app to external systems (like queues, email, or storage) with zero SDK or protocol handling.

Without Dapr ❌

```csharp
var httpClient = new HttpClient();
await httpClient.PostAsJsonAsync("https://api.sendgrid.com/send", email);
```

- Manage HTTP endpoints & credentials
- Change provider → rewrite logic

With Dapr ✅

```csharp
await daprClient.InvokeBindingAsync("send-email", "create", email);
```

- One call, no SDK
- Replace SendGrid → SMTP → Twilio just by editing config
- No code change, no redeploy

How to enable a binding for an Azure Container App:

1. Open Azure Portal → go to your Container App Environment.
2. From the left pane, click on Container Apps, and choose your desired app (e.g., orders-api).
3. In the Settings section, select Dapr.
4. Enable the Dapr toggle → switch it ON.
5. Provide the basic Dapr settings:
   - App ID: A unique name (e.g., orders-app).
   - App Port: The internal port your API listens on (e.g., 8080).
   - App Protocol: Choose HTTP or gRPC (usually HTTP).
6. Click Save to apply.
7. Now, under the same Container App Environment, go to Dapr Components.
8. Click Create → select Binding → choose the type of binding (e.g., azure.storagequeues).

## #2. Configuration

Centralizes app settings, allowing live configuration updates without redeploying services.

Without Dapr ❌

```csharp
var featureFlag = Configuration["FeatureX"];
```

- Requires redeploys for every config change
- No centralized versioning or dynamic updates

With Dapr ✅

```csharp
var config = await daprClient.GetConfiguration("appconfigstore", new[] { "FeatureX" });
```

- Use Azure App Config, Consul, or any provider
- Centralized updates, no redeploys
- Consistent access via the Dapr SDK

How to enable configuration for an Azure Container App: follow the same steps as for bindings; in Dapr Components, click Create → select Configuration → choose the type (e.g., configuration.azure.appconfig).

## #3. Pub/Sub

Enables event-driven communication between microservices without needing to know each other's endpoints.

Without Dapr ❌

```csharp
var client = new ServiceBusClient("<connection-string>");
var sender = client.CreateSender("order-topic");
await sender.SendMessageAsync(new ServiceBusMessage(orderJson));
```

- Tied to Azure Service Bus
- Must manage SDKs, connections, retries
- Hard to switch to another broker (Kafka, RabbitMQ)

With Dapr ✅

```csharp
await daprClient.PublishEventAsync("pubsub", "order-created", order);
```

- pubsub component defined in YAML (can be Kafka, Redis Streams, etc.)
- No SDK, no broker dependency
- Just publish the event; Dapr handles transport & retries

How to enable pub/sub for an Azure Container App: follow the same steps as for bindings; in Dapr Components, click Create → select Pub/Sub → choose the type (e.g., pubsub.azure.servicebus.topics).
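On the consuming side, Dapr delivers topic events to a plain HTTP endpoint in your app. A minimal subscriber sketch, assuming the Dapr.AspNetCore package and following Dapr's minimal-API pattern (the Order record and route are illustrative; component and topic names match the publisher example above):

```csharp
using Dapr;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

app.UseCloudEvents();      // Unwrap the CloudEvents envelope Dapr publishes with
app.MapSubscribeHandler(); // Expose /dapr/subscribe so the sidecar discovers topics

// Dapr delivers each "order-created" event from the "pubsub" component here
app.MapPost("/orders", [Topic("pubsub", "order-created")] (Order order) =>
{
    Console.WriteLine($"Received order {order.Id}");
    return Results.Ok();
});

app.Run();

public record Order(string Id, decimal Total);
```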
## #4. Secret Stores

Securely retrieves credentials and secrets from vaults, keeping them out of configs and code.

Without Dapr ❌

```csharp
var connString = Configuration["ConnectionStrings:DB"];
```

- Secrets stored in configs or env vars
- Risk of leaks and manual rotation

With Dapr ✅

```csharp
var secret = await daprClient.GetSecretAsync("vault", "dbConnection");
```

- Fetch directly from Azure Key Vault, AWS Secrets Manager, etc.
- No secrets in configs
- Secure by default, consistent across services

How to enable secret stores for an Azure Container App: follow the same steps as for bindings; in Dapr Components, click Create → select Secret stores → choose the type (e.g., secretstores.azure.keyvault).

## #5. State

Provides a consistent way to store and retrieve application data across services using a simple API.

Without Dapr ❌

```csharp
var cosmosClient = new CosmosClient(connStr);
var container = cosmosClient.GetContainer("db", "state");
await container.UpsertItemAsync(order);
```

- Direct dependency on Cosmos DB
- Manual retry logic
- Tight coupling to storage type

With Dapr ✅

```csharp
await daprClient.SaveStateAsync("statestore", "order-101", order);
var data = await daprClient.GetStateAsync<Order>("statestore", "order-101");
```

- Plug in any backend (Redis, Cosmos DB, PostgreSQL)
- Dapr handles retries and consistency
- Same code, different backend: total flexibility

How to enable state for an Azure Container App: follow the same steps as for bindings; in Dapr Components, click Create → select State → choose the type (e.g., state.azure.cosmosdb).
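In all of the "With Dapr ✅" snippets above, daprClient comes from the Dapr .NET SDK. Registering it once makes it injectable everywhere; a minimal setup sketch, assuming the Dapr.AspNetCore package's AddDaprClient extension (the endpoint and Order record are illustrative):

```csharp
using Dapr.Client;

var builder = WebApplication.CreateBuilder(args);

// Registers DaprClient with dependency injection
builder.Services.AddDaprClient();

var app = builder.Build();

// Example: the state store call from section #5, via the injected client
app.MapGet("/orders/{id}", async (string id, DaprClient dapr) =>
    await dapr.GetStateAsync<Order>("statestore", id) is { } order
        ? Results.Ok(order)
        : Results.NotFound());

app.Run();

public record Order(string Id, decimal Total);
```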
## 🧩 Summary

Think of Dapr as your invisible co-pilot for building distributed apps. It abstracts away all the repetitive plumbing (state management, pub/sub messaging, secret handling, and external bindings), letting you focus on writing features that matter. With Dapr, you don't just write code that runs locally; you write code that just works across clouds, containers, and environments, without having to wire up retries, event delivery, or service-to-service communication manually.

## 🧰 Demo Source Code

I've prepared a complete .NET Core sample that touches all the major Dapr features:

- State Store
- Pub/Sub
- Bindings
- Configuration
- Secret Store

You can explore it at Github-Dapr-Api. Clone it, run it locally, and experiment; the project uses in-memory storage to keep things lightweight for testing and learning.

## 📚 References for Deep Dive

- Official Dapr Docs
- Dapr for .NET Developers (Microsoft Learn)
- Dapr .NET SDK GitHub
# Part 3: Client-Side Multi-Agent Orchestration on Azure App Service with Microsoft Agent Framework

In Part 2 of this series, I showed you how to build sophisticated multi-agent systems on Azure App Service using Azure AI Foundry Agents: server-side managed agents that run as Azure resources. Now I want to show you an alternative that gives you full control over agent orchestration, chat history management, and provider flexibility: client-side agents using ChatClientAgent.

This alternative raises an important question: how do you choose between client-side and server-side agents? It points to a fundamental choice in Agent Framework. I'm not going to go into extreme detail here; my goal for this post is to show you how to build client-side multi-agent systems with ChatClientAgent and Azure App Service, but I will highlight the key differences and trade-offs I weigh when making this choice, to help you make an informed decision.

In Part 2, I mentioned:

"In my next blog post, I'll demonstrate an alternative approach using a different agent type—likely the Azure OpenAI ChatCompletion agent type—which doesn't create server-side Foundry resources. Instead, you orchestrate the agent behavior yourself while still benefiting from the Agent Framework's unified programming model."

Today, I'm delivering on that promise! We're going to rebuild the same travel planner sample using ChatClientAgent, a client-side agent type that gives you complete control over orchestration, chat history, and agent lifecycle.

In this post, we'll explore:

- ✅ Client-side agent orchestration with full workflow control
- ✅ ChatClientAgent architecture and implementation patterns
- ✅ When to choose client-side vs. server-side agents
- ✅ How Azure App Service supports both approaches equally well
- ✅ Managing chat history your way (Cosmos DB, Redis, or any storage you choose)

🔗 Full Sample Code: https://github.com/Azure-Samples/app-service-maf-openai-travel-agent-dotnet

## The Key Question: Who's in Charge?

When building multi-agent systems with Agent Framework, you face a fundamental architectural decision. Microsoft Agent Framework supports multiple agent types, but the choice typically comes down to:

Server-Side (Foundry Agents - Part 2)
- Azure AI Foundry manages agent lifecycle, threads, and execution
- Agents exist as Azure resources in your AI Project
- Conversation history is stored in Foundry threads
- Built-in orchestration patterns and features

Client-Side (ChatClientAgent - This Post)
- Your application code manages agent lifecycle and orchestration
- Agents are C# objects created on-demand
- Conversation history is stored wherever you choose (Cosmos DB, Redis, etc.)
- You write the orchestration logic yourself

Both approaches run well on Azure App Service; the platform doesn't care which agent type you use. What matters is which approach fits your requirements better.

## What's Different: ChatClientAgent Architecture

Let's see what changes when you switch from Foundry agents to ChatClientAgent.
### The Same Multi-Agent Workflow

Both samples implement the exact same travel planner with 6 specialized agents:

1. Currency Converter Agent - Real-time exchange rates
2. Weather Advisor Agent - Forecasts and packing tips
3. Local Knowledge Agent - Cultural insights
4. Itinerary Planner Agent - Day-by-day schedules
5. Budget Optimizer Agent - Cost allocation
6. Coordinator Agent - Final assembly

The agents collaborate through the same 4-phase workflow:

- Phase 1: Parallel information gathering (Currency + Weather + Local)
- Phase 2: Itinerary planning
- Phase 3: Budget optimization
- Phase 4: Final assembly

Same workflow, different execution model.

### How ChatClientAgent Works

Here's the architecture stack for the client-side approach:

- Your Application Code: TravelPlanningWorkflow orchestrating 6 ChatClientAgents with client-side chat history
- Microsoft.Agents.AI: ChatClientAgent wrapper adding instructions and tools
- Microsoft.Extensions.AI: IChatClient abstraction with Azure OpenAI implementation
- Azure Services: Azure OpenAI, Cosmos DB for chat history, and external APIs

Key components:

- TravelPlanningWorkflow - Your orchestration code that coordinates agent execution
- ChatClientAgent - Agent Framework wrapper that adds instructions and tools to IChatClient
- IChatClient - Standard abstraction from Microsoft.Extensions.AI
- Client-Side Chat History - Dictionary storing conversation per agent (you manage this!)
- Azure OpenAI - Direct chat completion API calls (no AI Project endpoint needed)
- Cosmos DB - Your choice for chat history persistence

### Implementation: BaseAgent Pattern

Here's how you create a ChatClientAgent in code:

```csharp
public abstract class BaseAgent : IAgent
{
    protected readonly ChatClientAgent Agent;

    protected abstract string AgentName { get; }
    protected abstract string Instructions { get; }

    // Constructor for simple agents without tools
    protected BaseAgent(
        ILogger logger,
        IOptions<AgentOptions> options,
        IChatClient chatClient)
    {
        Agent = new ChatClientAgent(chatClient, new ChatClientAgentOptions
        {
            Name = AgentName,
            Instructions = Instructions
        });
    }

    // Constructor for agents with tools (weather, currency APIs)
    protected BaseAgent(
        ILogger logger,
        IOptions<AgentOptions> options,
        IChatClient chatClient,
        ChatOptions chatOptions)
    {
        Agent = new ChatClientAgent(chatClient, new ChatClientAgentOptions
        {
            Name = AgentName,
            Instructions = Instructions,
            ChatOptions = chatOptions // Tools via AIFunctionFactory
        });
    }

    public async Task<ChatMessage> InvokeAsync(
        IList<ChatMessage> chatHistory,
        CancellationToken cancellationToken = default)
    {
        var response = await Agent.RunAsync(
            chatHistory,
            thread: null,
            options: null,
            cancellationToken);

        return response.Messages.LastOrDefault()
            ?? new ChatMessage(ChatRole.Assistant, "No response generated.");
    }
}
```

What's happening here?

1. You create a ChatClientAgent by wrapping an IChatClient
2. You provide instructions (the agent's system prompt)
3. Optionally, you provide tools via ChatOptions (using AIFunctionFactory)
4. When you call RunAsync, you pass the chat history yourself
5. The agent returns a response, and you decide what to do with the chat history

Compare this to Foundry agents, where you create the agent once in Azure AI Foundry and the platform manages threads and execution for you.
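A concrete agent then just fills in the name and instructions. The subclass below is an illustrative sketch following the sample's pattern; the WeatherAdvisorAgent name and its instruction text are hypothetical, not taken from the repository:

```csharp
public class WeatherAdvisorAgent : BaseAgent
{
    protected override string AgentName => "WeatherAdvisor";

    protected override string Instructions =>
        "You are a weather advisor. Provide forecasts and practical packing " +
        "recommendations for the traveler's destination and dates.";

    // Tools (e.g., the weather API functions) arrive via ChatOptions,
    // using the BaseAgent constructor for agents with tools
    public WeatherAdvisorAgent(
        ILogger<WeatherAdvisorAgent> logger,
        IOptions<AgentOptions> options,
        IChatClient chatClient,
        ChatOptions chatOptions)
        : base(logger, options, chatClient, chatOptions)
    {
    }
}
```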
Client-Side Chat History Management

One of the biggest differences is that you control the chat history:

public class WorkflowState
{
    // Each agent gets its own conversation history
    public Dictionary<string, List<ChatMessage>> AgentChatHistories { get; set; } = new();

    public List<ChatMessage> GetChatHistory(string agentType)
    {
        if (!AgentChatHistories.ContainsKey(agentType))
        {
            AgentChatHistories[agentType] = new List<ChatMessage>();
        }
        return AgentChatHistories[agentType];
    }
}

Workflow orchestration:

// Phase 1: Currency Converter Agent
var currencyChatHistory = state.GetChatHistory("CurrencyConverter");
currencyChatHistory.Add(new ChatMessage(ChatRole.User,
    $"Convert {request.Budget} {request.Currency} to local currency for {request.Destination}"));

var currencyResponse = await _currencyAgent.InvokeAsync(currencyChatHistory, cancellationToken);
currencyChatHistory.Add(currencyResponse); // You manage the history!

// Store in workflow state for downstream agents
state.AddToContext("CurrencyInfo", currencyResponse.Text ?? "");

Benefits:

- Store chat history in Cosmos DB, Redis, SQL, or any data store (a persistence sketch follows this section)
- Query conversation history with your own logic
- Implement custom retention policies
- Export chat logs for analytics or compliance

With Foundry agents, chat history lives in Foundry threads—you don't directly control where or how it's stored. That's fine for many scenarios, but if you have custom storage or compliance requirements, client-side management is powerful.
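Because the state is just a serializable object, persisting it is straightforward. Here is a minimal sketch of a Cosmos DB-backed store, assuming the WorkflowState carries a string Id property that is serialized as the document id, and assuming database and container names that are purely illustrative:

// A minimal sketch of persisting WorkflowState with Microsoft.Azure.Cosmos.
// Assumes WorkflowState has a string Id property mapped to the Cosmos "id"
// field; database/container names below are placeholders.
using Microsoft.Azure.Cosmos;

public class WorkflowStateStore
{
    private readonly Container _container;

    public WorkflowStateStore(CosmosClient client)
    {
        _container = client.GetContainer("travel-planner", "workflow-state");
    }

    public Task SaveAsync(WorkflowState state, CancellationToken ct = default) =>
        // Upsert keeps the latest snapshot of every agent's conversation
        _container.UpsertItemAsync(state, new PartitionKey(state.Id), cancellationToken: ct);

    public async Task<WorkflowState?> LoadAsync(string id, CancellationToken ct = default)
    {
        try
        {
            var response = await _container.ReadItemAsync<WorkflowState>(
                id, new PartitionKey(id), cancellationToken: ct);
            return response.Resource;
        }
        catch (CosmosException ex) when (ex.StatusCode == System.Net.HttpStatusCode.NotFound)
        {
            return null; // No state yet for this workflow
        }
    }
}

Swapping this store for Redis or SQL would only change the class internals; the workflow code keeps calling SaveAsync and LoadAsync.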
Tool Integration with AIFunctionFactory

External API tools (weather, currency) are registered as C# methods:

// Weather Service
public class NWSWeatherService : IWeatherService
{
    [Description("Get weather forecast for a US city")]
    public async Task<WeatherForecast> GetWeatherAsync(
        [Description("City name (e.g., 'San Francisco')")] string city,
        [Description("State code (e.g., 'CA')")] string state,
        CancellationToken cancellationToken = default)
    {
        // Implementation calls NWS API
    }
}

// Register as tools with ChatClientAgent
var weatherTools = AIFunctionFactory.Create(weatherService);
var chatOptions = new ChatOptions { Tools = weatherTools };

var agent = new ChatClientAgent(chatClient, new ChatClientAgentOptions
{
    Name = "WeatherAdvisor",
    Instructions = "Provide weather forecasts and packing recommendations...",
    ChatOptions = chatOptions
});

The agent can now call GetWeatherAsync via function calling—the same capability as Foundry agents, but configured in code instead of the portal.

Why Choose Client-Side Agents (ChatClientAgent)?

Here's when ChatClientAgent shines:

✅ Full Orchestration Control

You write the workflow logic:

// Phase 1: Run 3 agents in parallel (your code!)
var currencyTask = GatherCurrencyInfoAsync(request, state, progress, cancellationToken);
var weatherTask = GatherWeatherInfoAsync(request, state, progress, cancellationToken);
var localTask = GatherLocalKnowledgeAsync(request, state, progress, cancellationToken);
await Task.WhenAll(currencyTask, weatherTask, localTask);

// Phase 2: Sequential itinerary planning (your code!)
await PlanItineraryAsync(request, state, progress, cancellationToken);

With Foundry agents, orchestration patterns are limited to what the platform provides.

✅ Cost-Effective

No separate agent infrastructure:

- ChatClientAgent: Pay only for Azure OpenAI API calls
- Foundry Agents: Pay for Azure OpenAI + AI Project resources + agent storage

For high-volume scenarios, this can add up to significant savings.

✅ DevOps-Friendly

Everything in code:

- Agent definitions tracked in Git
- Testable with unit tests
- CI/CD pipelines deploy everything together
- No manual portal configuration steps
- Infrastructure as Code (Bicep) covers all resources

✅ Flexible Chat History

Store conversations your way:

- Cosmos DB for global distribution and rich queries
- Redis for ultra-low latency caching
- SQL Database for complex relational queries
- Blob Storage for long-term archival
- Custom encryption and retention policies

✅ Provider Flexibility

Works with any IChatClient:

- Azure OpenAI (this sample)
- OpenAI directly
- Local models via Ollama
- Azure AI Foundry model catalog
- Custom chat implementations

Switching providers is just a configuration change—no agent re-creation needed.

✅ Multi-Agent Coordination Patterns

Implement complex workflows:

- Parallel execution (Phase 1 in our sample)
- Sequential dependencies (Phase 2-4)
- Conditional branching based on agent responses
- Agent-to-agent negotiation
- Hierarchical supervisor patterns
- Custom retry logic per agent (see the sketch after this list)

You have complete freedom to orchestrate however your scenario requires.
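As one example of that freedom, per-agent retry logic can be a small wrapper around InvokeAsync. This is a minimal sketch, assuming transient failures surface as exceptions; the RunWithRetryAsync helper and its backoff values are illustrative, not part of Agent Framework or the sample:

// A minimal sketch of per-agent retry with exponential backoff.
// RunWithRetryAsync is a hypothetical helper built on the sample's IAgent.
static async Task<ChatMessage> RunWithRetryAsync(
    IAgent agent,
    IList<ChatMessage> chatHistory,
    int maxAttempts = 3,
    CancellationToken cancellationToken = default)
{
    for (var attempt = 1; ; attempt++)
    {
        try
        {
            return await agent.InvokeAsync(chatHistory, cancellationToken);
        }
        catch (Exception) when (attempt < maxAttempts)
        {
            // Exponential backoff: 1s, 2s, 4s, ...
            await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt - 1)), cancellationToken);
        }
    }
}

// Usage: var response = await RunWithRetryAsync(_currencyAgent, currencyChatHistory);

Because you own the orchestration loop, policies like this can differ per agent: a flaky external API might deserve five attempts, while a pure-LLM agent might need only one.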
Why Choose Server-Side Agents (Azure AI Foundry)?

To be fair, Foundry agents from Part 2 have their own advantages, and this post isn't about dismissing them. They are a powerful option for many scenarios. Here are some reasons to choose Foundry agents:

✅ Managed Lifecycle

Platform handles the heavy lifting:

- Agents persist as Azure resources
- Threads automatically manage conversation state
- Runs track execution progress server-side
- No orchestration code to write or maintain

✅ Built-In Features

Rich capabilities out of the box:

- File search for RAG scenarios
- Code interpreter for data analysis
- Automatic conversation threading
- Built-in retry and error handling

✅ Portal UI

Configure without code:

- Create agents in the Azure AI Foundry portal
- Test agents interactively
- View conversation threads and runs
- Adjust instructions without redeployment

✅ Less Code

Simpler for basic scenarios:

// Foundry Agent (Part 2 sample)
var agent = await agentsClient.CreateAgentAsync(
    "gpt-4o",
    instructions: "You are a travel planning expert...",
    tools: new List<ToolDefinition> { new FunctionTool(...) });

var thread = await agentsClient.CreateThreadAsync();
var run = await agentsClient.CreateRunAsync(thread.Id, agent.Id);

No need to manage chat history, orchestration logic, or tool registration in code.

When to Choose Which Approach

Here's my take on a decision guide. It isn't exhaustive, but it covers the key considerations, and others may weigh them differently based on their priorities:

| Requirement | ChatClientAgent | Foundry Agents |
|---|---|---|
| Complex multi-agent workflows | ✅ Full control | ⚠️ Limited patterns |
| Custom chat history storage | ✅ Any data store | ❌ Foundry threads only |
| Cost optimization | ✅ LLM calls only | ⚠️ + Infrastructure |
| Code-first DevOps | ✅ Everything in Git | ⚠️ Portal config needed |
| Provider flexibility | ✅ Any IChatClient | ⚠️ Azure only |
| Built-in RAG (file search) | ❌ DIY | ✅ Built-in |
| Portal UI for testing | ❌ Code only | ✅ Full UI |
| Quick prototypes | ⚠️ More code | ✅ Fast setup |
| Learning curve | ⚠️ More concepts | ✅ Guided setup |

Use ChatClientAgent when:

- You need complex multi-agent coordination
- Cost optimization is important
- You want full control over orchestration
- Code-first DevOps is a priority
- You need custom chat history management

Use Foundry Agents when:

- You have simple single-agent or basic multi-agent scenarios
- You want built-in RAG and file search
- Portal-based configuration is preferred
- You need quick prototyping and experimentation
- You prefer managed infrastructure over custom code

Azure App Service: Perfect for Both

Here's the great part: Azure App Service supports both approaches equally well.

The Same Architecture

Both samples use identical infrastructure. What's the same:

✅ Async request-reply pattern (202 Accepted → poll status; a sketch follows below)
✅ Service Bus for reliable message delivery
✅ Cosmos DB for task state with 24-hour TTL
✅ WebJob for background processing
✅ Managed Identity for authentication
✅ Premium App Service tier for Always On

What's different:

- ChatClientAgent: Azure OpenAI endpoint directly (https://ai-xyz.openai.azure.com/)
- Foundry Agents: AI Project endpoint (https://ai-xyz.services.ai.azure.com/api/projects/proj-xyz)
- ChatClientAgent: Chat history in Cosmos DB (your control)
- Foundry Agents: Chat history in Foundry threads (platform managed)

Azure App Service doesn't care which you choose. It just runs your .NET code, processes messages from Service Bus, and stores state in Cosmos DB. The agent execution model is an implementation detail. You can easily switch between approaches without changing your hosting platform, and even use a hybrid approach if desired.
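To make that shared pattern concrete, here is a minimal sketch of the front-end half of the async request-reply flow. The route names, the TravelRequest shape, the injected ServiceBusSender, and the WorkflowStateStore from the earlier sketch are all assumptions for illustration, not the sample's exact code:

// A minimal sketch of async request-reply: 202 Accepted, then poll status.
using System.Text.Json;
using Azure.Messaging.ServiceBus;
using Microsoft.AspNetCore.Mvc;

public record TravelRequest(string Destination, decimal Budget, string Currency);

[ApiController]
[Route("api/travel-plans")]
public class TravelPlansController : ControllerBase
{
    private readonly ServiceBusSender _sender;       // registered in DI (assumption)
    private readonly WorkflowStateStore _stateStore; // Cosmos-backed store sketched earlier

    public TravelPlansController(ServiceBusSender sender, WorkflowStateStore stateStore)
    {
        _sender = sender;
        _stateStore = stateStore;
    }

    [HttpPost]
    public async Task<IActionResult> Submit(TravelRequest request, CancellationToken ct)
    {
        var taskId = Guid.NewGuid().ToString();

        // Queue the work for the background WebJob
        var payload = JsonSerializer.Serialize(new { taskId, request });
        await _sender.SendMessageAsync(new ServiceBusMessage(payload), ct);

        // 202 Accepted with a location the client can poll
        return AcceptedAtAction(nameof(GetStatus), new { id = taskId }, new { taskId });
    }

    [HttpGet("{id}/status")]
    public async Task<IActionResult> GetStatus(string id, CancellationToken ct)
    {
        var state = await _stateStore.LoadAsync(id, ct);
        return state is null ? NotFound() : Ok(state);
    }
}

The WebJob consumes the queued message, runs the multi-agent workflow, and writes progress to the same state store that the status endpoint reads.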
Get Started Today

Ready to try client-side multi-agent orchestration on Azure App Service?

🔗 GitHub Repository: https://github.com/Azure-Samples/app-service-maf-openai-travel-agent-dotnet

The repository includes:

✅ Complete .NET 9 source code with 6 specialized ChatClientAgents
✅ Infrastructure as Code (Bicep) for one-command deployment
✅ Web UI with real-time progress tracking
✅ Comprehensive README and architecture documentation
✅ External API integrations (weather, currency)
✅ Client-side chat history management with Cosmos DB

Deploy in Minutes

# Clone the repository
git clone https://github.com/Azure-Samples/app-service-maf-openai-travel-agent-dotnet.git
cd app-service-maf-openai-travel-agent-dotnet

# Login to Azure
azd auth login

# Provision infrastructure and deploy the API
azd up

This provisions:

- Azure App Service (P0v4 Premium Windows)
- Azure Service Bus (message queue)
- Azure Cosmos DB (state + chat history storage)
- Azure AI Services (AI Services resource)
- GPT-4o model deployment (GlobalStandard 50K TPM)

Then manually deploy the WebJob following the README instructions.

Compare with Part 2

Want to see the differences firsthand? Deploy both samples:

Part 2 - Server-Side Foundry Agents:
🔗 https://github.com/Azure-Samples/app-service-maf-workflow-travel-agent-dotnet

Part 3 - Client-Side ChatClientAgent (this post):
🔗 https://github.com/Azure-Samples/app-service-maf-openai-travel-agent-dotnet

Same travel planner, same workflow, same results—different execution model. Try both and see which fits your needs!

Key Takeaways

✅ Microsoft Agent Framework offers choice: client-side (ChatClientAgent) vs. server-side (Foundry Agents)
✅ ChatClientAgent gives you full control: orchestration, chat history, agent lifecycle—you manage it all in code
✅ Foundry Agents give you convenience: managed infrastructure, built-in features, portal UI—let the platform handle the details
✅ Azure App Service supports both equally: same async request-reply pattern, same WebJob architecture, same infrastructure
✅ Pick the right tool for your needs: complex coordination and cost control → ChatClientAgent; simple scenarios and managed infrastructure → Foundry Agents

Whether you choose client-side or server-side agents, Azure App Service provides the perfect platform for long-running AI workloads—reliable, scalable, and fully managed.

What's Next?

This completes our three-part series on building AI agents with Microsoft Agent Framework on Azure App Service:

Part 1: Introduction to Agent Framework and async request-reply pattern
Part 2: Multi-agent systems with server-side Foundry Agents
Part 3 (this post): Client-side multi-agent orchestration with ChatClientAgent

What would you like to see next? More advanced orchestration patterns? Integration with other Azure services? Let me know in the comments what you'd like to learn about next and I'll do my best to deliver!

Part 2: Build Long-Running AI Agents on Azure App Service with Microsoft Agent Framework
Last week, I shared how to build long-running AI agents on Azure App Service with Microsoft Agent Framework. If you haven't seen that post yet, I recommend starting there, as this post builds on the foundations it introduces, including getting started with Microsoft Agent Framework. The response so far has been great, and one comment in particular stood out:

"Thanks for the example. Nice job! Just curious (I still have to investigate the ins and outs of MAF) but why didn't you use the workflow pattern/classes of MAF? I thought that was meant to be the way to connect agents and let them cooperate (even in long running job situations)." — Michel_Schep

Great question! You're absolutely right to ask—the initial sample I created was designed to demonstrate the async request-reply architecture for handling long-running operations on App Service with a single agent. Today, we're taking the next step: a multi-agent workflow sample that addresses exactly what you asked about and is the next leap in building agentic apps in the cloud.

In this post, we'll explore:

✅ Building multi-agent systems with specialized, collaborating AI agents
✅ When to create agents in code vs. using the Azure AI Foundry portal
✅ Orchestrating complex workflows with parallel and sequential execution
✅ Real-world patterns for production multi-agent applications

🔗 Full Sample Code: https://github.com/Azure-Samples/app-service-maf-workflow-travel-agent-dotnet

Why Multi-Agent Systems?

The single-agent pattern I showed last week works great for straightforward tasks. But real-world AI applications often need specialized expertise across different domains. That's where multi-agent systems shine.

The Travel Planning Challenge

Imagine planning a trip to Tokyo. You need:

- Currency expertise for budget conversion and exchange rates
- Weather knowledge for packing recommendations and seasonal planning
- Local insights about customs, culture, and etiquette
- Itinerary skills to create day-by-day schedules
- Budget optimization to allocate funds across categories
- Coordination to assemble everything into a cohesive plan

With a single agent handling all of this, you get a "jack of all trades, master of none" situation. The prompts become complex, the agent loses focus, and results can be inconsistent.

Enter Multi-Agent Workflows

Instead of one generalist agent, we can create 6 or more specialized agents, each with a focused responsibility:

1. Currency Converter Agent - Real-time exchange rates (Frankfurter API integration)
2. Weather Advisor Agent - Forecasts and packing tips (National Weather Service API)
3. Local Knowledge Agent - Cultural insights and customs
4. Itinerary Planner Agent - Day-by-day activity scheduling
5. Budget Optimizer Agent - Cost allocation and optimization
6. Coordinator Agent - Final assembly and formatting

Each agent has:

🎯 Clear, focused instructions specific to its domain
🛠️ Specialized tools (weather API, currency API)
📊 Defined inputs and outputs for predictable collaboration
✅ Testable behavior that's easy to validate

You could extend this even further: add more agents, or give your specialist agents deeper knowledge by connecting additional tools and MCP servers. The possibilities are endless, and I hope this post inspires you to start thinking about what you can build and achieve.

What Makes This Possible?
Microsoft Agent Framework

All of this is powered by Microsoft Agent Framework—a comprehensive platform for building, deploying, and managing AI agents that goes far beyond simple chat completions.

Understanding Agent Framework vs. Other Approaches

Before diving into the details, it's important to understand what Agent Framework is. Unlike frameworks such as Semantic Kernel, where you orchestrate AI behavior entirely in your application code with direct API calls, Agent Framework provides a unified abstraction for working with AI agents across multiple backend types.

Agent Framework supports several agent types (see documentation):

- Simple agents based on inference services - Agents built on any IChatClient implementation, including:
  - Azure OpenAI ChatCompletion
  - Azure AI Foundry Models ChatCompletion
  - OpenAI ChatCompletion and Responses
  - Any other Microsoft.Extensions.AI.IChatClient implementation
- Server-side managed agents - Agents that live as Azure resources:
  - Azure AI Foundry Agent (used in this sample)
  - OpenAI Assistants
- Custom agents - Fully custom implementations of the AIAgent base class
- Proxy agents - Connections to remote agents via protocols like A2A

In this sample, we use Azure AI Foundry Agents—the server-side managed agent type. When you use these Foundry agents:

- Agents are Azure resources - They exist server-side in Azure AI Foundry, not just as code patterns
- Execution happens on Foundry - Agent runs execute on Azure's infrastructure with built-in state management
- You get structured primitives - Agents, Threads, and Runs are first-class concepts with their own lifecycles
- Server-side persistence - Conversation history and context are managed by the platform

This server-side approach is convenient because the platform manages state and execution for you. However, other agent types (like ChatCompletion-based agents) give you more control over orchestration while still benefiting from the unified Agent Framework programming model.

In my next blog post, I'll demonstrate an alternative approach using a different agent type—likely the Azure OpenAI ChatCompletion agent type—which doesn't create server-side Foundry resources. Instead, you orchestrate the agent behavior yourself while still benefiting from the Agent Framework's unified programming model.

If you're new to Agent Framework, here's what makes it special:

🔄 Persistent Agents: Server-side agents that maintain context across multiple interactions, not just one-off API calls
💬 Conversation Threads: Organized conversation history and state management that persists across agent runs
🎯 Agent Runs: Structured execution with progress tracking and lifecycle management—you can monitor exactly what your agents are doing
🔁 Multi-Turn Interactions: Complex workflows with iterative AI processing, where agents can refine and improve their outputs
🛠️ Tool Integration: Extensible function calling and integration capabilities—agents can call external APIs, execute code, and interact with real-world systems

In our sample, Agent Framework handles:

- Creating and managing 6 specialized agents programmatically
- Maintaining conversation context as agents collaborate
- Tracking execution progress across workflow phases
- Managing agent lifecycle (creation, execution, cleanup)
- Integrating external APIs (weather, currency) seamlessly
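To show the Agent/Thread/Run primitives in isolation, here is a minimal sketch of a single server-side agent run. It mirrors the style of the agentsClient snippet shown later in this post; the message text, polling interval, and exact method and status names are simplified assumptions, so consult the SDK documentation for precise signatures:

// A minimal sketch of the server-side primitives: create an agent,
// start a thread, post a message, run, and poll until the run completes.
// Method shapes are illustrative and follow this post's later snippet.
var agent = await agentsClient.CreateAgentAsync(
    "gpt-4o",
    instructions: "You are a travel planning expert...");

var thread = await agentsClient.CreateThreadAsync();

// The user's request becomes a message on the server-side thread
await agentsClient.CreateMessageAsync(thread.Id, MessageRole.User, "Plan a 5-day trip to Tokyo.");

var run = await agentsClient.CreateRunAsync(thread.Id, agent.Id);

// Execution happens on Foundry; the client only polls for completion
while (run.Status == RunStatus.Queued || run.Status == RunStatus.InProgress)
{
    await Task.Delay(500);
    run = await agentsClient.GetRunAsync(thread.Id, run.Id);
}

Notice what is absent: no chat history list, no orchestration loop; the thread and run live on the platform.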
The beauty of Agent Framework is that it makes complex multi-agent orchestration feel natural. You focus on defining what your agents should do, and the framework handles the infrastructure, state management, and execution—all running on Azure AI Foundry with enterprise-grade reliability.

The Multi-Agent Workflow

Here's how these agents collaborate to create a comprehensive travel plan in the sample I put together:

Execution Phases

Phase 1: Parallel Information Gathering (10-40%)
- Currency, Weather, and Local Knowledge agents execute simultaneously
- No dependencies = maximum performance
- Results stored in workflow state for downstream agents

Phase 2: Itinerary Planning (40-70%)
- Itinerary Planner uses context from all Phase 1 agents
- Weather data influences activity recommendations
- Local knowledge shapes cultural experiences
- Currency conversion informs budget-conscious choices

Phase 3: Budget Optimization (70-90%)
- Budget Optimizer analyzes the proposed itinerary
- Allocates funds across categories (lodging, food, activities, transport)
- Provides cost-saving tips without compromising the experience

Phase 4: Final Assembly (90-100%)
- Coordinator compiles all agent outputs
- Formats a comprehensive travel plan with tips
- Returns a structured, user-friendly itinerary

Benefits of This Architecture

✅ Faster Execution: Parallel agents complete in ~30% less time
✅ Better Quality: Specialized agents produce more focused, accurate results
✅ Easy Debugging: Each agent's contribution is isolated and traceable
✅ Maintainable: Update one agent without affecting others
✅ Scalable: Add new agents (flight booking, hotel search) without refactoring
✅ Testable: Validate each agent independently with unit tests

The Complete Architecture

Here's how everything fits together on Azure App Service. This architecture builds on the async request-reply pattern from our previous post, adding:

✅ Multi-agent orchestration in the background worker
✅ Parallel execution of independent agents for performance
✅ Code-generated agents for production-ready DevOps
✅ External API integration (weather, currency) for real-world data
✅ Progress tracking across workflow phases (10% → 40% → 70% → 100%; a sketch follows below)
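Here is a minimal sketch of how that phase-based progress tracking could be persisted so the API's status endpoint always reflects the current phase. The TaskState type, container names, and percentages-as-strings mapping are assumptions for this sketch, not the sample's exact code:

// A minimal sketch of progress tracking backed by Cosmos DB.
// Assumes the record's Id maps to the Cosmos "id" field via serialization.
using Microsoft.Azure.Cosmos;

public record TaskState(string Id, string Status, int ProgressPercent);

public class ProgressTracker
{
    private readonly Container _container;

    public ProgressTracker(CosmosClient client) =>
        _container = client.GetContainer("travel-planner", "task-state");

    public Task ReportAsync(string taskId, string phase, int percent, CancellationToken ct = default) =>
        // Upsert so pollers always read the latest phase and percentage
        _container.UpsertItemAsync(
            new TaskState(taskId, phase, percent),
            new PartitionKey(taskId),
            cancellationToken: ct);
}

// Usage inside the workflow, between phases:
// await tracker.ReportAsync(taskId, "Gathering information", 40, ct);
// await tracker.ReportAsync(taskId, "Planning itinerary", 70, ct);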
Get Started Today

Ready to build your own multi-agent workflows on Azure App Service? Try out the sample today!

🔗 GitHub Repository: https://github.com/Azure-Samples/app-service-maf-workflow-travel-agent-dotnet

The repository includes:

✅ Complete .NET 9 source code with 6 specialized agents
✅ Infrastructure as Code (Bicep) for one-command deployment
✅ Complete web UI with real-time progress tracking
✅ Comprehensive README with architecture documentation
✅ External API integrations (weather, currency)

Deploy in Minutes

git clone https://github.com/Azure-Samples/app-service-maf-workflow-travel-agent-dotnet.git
cd app-service-maf-workflow-travel-agent-dotnet
azd auth login
azd up

The azd up command provisions:

- Azure App Service (P0v4 Premium)
- Azure Service Bus (message queue for async processing)
- Azure Cosmos DB (state storage with 24-hour TTL)
- Azure AI Foundry (AI Services + Project for Agent Framework)
- GPT-4o model deployment (GlobalStandard 50K TPM)

Then manually deploy the WebJob following the README instructions.

What's Next?

Extend This Pattern

This sample demonstrates production-ready patterns you can extend:

🛠️ Add More Specialized Agents
- Flight Expert Agent - Search and compare flight prices
- Hotel Specialist Agent - Find accommodations based on preferences
- Activity Planner Agent - Book tours, restaurants, events
- Transportation Agent - Plan routes, transit passes, car rentals

🤝 Implement Agent-to-Agent Communication
- Agents negotiate conflicting recommendations
- Hierarchical structures with supervisor agents
- Voting mechanisms for decision-making

🧠 Add Advanced Capabilities
- RAG (Retrieval Augmented Generation) for destination-specific knowledge bases
- Memory to remember user preferences across trips
- Vision models to analyze travel photos and recommend similar destinations
- Multi-language support for international travelers

📊 Production Enhancements
- Authentication - Microsoft Entra ID for user identity
- Application Insights - Distributed tracing and custom metrics
- VNet Integration - Private endpoints for security
- Auto-Scaling - Scale workers based on queue depth
- Webhooks - Notify users when their travel plan is ready

Key Takeaways

✅ Multi-agent systems provide specialized expertise and better results than single generalist agents
✅ Azure App Service provides a simple, reliable platform for long-running multi-agent workflows
✅ Async request-reply pattern with Service Bus + Cosmos DB ensures scalability and resilience
✅ External API integration makes agents more useful with real-world data
✅ Parallel execution of independent agents dramatically improves performance

Whether you're building travel planners, document processors, research assistants, or other AI-powered applications, multi-agent workflows on Azure App Service give you the flexibility and sophistication you need.

Learn More

- Microsoft Agent Framework Documentation - Complete guide to Agent Framework
- Original Blog Post - Single-agent async patterns on App Service
- Azure App Service Best Practices - Production deployment patterns
- Async Request-Reply Pattern - Architecture guidance
- Azure App Service WebJobs - Background processing documentation

We Want to Hear From You!

Thanks again to Michel_Schep for the great question that inspired this follow-up sample! Have you built multi-agent systems with Agent Framework? Are you using Azure App Service to host your AI and intelligent apps? We'd love to hear about your experience in the comments below. Questions about multi-agent workflows on App Service? Drop a comment and our team will help you get started. Happy building! 🚀

How to respond to upgrade-assistant cli prompts?
Bear with me please, I have not touched this stuff in a few years. I'm trying to upgrade a Xamarin project to .NET MAUI using the CLI (the latest version of VS 2022 cannot find the Upgrade Assistant extension). I have downloaded the upgrade-assistant tool, but I am not familiar with how to reply or make a selection from what is displayed after I enter "upgrade-assistant upgrade project-name" (it does not execute and instead displays information...see attached image). Is what is being displayed telling me that I need to Ctrl+C out of this and re-enter the command using what they are showing? Please give me an example of using the side-by-side upgrade. Is it also telling me that I need to do something with NuGet on the command line? Thanks.

Capture .NET Profiler Trace on the Azure App Service platform
Summary

The article provides guidance on using the .NET Profiler Trace feature in Microsoft Azure App Service to diagnose performance issues in ASP.NET applications. It explains how to configure and collect the trace by accessing the Azure portal, navigating to the Azure App Service, and selecting the "Collect .NET Profiler Trace" feature. Users can choose between "Collect and Analyze Data" or "Collect Data only" and must select the instance to perform the trace on. The trace stops after 60 seconds but can be extended up to 15 minutes. After analysis, users can view the report online or download the trace file for local analysis, which includes information like slow requests and CPU stacks. The article also details how to analyze the trace using PerfView, a tool available on GitHub, to identify performance issues. Additionally, it provides a table outlining scenarios for using a .NET Profiler Trace or a memory dump based on factors like issue type and symptom code. This tool is particularly useful for diagnosing slow or hung ASP.NET applications and is available only on Standard or higher SKUs with the Always On setting enabled.

In this article:

- How to configure and collect the .NET Profiler Trace
- How to download the .NET Profiler Trace
- How to analyze a .NET Profiler Trace
- When to use .NET Profiler tracing vs. a memory dump

The tool is exceptionally well suited for scenarios where an ASP.NET application performs slower than expected or hangs. As shown in Figure 1, this feature is available only on a Standard or higher Stock Keeping Unit (SKU) with Always On enabled. If you try to configure a .NET Profiler Trace without both of these configurations, the following messages are rendered.

Azure App Service Diagnose and solve problems blade in the Azure Portal error messages

- Error – This tool is supported only on Standard, Premium, and Isolated Stock Keeping Unit (SKU) only with AlwaysOn setting enabled to TRUE.
- Error – We determined that the web app is not "Always-On" enabled and diagnostic does not work reliably with Auto Heal. Turn on the Always-On setting by going to the Application Settings for the web app and then run these tools.

How to configure and collect the .NET Profiler Trace

To configure a .NET Profiler Trace, access the Azure portal and navigate to the Azure App Service which is experiencing a performance issue. Select Diagnose and solve problems and then the Diagnostic Tools tile.

Azure App Service Diagnose and solve problems blade in the Azure Portal

Select the "Collect .NET Profiler Trace" feature on the Diagnostic Tools blade and the following blade is rendered. Notice that you can only select Collect and Analyze Data or Collect Data only. Choose the one you prefer, but do consider having the feature perform the analysis; you can still download the trace for offline analysis if necessary. Also notice that you need to select the instance on which you want to perform the trace. In this scenario there is only one, so the selection is simple. However, if your app runs on multiple instances, either select them all or, if you have identified a specific instance which is behaving slowly, select only that one. You get the best results if you can isolate a single instance enough that the request you send is the only one received on that instance. However, in a scenario where the request or instance is not known, the trace still adds value and insights. Adding a thread report additionally collects a list of all the threads in the process at the end of the profiler trace.
The thread report is especially useful if you are troubleshooting hung processes, deadlocks, or requests taking more than 60 seconds. Generating it pauses your process for a few seconds until the thread dump is complete. CAUTION: a thread report is NOT recommended if your application is experiencing high CPU; you may run into issues during trace analysis when CPU consumption is high.

Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace blade in the Azure Portal

There are a few points called out in the previous image which are important to read and consider. Specifically, the .NET Profiler Trace stops 60 seconds after it is started. Therefore, if you can reproduce the issue, have the reproduction steps ready before you start the profiling. If you are not able to reproduce the issue, you may need to run the trace a few times until the slowness or hang occurs. The collection time can be increased up to 15 minutes (900 seconds) by adding an application setting named IIS_PROFILING_TIMEOUT_IN_SECONDS with a value of up to 900, as shown below.
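For example, the setting can be added with the Azure CLI using a standard app-settings update. The placeholders below are yours to fill in; the setting name comes from this article:

# Hedged example: raise the profiler timeout to 900 seconds.
# Replace <app-name> and <resource-group> with your values.
az webapp config appsettings set \
  --name <app-name> \
  --resource-group <resource-group> \
  --settings IIS_PROFILING_TIMEOUT_IN_SECONDS=900

The same setting can also be added in the portal under the app's environment variables / application settings.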
After selecting the instance to perform the trace on, press the Collect Profiler Trace button, wait for the profiler to start as seen here, then reproduce the issue or wait for it to occur.

Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace status starting window

After the issue is reproduced, the .NET Profiler Trace continues to the next step of stopping, as seen here.

Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace status stopping window

Once stopped, the process continues to the analysis phase if you selected the Collect and Analyze Data option, as seen in the following image; otherwise, you are provided a link to download the file for analysis on your local machine. The analysis can take some time, so be patient.

Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace status analyzing window

After the analysis is complete, you can either view the analysis online or download the trace file for local analysis.

How to download the .NET Profiler Trace

Once the analysis is complete, you can view the report by selecting the link in the Reports column, as seen here.

Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace status complete window

Clicking on the report, you see the following. There is some useful information in this report, like a list of slow requests, failed requests, thread call stacks, and CPU stacks. Also shown is a breakdown of where the time was spent during response generation, in categories like Application Code, Platform, and Network. In this case, all the time is spent in the application code.

Azure App Service Diagnose and solve problems, Collect .NET Profiler Trace review the Report

To find out where specifically in the application code this request spends its time, analyze the trace locally.

How to analyze a .NET Profiler Trace

After downloading the trace by selecting the link in the Data column, you can use a tool named PerfView, which is downloadable on GitHub here. Begin by opening PerfView and double-clicking on the ".DIAGSESSION" file; after some moments, expand it to render the Event Trace Log (ETL) file, as shown here.

Analyze Azure App Service .NET Profiler Trace with Perf View

Double-click on the Thread Time (with StartStop Activities) Stacks node, which opens a new window similar to the one shown next. If your App Service is configured as out-of-process, select the dotnet process which is associated with your app code. If your App Service runs in-process, select the w3wp process.

Analyze Azure App Service .NET Profiler Trace with Perf View, dotnet out-of-process

Double-click on dotnet and another window is rendered, as shown here. From the earlier report review it is clear where the slowness is coming from; find it in the Name column or search for it by entering the page name into the Find text box.

Analyze Azure App Service .NET Profiler Trace with Perf View, dotnet out-of-process, method, and class discovery

Once found, right-click on the row and select Drill Into from the pop-up menu, shown here. Select the Call Tree tab and the reason for the issue is rendered, showing which request was performing slowly.

Analyze Azure App Service .NET Profiler Trace with Perf View, dotnet out-of-process, root cause

This example is relatively simple. As you analyze more performance issues using PerfView to examine .NET Profiler Traces, your ability to find the root cause of more complicated performance issues improves.

When to use .NET Profiler tracing vs. a memory dump

The same issue can be seen in a memory dump; however, there are scenarios where a .NET Profiler Trace is the better choice. Table 1 describes scenarios for when to capture a .NET Profiler Trace or a memory dump.

| Issue Type | Symptom Code | Symptom | Stack | Startup Issue | Intermittent | Scenario |
|---|---|---|---|---|---|---|
| Performance | 200 | Requests take 500 ms to 2.5 seconds, or take <= 60 seconds | ASP.NET/ASP.NET Core | No | No | Profiler |
| Performance | 200 | Requests take > 60 seconds & < 230 seconds | ASP.NET/ASP.NET Core | No | No | Dump |
| Performance | 502.3/500.121/503 | Requests take >= 120 to <= 230 seconds | ASP.NET | No | No | Dump, Profiler |
| Performance | 502.3/500.121/503 | Requests timing out >= 230 | ASP.NET/ASP.NET Core | Yes/No | Yes/No | Dump |
| Performance | 502.3/500.121/503 | App hangs or deadlocks (ex: due to async anti-pattern) | ASP.NET/ASP.NET Core | Yes/No | Yes/No | Dump |
| Performance | 502.3/500.121/503 | App hangs on startup (ex: caused by nonasync deadlock issue) | ASP.NET/ASP.NET Core | No | Yes/No | Dump |
| Performance | 502.3/500.121 | Request timing out >= 230 (time out) | ASP.NET/ASP.NET Core | No | No | Dump |
| Availability | 502.3/500.121/503 | High CPU causing app downtime | ASP.NET | No | No | Profiler, Dump |
| Availability | 502.3/500.121/503 | High Memory causing app downtime | ASP.NET/ASP.NET Core | No | No | Dump |
| Availability | 500.0[121]/503 | SQLException or some exception causes app downtime | ASP.NET | No | No | Dump, Profiler |
| Availability | 500.0[121]/503 | App crashing due to fatal exception at native layer | ASP.NET/ASP.NET Core | Yes/No | Yes/No | Dump |
| Availability | 500.0[121]/503 | App crashing due to exit code (ex: 0xC0000374) | ASP.NET/ASP.NET Core | Yes/No | Yes/No | Dump |
| Availability | 500.0 | App throwing nonfatal exceptions (in the context of a request) | ASP.NET | No | No | Profiler, Dump |
| Availability | 500.0 | App throwing nonfatal exceptions (in the context of a request) | ASP.NET/ASP.NET Core | No | Yes/No | Dump |

Table 1, when to capture a .NET Profiler Trace or a memory dump on Azure App Service, Diagnose and solve problems

Use this list as a guide to help decide how to approach solving performance and availability problems occurring in your application source code. Here are descriptions of the column headings:

- Issue Type – Performance means that a request to the app responds or processes the response, but not at the expected speed.
  Availability means that the request fails or consumes more resources than expected.
- Symptom Code – the HTTP status and/or substatus returned by the request.
- Symptom – a description of the behavior experienced while engaging with the application.
- Stack – this table targets .NET, specifically ASP.NET and ASP.NET Core applications.
- Startup Issue – "No" means the issue does not occur at startup, so the scenario can be used as-is; "Yes/No" means the scenario is also useful for troubleshooting startup issues.
- Intermittent – "No" means the issue is not intermittent, or can be reproduced; "Yes/No" means the scenario is useful when the issue happens randomly or cannot be reproduced, since the tool can be set to trigger on a specific event or left running for a specific amount of time until the exception happens.
- Scenario – "Profiler" means that collecting a .NET Profiler Trace is recommended; "Dump" means that a memory dump is your best option. If both are listed, both can be useful when the given symptoms and status codes are present.

You might find the videos in Table 2 useful; they show how to collect and analyze a memory dump or a .NET Profiler Trace.

| Product | Stack | Hosting | Symptom | Capture | Analyze | Scenario |
|---|---|---|---|---|---|---|
| App Service | Windows | in | High CPU | link | link | Dump |
| App Service | Windows | in | High Memory | link | link | Dump |
| App Service | Windows | in | Terminate | link | link | Dump |
| App Service | Windows | in | Hang | link | link | Dump |
| App Service | Windows | out | High CPU | link | link | Dump |
| App Service | Windows | out | High Memory | link | link | Dump |
| App Service | Windows | out | Terminate | link | link | Dump |
| App Service | Windows | out | Hang | link | link | Dump |
| Function App | Windows | in | High CPU | link | link | Dump |
| Function App | Windows | in | High Memory | link | link | Dump |
| Function App | Windows | in | Terminate | link | link | Dump |
| Function App | Windows | in | Hang | link | link | Dump |
| Function App | Windows | out | High CPU | link | link | Dump |
| Function App | Windows | out | High Memory | link | link | Dump |
| Function App | Windows | out | Terminate | link | link | Dump |
| Function App | Windows | out | Hang | link | link | Dump |
| Azure WebJob | Windows | in | High CPU | link | link | Dump |
| App Service | Windows | in | High CPU | link | link | .NET Profiler |
| App Service | Windows | in | Hang | link | link | .NET Profiler |
| App Service | Windows | in | Exception | link | link | .NET Profiler |
| App Service | Windows | out | High CPU | link | link | .NET Profiler |
| App Service | Windows | out | Hang | link | link | .NET Profiler |
| App Service | Windows | out | Exception | link | link | .NET Profiler |

Table 2, short video instructions on capturing and analyzing dumps and profiler traces

Here are a few other helpful videos for troubleshooting Azure App Service availability and performance issues:

- View Application EventLogs Azure App Service
- Add Application Insights To Azure App Service

Prior to capturing and analyzing memory dumps, consider viewing this short video: Setting up WinDbg to analyze managed-code memory dumps, and this blog post titled: Capture memory dumps on the Azure App Service platform.

Question & Answers

- Q: What are the prerequisites for using the .NET Profiler Trace feature in Azure App Service?
  A: To use the .NET Profiler Trace feature in Azure App Service, the application must be running on a Standard or higher Stock Keeping Unit (SKU) with the Always On setting enabled. If these conditions are not met, the tool will not function, and error messages will be displayed indicating the need for these configurations.
- Q: How can you extend the default collection time for a .NET Profiler Trace beyond 60 seconds?
  A: The default collection time for a .NET Profiler Trace is 60 seconds, but it can be extended up to 15 minutes (900 seconds) by adding an application setting named IIS_PROFILING_TIMEOUT_IN_SECONDS with a value of up to 900. This allows a longer window to capture the data needed for analysis.

- Q: When should you use a .NET Profiler Trace instead of a memory dump for diagnosing performance issues in an ASP.NET application?
  A: A .NET Profiler Trace is recommended for diagnosing performance issues where requests take between 500 milliseconds and 2.5 seconds, or less than 60 seconds. It is also useful for identifying high CPU usage causing app downtime. In contrast, a memory dump is more suitable for scenarios where requests take longer than 60 seconds, the application hangs or deadlocks, or there are issues related to high memory usage or app crashes due to fatal exceptions.

Keywords

Microsoft Azure, Azure App Service, .NET Profiler Trace, ASP.NET performance, Azure debugging tools, .NET performance issues, Azure diagnostic tools, Collect .NET Profiler Trace, Analyze .NET Profiler Trace, Azure portal, Performance troubleshooting, ASP.NET application, Slow ASP.NET app, Azure Standard SKU, Always On setting, Memory dump vs profiler trace, PerfView analysis, Azure performance diagnostics, .NET application profiling, Diagnose ASP.NET slowness, Azure app performance, High CPU usage ASP.NET, Azure app diagnostics, .NET Profiler configuration, Azure app service performance