Building reliable AI systems requires modular, stateful coordination and deterministic workflows that enable agents to collaborate seamlessly. The Microsoft Agent Framework provides these foundations, with memory, tracing, and orchestration built in.
This implementation demonstrates four multi-agentic patterns — Single Agent, Handoff, Reflection, and Magentic Orchestration — showcasing different interaction models and collaboration strategies. From lightweight domain routing to collaborative planning and self-reflection, these patterns highlight the framework’s flexibility.
At the core is the Model Context Protocol (MCP), which connects agents, tools, and memory through a shared context interface. Persistent session state, conversation thread history, and checkpoints are stored in Cosmos DB when it is configured, with an in-memory dictionary as the default fallback. This setup enables dynamic pattern swapping, performance comparison, and traceable multi-agent interactions, all within a unified, modular runtime.
Business Scenario: Contoso Customer Support Chatbot
Contoso’s chatbot handles multi-domain customer inquiries like billing anomalies, promotion eligibility, account locks, and data usage questions. These require combining structured data (billing, CRM, security logs, promotions) with unstructured policy documents processed via vector embeddings.
Using MCP, the system orchestrates tool calls to fetch real-time structured data and relevant policy content, ensuring policy-aligned, auditable responses without exposing raw databases. This enables the assistant to explain anomalies, recommend actions, confirm eligibility, guide account recovery, and surface risk indicators—reducing handle time and improving first-contact resolution while supporting richer multi-agent reasoning.
Architecture & Core Concepts
The Contoso chatbot leverages the Microsoft Agent Framework to deliver a modular, stateful, and workflow-driven architecture. At its core, the system consists of:
- Base Agent: All agent patterns—single agent, reflection, handoff and magentic orchestration—inherit from a common base class, ensuring consistent interfaces for message handling, tool invocation, and state management.
- Backend: A FastAPI backend manages session routing, agent execution, and workflow orchestration.
- Frontend: A React-based UI (or Streamlit alternative) streams responses in real time and visualizes agent reasoning and tool calls.
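To make the idea of a shared base class concrete, here is a minimal sketch. The names `BaseAgent`, `chat_async`, `history`, `record_turn`, and `EchoAgent` are assumptions made for illustration and are not taken from the repository; the point is only that every pattern implements one entry point while inheriting the same state handling.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class BaseAgent(ABC):
    """Hypothetical shared base class; names are illustrative only."""

    def __init__(self, state_store: Dict[str, Any], session_id: str,
                 mcp_server_uri: Optional[str] = None) -> None:
        self.state_store = state_store        # dict-like session state
        self.session_id = session_id
        self.mcp_server_uri = mcp_server_uri  # None means "run without tools"

    def history(self) -> List[Dict[str, str]]:
        """Return (and lazily create) this session's conversation turns."""
        return self.state_store.setdefault(self.session_id, [])

    def record_turn(self, prompt: str, reply: str) -> None:
        """Append one user/assistant exchange to the session history."""
        self.history().append({"user": prompt, "assistant": reply})

    @abstractmethod
    async def chat_async(self, prompt: str) -> str:
        """Each pattern supplies its own coordination logic here."""


class EchoAgent(BaseAgent):
    """Trivial pattern used only to exercise the shared interface."""

    async def chat_async(self, prompt: str) -> str:
        reply = f"echo: {prompt}"
        self.record_turn(prompt, reply)
        return reply


# Usage: the backend only needs the common interface, not the concrete pattern.
store: Dict[str, Any] = {}
reply = asyncio.run(EchoAgent(store, "session-1").chat_async("hello"))
```

Because the backend depends only on the abstract interface, swapping `EchoAgent` for a reflection or handoff implementation would not change the calling code.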
Modular Runtime and Pattern Swapping
One of the most powerful aspects of this implementation is its modular runtime design. Each agentic pattern (Single, Reflection, Handoff, and Magentic) plugs into a shared execution pipeline defined by the base agent and MCP integration. By updating the .env configuration (e.g., AGENT_MODULE=handoff), developers can swap entire coordination strategies in and out without touching the backend, frontend, or memory layers.
This makes it easy to compare agent styles side by side, benchmark reasoning behaviors, and experiment with orchestration logic—all while maintaining a consistent, deterministic runtime. The same MCP connectors, FastAPI backend, and Cosmos/in-memory state management work seamlessly across every pattern, enabling rapid iteration and reliable evaluation.
# Dynamic agent pattern loading
import os
from typing import Dict, List

from agent_framework import MCPStreamableHTTPTool

agent_module_path = os.getenv("AGENT_MODULE")
agent_module = __import__(agent_module_path, fromlist=["Agent"])
Agent = getattr(agent_module, "Agent")

# Common MCP setup across all patterns (method excerpt; `self` is the agent instance)
async def _create_tools(self, headers: Dict[str, str]) -> List[MCPStreamableHTTPTool] | None:
    # Without an MCP server URI the agent runs with no tools.
    if not self.mcp_server_uri:
        return None
    return [MCPStreamableHTTPTool(
        name="mcp-streamable",
        url=self.mcp_server_uri,
        headers=headers,
        timeout=30,
        request_timeout=30,
    )]
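The loading step above fails with an opaque error if `AGENT_MODULE` is unset or the module lacks an `Agent` class. The following is a minimal sketch of a more defensive loader, not the repository's code: the helper name `load_agent_class` and the stand-in module `demo_handoff_pattern` are hypothetical and exist only so the example runs on its own.

```python
import importlib
import os
import sys
import types
from typing import Optional


def load_agent_class(env_var: str = "AGENT_MODULE",
                     default: Optional[str] = None) -> type:
    """Import the module named by env_var and return its Agent class."""
    module_path = os.getenv(env_var, default)
    if not module_path:
        raise RuntimeError(f"{env_var} is not set and no default was given")
    module = importlib.import_module(module_path)
    agent_cls = getattr(module, "Agent", None)
    if agent_cls is None:
        raise RuntimeError(f"{module_path!r} does not define an 'Agent' class")
    return agent_cls


# Demo only: register a stand-in module so the sketch runs without the repo.
demo_module = types.ModuleType("demo_handoff_pattern")
demo_module.Agent = type("Agent", (), {})
sys.modules["demo_handoff_pattern"] = demo_module

os.environ["AGENT_MODULE"] = "demo_handoff_pattern"
Agent = load_agent_class()
```

Using `importlib.import_module` is equivalent in effect to the `__import__(..., fromlist=[...])` call for dotted paths, and the explicit checks turn configuration mistakes into readable errors.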
Memory & State Management
State management is critical for multi-turn conversations and cross-agent workflows. The system supports two out-of-the-box options:
- Persistent Storage (Cosmos DB)
- Acts as the durable, enterprise-ready backend.
- Stores serialized conversation threads and workflow checkpoints keyed by tenant and session ID.
- Ensures data durability and auditability across restarts.
- In-Memory Session Store
- Default fallback when Cosmos DB credentials are not configured.
- Maintains ephemeral state per session for fast prototyping or lightweight use cases.
All patterns leverage the same thread-based state abstraction, enabling:
- Session isolation: Each user session maintains its own state and history.
- Checkpointing: Multi-agent workflows can snapshot shared and executor-local state at any point, supporting pause/resume and fault recovery.
- Model Context Protocol (MCP): Acts as the connector between agents and tools, standardizing how data is fetched and results are returned to agents, whether querying structured databases or unstructured knowledge sources.
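The fallback behavior and session isolation described above can be sketched as follows. This is an illustration under stated assumptions: `StateStore`, `InMemoryStateStore`, and `create_state_store` are hypothetical names, the Cosmos DB branch is deliberately left unimplemented, and the `(tenant_id, session_id)` key mirrors the keying described in the text.

```python
import copy
from typing import Any, Dict, Optional, Protocol, Tuple

State = Dict[str, Any]


class StateStore(Protocol):
    """Interface both the durable and the in-memory store would satisfy."""

    def load(self, tenant_id: str, session_id: str) -> Optional[State]: ...
    def save(self, tenant_id: str, session_id: str, state: State) -> None: ...


class InMemoryStateStore:
    """Ephemeral store keyed by (tenant, session); lost on restart."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, str], State] = {}

    def load(self, tenant_id: str, session_id: str) -> Optional[State]:
        state = self._data.get((tenant_id, session_id))
        # Return a copy so callers cannot mutate the stored checkpoint.
        return copy.deepcopy(state) if state is not None else None

    def save(self, tenant_id: str, session_id: str, state: State) -> None:
        self._data[(tenant_id, session_id)] = copy.deepcopy(state)


def create_state_store(cosmos_endpoint: Optional[str] = None) -> StateStore:
    """Pick the durable store when configured, else fall back to memory."""
    if cosmos_endpoint:
        raise NotImplementedError("A Cosmos DB-backed store is outside this sketch")
    return InMemoryStateStore()


store = create_state_store()
store.save("contoso", "session-1", {"thread": ["hello"], "checkpoint": {"step": 1}})
snapshot = store.load("contoso", "session-1")
snapshot["thread"].append("local edit")  # does not affect the saved checkpoint
```

Copying on both save and load is one simple way to make a checkpoint behave like a snapshot: a workflow can resume from it later without seeing changes made after it was taken.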
Core Principles
Across all patterns, the framework emphasizes:
- Modularity: Components are interchangeable—agents, tools, and state stores can be swapped without disrupting the system.
- Stateful Coordination: Multi-agent workflows coordinate through shared and local state, enabling complex reasoning without losing context.
- Deterministic Workflows: While agents operate autonomously, the workflow layer ensures predictable, auditable execution of multi-agent tasks.
- Unified Execution: From single-agent Q&A to complex Magentic orchestrations, every agent follows the same execution lifecycle and integrates seamlessly with MCP and the state store.
Multi-Agent Patterns: Workflow and Coordination
With the architecture and core concepts established, we can now explore the agentic patterns implemented in the Contoso chatbot. Each pattern builds on the base agent and MCP integration but differs in how agents orchestrate tasks and communicate with one another to handle multi-domain customer queries.
In the sections that follow, we take a deeper dive into each pattern’s workflow and examine the under-the-hood communication flows between agents:
- Single Agent – A simple, single-domain agent handling straightforward queries.
- Reflection Agent – Allows agents to introspect and refine their outputs.
- Handoff Pattern – Routes conversations intelligently to specialized agents across domains.
- Magentic Orchestration – Coordinates multiple specialist agents for complex, parallel tasks.
For each pattern, the focus will be on how agents communicate and coordinate, showing the practical orchestration mechanisms in action.
Single Intelligent Agent
The Single Agent Pattern represents the simplest orchestration style within the framework. Here, a single autonomous agent handles all reasoning, decision-making, and tool interactions directly — without delegation or multi-agent coordination.
When a user submits a request, the single agent processes the query using all tools, memory, and data sources available through the Model Context Protocol (MCP). It performs retrieval, reasoning, and response composition in a single, cohesive loop.
Communication Flow:
- User Input → Agent: The user submits a question or command.
- Agent → MCP Tools: The agent invokes one or more tools (e.g., vector retrieval, structured queries, or API calls) to gather relevant context and data.
- Agent → User: The agent synthesizes the tool outputs, applies reasoning, and generates the final response to the user.
- Session Memory: Throughout the exchange, the agent stores conversation history and extracted entities in the configured memory store (in-memory or Cosmos DB).
Key Communication Principles:
- Single Responsibility: One agent performs both reasoning and action, ensuring fast response times and simpler state management.
- Direct Tool Invocation: The agent has direct access to all registered tools through MCP, enabling flexible retrieval and action chaining.
- Stateful Execution: The session memory preserves dialogue context, allowing the agent to maintain continuity across user turns.
- Deterministic Behavior: The workflow is fully predictable — input, reasoning, tool call, and output occur in a linear sequence.
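The linear flow above (input, tool calls, composition, memory update) can be sketched with plain functions. Everything here is a stand-in: in the real system a language model chooses tools and writes the answer, whereas `choose_tools` and `compose` below are hypothetical keyword-based stubs, and the tool names are invented for the example.

```python
from typing import Callable, Dict, List

Tool = Callable[[str], str]


def run_single_agent(prompt: str,
                     tools: Dict[str, Tool],
                     choose_tools: Callable[[str], List[str]],
                     compose: Callable[[str, Dict[str, str]], str],
                     history: List[Dict[str, str]]) -> str:
    """One turn: pick tools, call them, compose a reply, record the turn."""
    observations = {name: tools[name](prompt) for name in choose_tools(prompt)}
    reply = compose(prompt, observations)
    history.append({"user": prompt, "assistant": reply})
    return reply


# Stub tools standing in for MCP-exposed tools (hypothetical names).
tools: Dict[str, Tool] = {
    "billing_lookup": lambda query: "billing record found",
    "policy_search": lambda query: "policy excerpt found",
}


def choose_tools(prompt: str) -> List[str]:
    # Stand-in for the model's tool selection.
    if "bill" in prompt.lower():
        return ["billing_lookup", "policy_search"]
    return ["policy_search"]


def compose(prompt: str, observations: Dict[str, str]) -> str:
    # Stand-in for the model's final answer generation.
    return "Answer based on: " + ", ".join(sorted(observations))


history: List[Dict[str, str]] = []
reply = run_single_agent("Why is my bill higher?", tools, choose_tools, compose, history)
```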
Reflection Pattern
The Reflection Pattern introduces a lightweight, two-agent communication loop designed to improve the quality and reliability of responses through structured self-review.
In this setup, a Primary Agent first generates an initial response to the user’s query. This draft is then passed to a Reviewer Agent, whose role is to critique and refine the response—identifying gaps, inaccuracies, or missed context. Finally, the Primary Agent incorporates this feedback and produces a polished final answer for the user.
This process introduces one round of reflection and improvement without adding excessive latency, balancing quality with responsiveness.
Communication Flow:
- User Input → Primary Agent: The user submits a query.
- Primary Agent → Reviewer Agent: The primary generates an initial draft and passes it to the reviewer.
- Reviewer Agent → Primary Agent: The reviewer provides feedback or suggested improvements.
- Primary Agent → User: The primary revises its response and sends the refined version back to the user.
Key Communication Principles:
- Two-Stage Dialogue: Structured interaction between Primary and Reviewer ensures each output undergoes quality assurance.
- Focused Review: The Reviewer doesn’t recreate answers—it critiques and enhances, reducing redundancy.
- Stateful Context: Both agents operate over the same shared memory, ensuring consistency between draft and revision.
- Deterministic Flow: A single reflection round guarantees predictable latency while still improving answer quality.
- Transparent Traceability: Each step—initial draft, feedback, and final output—is logged, allowing developers to audit reasoning or assess quality improvements over time.
In practice, this pattern enables the system to reason about its own output before responding, yielding clearer, more accurate, and policy-aligned answers without requiring multiple independent retries.
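A single reflection round can be expressed as three calls with a trace entry after each. The sketch below is illustrative only: `reflect_once` and the three stub functions are hypothetical, and in practice each role would be a model-backed agent rather than a string-formatting function.

```python
from typing import Callable, List, Tuple

Trace = List[Tuple[str, str]]


def reflect_once(query: str,
                 draft_fn: Callable[[str], str],
                 review_fn: Callable[[str, str], str],
                 revise_fn: Callable[[str, str, str], str],
                 trace: Trace) -> str:
    """Draft, review, revise exactly once, logging every step."""
    draft = draft_fn(query)
    trace.append(("draft", draft))
    feedback = review_fn(query, draft)
    trace.append(("feedback", feedback))
    final = revise_fn(query, draft, feedback)
    trace.append(("final", final))
    return final


# Stubs standing in for the Primary and Reviewer agents.
def primary_draft(query: str) -> str:
    return f"Draft answer to: {query}"


def reviewer_feedback(query: str, draft: str) -> str:
    return "Cite the relevant policy."


def primary_revise(query: str, draft: str, feedback: str) -> str:
    return f"{draft} [revised after feedback: {feedback}]"


trace: Trace = []
final = reflect_once("Am I eligible for the promotion?",
                     primary_draft, reviewer_feedback, primary_revise, trace)
```

Keeping the loop to one fixed round is what bounds latency: the number of model calls per user turn is always three in this sketch.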
Handoff Pattern
When a user request arrives, the system first routes it through an Intent Classifier (or triage agent) to determine which domain specialist should handle the conversation. Once identified, control is handed off directly to that Specialist Agent, which uses its own tools, domain knowledge, and state context to respond.
This specialist continues to handle the user interaction as long as the conversation stays within its domain.
If the user’s intent shifts — for example, moving from billing to security — the conversation is routed back to the Intent Classifier, which re-assigns it to the correct specialist agent.
This pattern reduces latency and maintains continuity by minimizing unnecessary routing. Each handoff is tracked through the shared state store, ensuring seamless context carry-over and full traceability of decisions.
Key Communication Principles:
- Dynamic Routing: The Intent Classifier routes user input to the right specialist domain.
- Domain Persistence: The specialist remains active while the user stays within its domain.
- Context Continuity: Conversation history and entities persist across agents through the shared state store.
- Traceable Handoffs: Every routing decision is logged for observability and auditability.
- Low Latency: Responses are faster since domain-appropriate agents handle queries directly.
In practice, this means a user could begin a conversation about billing, continue seamlessly, and only be re-routed when switching topics — without losing any conversational context or history.
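The routing behavior described above, where the active specialist keeps the conversation until the intent shifts, can be sketched as below. The keyword matching is a stand-in for the Intent Classifier, and `Specialist`, `HandoffRouter`, and the domain keywords are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Specialist:
    name: str
    keywords: Tuple[str, ...]

    def in_domain(self, message: str) -> bool:
        text = message.lower()
        return any(keyword in text for keyword in self.keywords)

    def respond(self, message: str) -> str:
        return f"[{self.name}] handling: {message}"


@dataclass
class HandoffRouter:
    specialists: Dict[str, Specialist]
    default: str
    active: Optional[str] = None
    routing_log: List[str] = field(default_factory=list)

    def classify(self, message: str) -> Optional[str]:
        """Stand-in intent classifier: first specialist whose keywords match."""
        for name, specialist in self.specialists.items():
            if specialist.in_domain(message):
                return name
        return None

    def handle(self, message: str, history: List[Dict[str, str]]) -> str:
        stays = self.active is not None and self.specialists[self.active].in_domain(message)
        if not stays:
            # Re-triage; an ambiguous follow-up stays with the current specialist.
            target = self.classify(message) or self.active or self.default
            if target != self.active:
                self.routing_log.append(f"{self.active or 'triage'} -> {target}")
                self.active = target
        reply = self.specialists[self.active].respond(message)
        history.append({"user": message, "assistant": reply})
        return reply


router = HandoffRouter(
    specialists={
        "billing": Specialist("billing", ("bill", "invoice", "charge")),
        "security": Specialist("security", ("locked", "password", "login")),
    },
    default="billing",
)
history: List[Dict[str, str]] = []
first = router.handle("My invoice looks wrong", history)
second = router.handle("Can you explain it?", history)
third = router.handle("My account is locked", history)
```

The shared `history` list is passed through every call, which is how context carries over when the active specialist changes.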
Magentic Pattern
The Magentic Pattern is designed for open-ended, multi-faceted tasks that require multiple agents to collaborate. It introduces a Manager (Planner) Agent, which interprets the user’s goal, breaks it into subtasks, and orchestrates multiple Specialist Agents to execute those subtasks.
The Manager creates and maintains a Task Ledger, which tracks the status, dependencies, and results of each specialist’s work. As specialists perform their tool calls or reasoning, the Manager monitors their progress, gathers intermediate outputs, and can dynamically re-plan, dispatch additional tasks, or adjust the overall workflow.
When all subtasks are complete, the Manager synthesizes the combined results into a coherent final response for the user.
Key Communication Principles:
- Centralized Orchestration: The Manager coordinates all agent interactions and workflow logic.
- Parallel and Sequential Execution: Specialists can work simultaneously or in sequence based on task dependencies.
- Task Ledger: Acts as a transparent record of all task assignments, updates, and completions.
- Dynamic Re-planning: The Manager can modify or extend workflows in real time based on intermediate findings.
- Shared Memory: All agents access the same state store for consistent context and result sharing.
- Unified Output: The Manager consolidates results into one response, ensuring coherence across multi-agent reasoning.
In practice, Magentic orchestration enables complex reasoning where the system might combine insights from multiple agents — e.g., billing, product, and security — and present a unified recommendation or resolution to the user.
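A Task Ledger with dependencies can be sketched as a small scheduler. This is a simplified illustration, not the framework's Magentic implementation: `Task`, `run_ledger`, and the stub specialists are hypothetical, tasks in the same wave run sequentially here although they could run in parallel, and dynamic re-planning is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

SpecialistFn = Callable[[str, Dict[str, str]], str]


@dataclass
class Task:
    task_id: str
    specialist: str
    instruction: str
    depends_on: List[str] = field(default_factory=list)
    status: str = "pending"
    result: str = ""


def run_ledger(tasks: List[Task], specialists: Dict[str, SpecialistFn]) -> Dict[str, Task]:
    """Run tasks in dependency order, recording status and results."""
    ledger = {task.task_id: task for task in tasks}
    while any(task.status == "pending" for task in ledger.values()):
        # A task is ready once every task it depends on is done.
        ready = [task for task in ledger.values()
                 if task.status == "pending"
                 and all(ledger[dep].status == "done" for dep in task.depends_on)]
        if not ready:
            raise RuntimeError("Remaining tasks have unsatisfiable dependencies")
        for task in ready:  # independent of each other; could run concurrently
            upstream = {dep: ledger[dep].result for dep in task.depends_on}
            task.result = specialists[task.specialist](task.instruction, upstream)
            task.status = "done"
    return ledger


# Stub specialists standing in for model-backed agents.
specialists: Dict[str, SpecialistFn] = {
    "billing": lambda instruction, upstream: "billing checked",
    "security": lambda instruction, upstream: "security checked",
    "advisor": lambda instruction, upstream: f"recommendation from {sorted(upstream)}",
}

ledger = run_ledger(
    [
        Task("recommend", "advisor", "Combine findings", depends_on=["billing", "security"]),
        Task("billing", "billing", "Check recent charges"),
        Task("security", "security", "Check login attempts"),
    ],
    specialists,
)
```

The `recommend` task is listed first but runs last, because the ledger, not the list order, decides when a task is ready.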
Choosing the Right Agent for Your Use Case
Selecting the appropriate agent pattern hinges on the complexity of the task and the level of coordination required. As use cases evolve from straightforward queries to intricate, multi-step processes, the need for specialized orchestration increases. Below is a decision matrix to guide your choice:
| Feature / Requirement | Single Agent | Reflection Agent | Handoff Pattern | Magentic Orchestration |
|---|---|---|---|---|
| Handles simple, domain-bound tasks | ✔ | ✔ | ✖ | ✖ |
| Supports review / quality assurance | ✖ | ✔ | ✖ | ✔ |
| Multi-domain routing | ✖ | ✖ | ✔ | ✔ |
| Open-ended / complex workflows | ✖ | ✖ | ✖ | ✔ |
| Parallel agent collaboration | ✖ | ✖ | ✖ | ✔ |
| Direct tool access | ✔ | ✔ | ✔ | ✔ |
| Low latency / fast response | ✔ | ✔ | ✔ | ✖ |
| Easy to implement / low orchestration | ✔ | ✔ | ✖ | ✖ |
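For readers who prefer to query the matrix programmatically, the table can be encoded as a lookup. The sketch below copies the check marks from the matrix as written; the requirement keys and the function name `candidate_patterns` are hypothetical labels introduced for this example.

```python
from typing import Dict, FrozenSet, Iterable, List

PATTERNS = ["single", "reflection", "handoff", "magentic"]

# Each requirement maps to the patterns marked with a check in the matrix.
MATRIX: Dict[str, FrozenSet[str]] = {
    "simple_domain_bound": frozenset({"single", "reflection"}),
    "review": frozenset({"reflection", "magentic"}),
    "multi_domain_routing": frozenset({"handoff", "magentic"}),
    "open_ended": frozenset({"magentic"}),
    "parallel_collaboration": frozenset({"magentic"}),
    "direct_tool_access": frozenset(PATTERNS),
    "low_latency": frozenset({"single", "reflection", "handoff"}),
    "easy_to_implement": frozenset({"single", "reflection"}),
}


def candidate_patterns(requirements: Iterable[str]) -> List[str]:
    """Return the patterns that satisfy every listed requirement."""
    required = list(requirements)
    unknown = [name for name in required if name not in MATRIX]
    if unknown:
        raise ValueError(f"Unknown requirements: {unknown}")
    return [pattern for pattern in PATTERNS
            if all(pattern in MATRIX[name] for name in required)]


routing_and_fast = candidate_patterns(["multi_domain_routing", "low_latency"])
complex_and_fast = candidate_patterns(["open_ended", "low_latency"])
```

An empty result, as for `complex_and_fast`, signals a trade-off: according to the matrix, no single pattern offers both properties, so one requirement has to be relaxed.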
Dive Deeper: Explore, Build, and Innovate
We've explored various agent patterns, from Single Agent to Magentic Orchestration, each tailored to different use cases and complexities. To see these patterns in action, we invite you to explore our GitHub repo. Clone the repo, experiment with the examples, and adapt them to your own scenarios.
Beyond the patterns discussed here, the repository also features a Human-in-the-Loop (HITL) workflow designed for fraud detection. This workflow integrates human oversight into AI decision-making to improve accuracy and reliability. For an in-depth look at this approach, we recommend reading our detailed blog post, "Building Human-in-the-loop AI Workflows with Microsoft Agent Framework", on Microsoft Community Hub.
Engage with these resources, and start building intelligent, reliable, and scalable AI systems today!
This repository and content are developed and maintained by James Nguyen, Nicole Serafino, Kranthi Kumar Manchikanti, Heena Ugale, and Tim Sullivan.