
Educator Developer Blog

LangChain Multi-Agent Systems with Microsoft Agent Framework and Hosted Agents

Lee_Stott
Mar 26, 2026

If you have been building AI agents with LangChain, you already know how powerful its tool and chain abstractions are. But when it comes to deploying those agents to production — with real infrastructure, managed identity, live web search, and container orchestration — you need something more.

This post walks through how to combine LangChain with the Microsoft Agent Framework (azure-ai-agents) and deploy the result as a Microsoft Foundry Hosted Agent. We will build a multi-agent incident triage copilot that uses LangChain locally and seamlessly upgrades to cloud-hosted capabilities on Microsoft Foundry.

Why combine LangChain with Microsoft Agent Framework?

As a LangChain developer, you get excellent abstractions for building agents: the @tool decorator, RunnableLambda chains, and composable pipelines. But production deployment raises questions that LangChain alone does not answer:

  • Where do your agents run? Containers, serverless, or managed infrastructure?
  • How do you add live web search or code execution? Bing Grounding and Code Interpreter are not LangChain built-ins.
  • How do you handle authentication? Managed identity, API keys, or tokens?
  • How do you observe agents in production? Distributed tracing across multiple agents?

The Microsoft Agent Framework fills these gaps. It provides AgentsClient for creating and managing agents on Microsoft Foundry, built-in tools like BingGroundingTool and CodeInterpreterTool, and a thread-based conversation model. Combined with Hosted Agents, you get a fully managed container runtime with health probes, auto-scaling, and the OpenAI Responses API protocol.

The key insight: LangChain handles local logic and chain composition; the Microsoft Agent Framework handles cloud-hosted orchestration and tooling.

Architecture overview

The incident triage copilot uses a coordinator pattern with three specialist agents:

[Image: UI homepage showing Foundry connected status]

User Query
    |
    v
Coordinator Agent
    |
    +--> LangChain Triage Chain    (routing decision)
    +--> LangChain Synthesis Chain  (combine results)
    |
    +----------+------------+
    |          |            |
    v          v            v
Research  Diagnostics  Remediation
  Agent      Agent        Agent

Each specialist agent has two execution modes:

Mode    | LangChain Role                             | Microsoft Agent Framework Role
------- | ------------------------------------------ | ------------------------------
Local   | @tool functions provide heuristic analysis | Not used
Foundry | Chains handle routing and synthesis        | AgentsClient with BingGroundingTool, CodeInterpreterTool

This dual-mode design means you can develop and test locally with zero cloud dependencies, then deploy to Foundry for production capabilities.
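One way to realise the dual-mode switch is a small factory that only touches cloud SDKs when an endpoint is configured. This is a sketch: the environment variable name and the exact wiring are assumptions for illustration, not the sample's actual code.

```python
import os

def get_foundry_client():
    """Return a Microsoft Agent Framework client when Foundry is configured,
    otherwise None (local mode).

    Hypothetical helper: the env var name and wiring are assumptions.
    """
    endpoint = os.environ.get("FOUNDRY_PROJECT_ENDPOINT")
    if not endpoint:
        return None  # local mode: agents fall back to their LangChain tools
    # Imported lazily so local development keeps zero cloud dependencies.
    from azure.ai.agents import AgentsClient
    from azure.identity import DefaultAzureCredential
    return AgentsClient(endpoint=endpoint, credential=DefaultAzureCredential())
```

Agents receive the result of this factory as their `client` argument: `None` selects the local LangChain path, a real client selects Foundry.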

Step 1: Define your LangChain tools

Start with what you know. Define typed, documented tools using LangChain’s @tool decorator:

 
from langchain_core.tools import tool

@tool
def classify_incident_severity(query: str) -> str:
    """Classify the severity and priority of an incident based on keywords.

    Args:
        query: The incident description text.

    Returns:
        Severity classification with priority level.
    """
    query_lower = query.lower()

    critical_keywords = [
        "production down", "all users", "outage", "breach",
    ]
    high_keywords = [
        "503", "500", "timeout", "latency", "slow",
    ]

    if any(kw in query_lower for kw in critical_keywords):
        return "severity=critical, priority=P1"
    if any(kw in query_lower for kw in high_keywords):
        return "severity=high, priority=P2"
    return "severity=low, priority=P4"
 
 

These tools work identically in local mode and serve as fallbacks when Foundry is unavailable.
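Because the tool body is plain Python, the heuristic can be exercised without any LLM in the loop. Here it is re-inlined without the decorator so the snippet runs standalone:

```python
def classify_incident_severity(query: str) -> str:
    # Same heuristic as the @tool version above, minus the decorator.
    query_lower = query.lower()
    critical_keywords = ["production down", "all users", "outage", "breach"]
    high_keywords = ["503", "500", "timeout", "latency", "slow"]
    if any(kw in query_lower for kw in critical_keywords):
        return "severity=critical, priority=P1"
    if any(kw in query_lower for kw in high_keywords):
        return "severity=high, priority=P2"
    return "severity=low, priority=P4"

print(classify_incident_severity("Production down for all users"))  # -> severity=critical, priority=P1
print(classify_incident_severity("Intermittent 503 responses"))     # -> severity=high, priority=P2
```

This is also why the local mode makes a good fallback: the classification is deterministic and cheap.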

Step 2: Build routing with LangChain chains

Use RunnableLambda to create a routing chain that classifies the incident and selects which specialists to invoke:

 
from langchain_core.runnables import RunnableLambda
from enum import Enum

class AgentRole(str, Enum):
    RESEARCH = "research"
    DIAGNOSTICS = "diagnostics"
    REMEDIATION = "remediation"

DIAGNOSTICS_KEYWORDS = {
    "log", "error", "exception", "timeout", "500", "503",
    "crash", "oom", "root cause",
}

REMEDIATION_KEYWORDS = {
    "fix", "remediate", "runbook", "rollback", "hotfix",
    "patch", "resolve", "action plan",
}

def _route(inputs: dict) -> dict:
    query = inputs["query"].lower()
    specialists = [AgentRole.RESEARCH]  # always included

    if any(kw in query for kw in DIAGNOSTICS_KEYWORDS):
        specialists.append(AgentRole.DIAGNOSTICS)

    if any(kw in query for kw in REMEDIATION_KEYWORDS):
        specialists.append(AgentRole.REMEDIATION)

    return {**inputs, "specialists": specialists}

triage_routing_chain = RunnableLambda(_route)
 
 

This is pure LangChain — no cloud dependency. The chain analyses the query and returns which specialists should handle it.

Step 3: Create specialist agents with dual-mode execution

Each specialist agent extends a base class. In local mode, it uses LangChain tools. In Foundry mode, it delegates to the Microsoft Agent Framework:

 
from abc import ABC, abstractmethod
from pathlib import Path

class BaseSpecialistAgent(ABC):
    role: AgentRole
    prompt_file: str

    def __init__(self):
        prompt_path = Path(__file__).parent.parent / "prompts" / self.prompt_file
        self.system_prompt = prompt_path.read_text(encoding="utf-8")

    async def run(self, query, shared_context, correlation_id, client=None):
        if client is not None:
            return await self._run_on_foundry(query, shared_context, correlation_id, client)
        return await self._run_locally(query, shared_context, correlation_id)

    async def _run_on_foundry(self, query, shared_context, correlation_id, client):
        """Use Microsoft Agent Framework for cloud-hosted execution."""
        from azure.ai.agents.models import BingGroundingTool

        agent = await client.agents.create_agent(
            model=shared_context.get("model_deployment", "gpt-4o"),
            name=f"{self.role.value}-{correlation_id}",
            instructions=self.system_prompt,
            tools=self._get_foundry_tools(shared_context),
        )

        thread = await client.agents.threads.create()
        await client.agents.messages.create(
            thread_id=thread.id,
            role="user",
            content=self._build_prompt(query, shared_context),
        )

        run = await client.agents.runs.create_and_process(
            thread_id=thread.id,
            agent_id=agent.id,
        )
        # Extract and return the agent’s response...

    async def _run_locally(self, query, shared_context, correlation_id):
        """Use LangChain tools for local heuristic analysis."""
        # Each subclass implements this with its specific tools
        ...
 
 

The key pattern here: same interface, different backends. Your coordinator does not care whether a specialist ran locally or on Foundry.
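To see why the interface matters, here is a minimal sketch of a coordinator fanning out to specialists concurrently. The class and field names are illustrative stand-ins, not the sample repo's code; only the `run()` signature matches the base class above:

```python
import asyncio

class EchoSpecialist:
    """Stand-in with the same run() interface as BaseSpecialistAgent."""
    def __init__(self, role: str):
        self.role = role

    async def run(self, query, shared_context, correlation_id, client=None):
        # Each specialist picks its backend from the presence of a client.
        backend = "foundry" if client is not None else "local"
        return {"role": self.role, "backend": backend,
                "summary": f"{self.role} analysed: {query[:40]}"}

async def fan_out(specialists, query, client=None):
    # The coordinator awaits all selected specialists concurrently and
    # never inspects which backend each one used.
    return await asyncio.gather(
        *(s.run(query, shared_context={}, correlation_id="demo-1", client=client)
          for s in specialists)
    )

agents = [EchoSpecialist("research"), EchoSpecialist("diagnostics")]
results = asyncio.run(fan_out(agents, "503 errors on /api/orders"))
print([r["backend"] for r in results])  # -> ['local', 'local']
```

Passing a real Foundry client instead of `None` flips every specialist to the cloud path with no change to the coordinator.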

Step 4: Wire it up with FastAPI

Expose the multi-agent pipeline through a FastAPI endpoint. The /triage endpoint accepts incident descriptions and returns structured reports:

 
from fastapi import FastAPI
from agents.coordinator import Coordinator
from models import TriageRequest

app = FastAPI(title="Incident Triage Copilot")
coordinator = Coordinator()

@app.post("/triage")
async def triage(request: TriageRequest):
    return await coordinator.triage(
        request=request,
        client=app.state.foundry_client,
        max_turns=10,
    )
 

The application also implements the /responses endpoint, which follows the OpenAI Responses API protocol; this is the contract Microsoft Foundry Hosted Agents uses when routing traffic to your container.

Step 5: Deploy as a Hosted Agent

This is where Microsoft Foundry Hosted Agents shines. Your multi-agent system becomes a managed, auto-scaling service with a single command:

 
# Install the azd AI agent extension
azd extension install azure.ai.agents

# Provision infrastructure and deploy
azd up
 
 

[Image: Triage pipeline running with Research, Diagnostics, and Remediation agents]

The Azure Developer CLI (azd) provisions everything:

  • Azure Container Registry for your Docker image
  • Container App with health probes and auto-scaling
  • User-Assigned Managed Identity for secure authentication
  • Microsoft Foundry Hub and Project with model deployments
  • Application Insights for distributed tracing

Your agent.yaml defines what tools the hosted agent has access to:

 
name: incident-triage-copilot-langchain
kind: hosted
model:
  deployment: gpt-4o
identity:
  type: managed
tools:
  - type: bing_grounding
    enabled: true
  - type: code_interpreter
    enabled: true
 
 

What you gain over pure LangChain

[Image: Triage report showing coordinator summary and specialist results]

Capability         | LangChain Only              | LangChain + Microsoft Agent Framework
------------------ | --------------------------- | -------------------------------------
Local development  | Yes                         | Yes (identical experience)
Live web search    | Requires custom integration | Built-in BingGroundingTool
Code execution     | Requires sandboxing         | Built-in CodeInterpreterTool
Managed hosting    | DIY containers              | Foundry Hosted Agents
Authentication     | DIY                         | Managed Identity (zero secrets)
Observability      | DIY                         | OpenTelemetry + Application Insights
One-command deploy | No                          | azd up

Testing locally

The dual-mode architecture means you can test the full pipeline without any cloud resources:

[Image: Research Agent with Bing Grounding and Diagnostics Agent with Code Interpreter]

 
# Create virtual environment and install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run locally (agents use LangChain tools)
python -m src
 
Then open http://localhost:8080 in your browser to use the built-in web UI, or call the API directly:

 
curl -X POST http://localhost:8080/triage \
  -H "Content-Type: application/json" \
  -d '{"message": "Getting 503 errors on /api/orders since 2pm"}'
 

The response includes a coordinator summary, specialist results with confidence scores, and the tools each agent used.
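To make that concrete, here is an illustrative response shape. The field names are assumptions based on the description above, not the repository's actual schema:

```python
# Illustrative only: field names are assumptions, not the sample's real schema.
example_response = {
    "summary": "Elevated 503s on /api/orders; likely upstream timeout (P2).",
    "specialists": [
        {
            "role": "diagnostics",
            "confidence": 0.8,
            "tools_used": ["classify_incident_severity"],
        },
    ],
}
print(sorted(example_response))  # -> ['specialists', 'summary']
```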

Running the tests

The project includes a comprehensive test suite covering routing logic, tool behaviour, agent execution, and HTTP endpoints:

# Run the test suite locally (assuming a pytest-based suite)
pytest

Tests run entirely in local mode, so no cloud credentials are needed.

Key takeaways for LangChain developers

  1. Keep your LangChain abstractions. The @tool decorator, RunnableLambda chains, and composable pipelines all work exactly as you expect.
  2. Add cloud capabilities incrementally. Start local, then enable Bing Grounding, Code Interpreter, and managed hosting when you are ready.
  3. Use the dual-mode pattern. Every agent should work locally with LangChain tools and on Foundry with the Microsoft Agent Framework. This makes development fast and deployment seamless.
  4. Let azd handle infrastructure. One command provisions everything: containers, identity, monitoring, and model deployments.
  5. Security comes free. Managed Identity means no API keys in your code. Non-root containers, RBAC, and disabled ACR admin are all configured by default.

Get started

Clone the sample repository and try it yourself:

 
git clone https://github.com/leestott/hosted-agents-langchain-samples
cd hosted-agents-langchain-samples
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python -m src
 
 

Open http://localhost:8080 to interact with the copilot through the web UI. When you are ready for production, run azd up and your multi-agent system is live on Microsoft Foundry.

Resources

Published Mar 26, 2026
Version 1.0