If you have been building AI agents with LangChain, you already know how powerful its tool and chain abstractions are. But when it comes to deploying those agents to production — with real infrastructure, managed identity, live web search, and container orchestration — you need something more.
This post walks through how to combine LangChain with the Microsoft Agent Framework (azure-ai-agents) and deploy the result as a Microsoft Foundry Hosted Agent. We will build a multi-agent incident triage copilot that uses LangChain locally and seamlessly upgrades to cloud-hosted capabilities on Microsoft Foundry.
## Why combine LangChain with Microsoft Agent Framework?

As a LangChain developer, you get excellent abstractions for building agents: the `@tool` decorator, `RunnableLambda` chains, and composable pipelines. But production deployment raises questions that LangChain alone does not answer:
- Where do your agents run? Containers, serverless, or managed infrastructure?
- How do you add live web search or code execution? Bing Grounding and Code Interpreter are not LangChain built-ins.
- How do you handle authentication? Managed identity, API keys, or tokens?
- How do you observe agents in production? Distributed tracing across multiple agents?
The Microsoft Agent Framework fills these gaps. It provides AgentsClient for creating and managing agents on Microsoft Foundry, built-in tools like BingGroundingTool and CodeInterpreterTool, and a thread-based conversation model. Combined with Hosted Agents, you get a fully managed container runtime with health probes, auto-scaling, and the OpenAI Responses API protocol.
The key insight: LangChain handles local logic and chain composition; the Microsoft Agent Framework handles cloud-hosted orchestration and tooling.
## Architecture overview
The incident triage copilot uses a coordinator pattern with three specialist agents:

```
User Query
    |
    v
Coordinator Agent
    |
    +--> LangChain Triage Chain (routing decision)
    +--> LangChain Synthesis Chain (combine results)
    |
    +--------+-----------+
    |        |           |
    v        v           v
Research  Diagnostics  Remediation
  Agent      Agent        Agent
```
Each specialist agent has two execution modes:
| Mode | LangChain Role | Microsoft Agent Framework Role |
|---|---|---|
| Local | @tool functions provide heuristic analysis | Not used |
| Foundry | Chains handle routing and synthesis | AgentsClient with BingGroundingTool, CodeInterpreterTool |
This dual-mode design means you can develop and test locally with zero cloud dependencies, then deploy to Foundry for production capabilities.
## Step 1: Define your LangChain tools

Start with what you know. Define typed, documented tools using LangChain's `@tool` decorator:
```python
from langchain_core.tools import tool

@tool
def classify_incident_severity(query: str) -> str:
    """Classify the severity and priority of an incident based on keywords.

    Args:
        query: The incident description text.

    Returns:
        Severity classification with priority level.
    """
    query_lower = query.lower()
    critical_keywords = [
        "production down", "all users", "outage", "breach",
    ]
    high_keywords = [
        "503", "500", "timeout", "latency", "slow",
    ]
    if any(kw in query_lower for kw in critical_keywords):
        return "severity=critical, priority=P1"
    if any(kw in query_lower for kw in high_keywords):
        return "severity=high, priority=P2"
    return "severity=low, priority=P4"
```
These tools work identically in local mode and serve as fallbacks when Foundry is unavailable.
## Step 2: Build routing with LangChain chains

Use `RunnableLambda` to create a routing chain that classifies the incident and selects which specialists to invoke:
```python
from enum import Enum

from langchain_core.runnables import RunnableLambda

class AgentRole(str, Enum):
    RESEARCH = "research"
    DIAGNOSTICS = "diagnostics"
    REMEDIATION = "remediation"

DIAGNOSTICS_KEYWORDS = {
    "log", "error", "exception", "timeout", "500", "503",
    "crash", "oom", "root cause",
}

REMEDIATION_KEYWORDS = {
    "fix", "remediate", "runbook", "rollback", "hotfix",
    "patch", "resolve", "action plan",
}

def _route(inputs: dict) -> dict:
    query = inputs["query"].lower()
    specialists = [AgentRole.RESEARCH]  # always included
    if any(kw in query for kw in DIAGNOSTICS_KEYWORDS):
        specialists.append(AgentRole.DIAGNOSTICS)
    if any(kw in query for kw in REMEDIATION_KEYWORDS):
        specialists.append(AgentRole.REMEDIATION)
    return {**inputs, "specialists": specialists}

triage_routing_chain = RunnableLambda(_route)
```
This is pure LangChain — no cloud dependency. The chain analyses the query and returns which specialists should handle it.
## Step 3: Create specialist agents with dual-mode execution

Each specialist agent extends a base class. In local mode, it uses LangChain tools. In Foundry mode, it delegates to the Microsoft Agent Framework:
```python
from abc import ABC, abstractmethod
from pathlib import Path

class BaseSpecialistAgent(ABC):
    role: AgentRole
    prompt_file: str

    def __init__(self):
        prompt_path = Path(__file__).parent.parent / "prompts" / self.prompt_file
        self.system_prompt = prompt_path.read_text(encoding="utf-8")

    async def run(self, query, shared_context, correlation_id, client=None):
        if client is not None:
            return await self._run_on_foundry(query, shared_context, correlation_id, client)
        return await self._run_locally(query, shared_context, correlation_id)

    async def _run_on_foundry(self, query, shared_context, correlation_id, client):
        """Use the Microsoft Agent Framework for cloud-hosted execution."""
        from azure.ai.agents.models import BingGroundingTool

        agent = await client.agents.create_agent(
            model=shared_context.get("model_deployment", "gpt-4o"),
            name=f"{self.role.value}-{correlation_id}",
            instructions=self.system_prompt,
            tools=self._get_foundry_tools(shared_context),
        )
        thread = await client.agents.threads.create()
        await client.agents.messages.create(
            thread_id=thread.id,
            role="user",
            content=self._build_prompt(query, shared_context),
        )
        run = await client.agents.runs.create_and_process(
            thread_id=thread.id,
            agent_id=agent.id,
        )
        # Extract and return the agent's response...

    @abstractmethod
    async def _run_locally(self, query, shared_context, correlation_id):
        """Use LangChain tools for local heuristic analysis."""
        # Each subclass implements this with its specific tools
        ...
```
The key pattern here: same interface, different backends. Your coordinator does not care whether a specialist ran locally or on Foundry.
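The dispatch itself is easy to exercise without any cloud resources. Here is a minimal, self-contained sketch of the pattern (the `DemoSpecialist` class and its heuristic are hypothetical, not taken from the sample repo):

```python
import asyncio

class DemoSpecialist:
    """Toy specialist illustrating the dual-mode dispatch contract."""

    role = "diagnostics"

    async def run(self, query: str, client=None) -> dict:
        # Same interface in both modes: the caller never knows the backend
        if client is not None:
            return await self._run_on_foundry(query, client)
        return await self._run_locally(query)

    async def _run_on_foundry(self, query: str, client) -> dict:
        # In the real base class this calls client.agents.create_agent(...)
        raise NotImplementedError

    async def _run_locally(self, query: str) -> dict:
        # Heuristic stand-in for a LangChain tool call
        severity = "high" if "503" in query else "low"
        return {"agent": self.role, "severity": severity, "mode": "local"}

result = asyncio.run(DemoSpecialist().run("503 errors on /api/orders"))
print(result)  # {'agent': 'diagnostics', 'severity': 'high', 'mode': 'local'}
```

Passing `client=None` is all it takes to stay fully local, which is what makes offline development and unit testing cheap.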
## Step 4: Wire it up with FastAPI

Expose the multi-agent pipeline through a FastAPI endpoint. The `/triage` endpoint accepts incident descriptions and returns structured reports:
```python
from fastapi import FastAPI

from agents.coordinator import Coordinator
from models import TriageRequest

app = FastAPI(title="Incident Triage Copilot")
coordinator = Coordinator()

@app.post("/triage")
async def triage(request: TriageRequest):
    return await coordinator.triage(
        request=request,
        client=app.state.foundry_client,
        max_turns=10,
    )
```
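`TriageRequest` comes from the project's `models` module. Judging by the JSON payload the `/triage` endpoint accepts later in this post, a minimal version could be as simple as the following sketch (not necessarily the repo's exact definition):

```python
from pydantic import BaseModel

class TriageRequest(BaseModel):
    # Matches the {"message": "..."} payload sent to /triage
    message: str
```

FastAPI validates the request body against this model automatically and returns a 422 for payloads that do not match.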
The application also implements a `/responses` endpoint, which follows the OpenAI Responses API protocol. This is what the Microsoft Foundry Hosted Agents runtime expects when routing traffic to your container.
## Step 5: Deploy as a Hosted Agent
This is where Microsoft Foundry Hosted Agents shines. Your multi-agent system becomes a managed, auto-scaling service with a single command:
```bash
# Install the azd AI agent extension
azd extension install azure.ai.agents

# Provision infrastructure and deploy
azd up
```

The Azure Developer CLI (azd) provisions everything:
- Azure Container Registry for your Docker image
- Container App with health probes and auto-scaling
- User-Assigned Managed Identity for secure authentication
- Microsoft Foundry Hub and Project with model deployments
- Application Insights for distributed tracing
Your `agent.yaml` defines what tools the hosted agent has access to:

```yaml
name: incident-triage-copilot-langchain
kind: hosted
model:
  deployment: gpt-4o
identity:
  type: managed
tools:
  - type: bing_grounding
    enabled: true
  - type: code_interpreter
    enabled: true
```
## What you gain over pure LangChain

| Capability | LangChain Only | LangChain + Microsoft Agent Framework |
|---|---|---|
| Local development | Yes | Yes (identical experience) |
| Live web search | Requires custom integration | Built-in BingGroundingTool |
| Code execution | Requires sandboxing | Built-in CodeInterpreterTool |
| Managed hosting | DIY containers | Foundry Hosted Agents |
| Authentication | DIY | Managed Identity (zero secrets) |
| Observability | DIY | OpenTelemetry + Application Insights |
| One-command deploy | No | azd up |
## Testing locally
The dual-mode architecture means you can test the full pipeline without any cloud resources:

```bash
# Create virtual environment and install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run locally (agents use LangChain tools)
python -m src
```
Then open http://localhost:8080 in your browser to use the built-in web UI, or call the API directly:
```bash
curl -X POST http://localhost:8080/triage \
  -H "Content-Type: application/json" \
  -d '{"message": "Getting 503 errors on /api/orders since 2pm"}'
```
The response includes a coordinator summary, specialist results with confidence scores, and the tools each agent used.
## Running the tests
The project includes a comprehensive test suite covering routing logic, tool behaviour, agent execution, and HTTP endpoints:
```bash
# Run the test suite (assuming the project's standard pytest setup)
pytest
```
Tests run entirely in local mode, so no cloud credentials are needed.
## Key takeaways for LangChain developers
- **Keep your LangChain abstractions.** The `@tool` decorator, `RunnableLambda` chains, and composable pipelines all work exactly as you expect.
- **Add cloud capabilities incrementally.** Start local, then enable Bing Grounding, Code Interpreter, and managed hosting when you are ready.
- **Use the dual-mode pattern.** Every agent should work locally with LangChain tools and on Foundry with the Microsoft Agent Framework. This makes development fast and deployment seamless.
- **Let `azd` handle infrastructure.** One command provisions everything: containers, identity, monitoring, and model deployments.
- **Security comes free.** Managed Identity means no API keys in your code. Non-root containers, RBAC, and disabled ACR admin are all configured by default.
## Get started
Clone the sample repository and try it yourself:
```bash
git clone https://github.com/leestott/hosted-agents-langchain-samples
cd hosted-agents-langchain-samples
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python -m src
```
Open http://localhost:8080 to interact with the copilot through the web UI. When you are ready for production, run azd up and your multi-agent system is live on Microsoft Foundry.