Let’s start from a real engineering pain point.
In its engineering post Managed Agents, Anthropic describes a sobering observation: while building the agent scaffolding (the “harness”) for Claude Sonnet 4.5, the team noticed the model suffered from “context anxiety,” so they added context-reset logic to the harness. But when the same harness ran on the more capable Claude Opus 4.5, those resets became dead weight: the stronger model no longer needed them, yet the harness was actively holding it back.
This is the fundamental dilemma of the harness: it encodes assumptions about the current model’s capabilities, and those assumptions rot quickly as models evolve.
That’s not a minor concern. In an era when AI capabilities shift qualitatively every few months, any infrastructure tightly coupled to a specific model’s abilities becomes a bottleneck on engineering progress.
Anthropic’s Answer: A Three-Part Decoupled Architecture
Anthropic’s answer borrows from a problem operating systems solved decades ago: how do you provide stable abstractions for programs that haven’t been imagined yet?
The answer is virtualization. Just as the OS virtualizes physical hardware into stable abstractions (processes, files, sockets), Managed Agents virtualizes the Agent runtime into three independent interface layers.
Note: Brain, Hands, and Session are Anthropic’s own metaphors, not standard industry terminology. In conventional engineering vocabulary they correspond, respectively, to the agent reasoning loop / orchestrator, the tool executor / execution sandbox, and the durable event log / state store. For readability, the rest of this post uses the two styles interchangeably.
Brain (the reasoning orchestrator): a stateless reasoning loop
This layer is the harness itself — it calls the model and routes tool calls; think of it as the agent’s reasoning orchestrator. Key design point: it must be stateless. All its state comes from the event log; as long as it can call wake(sessionId) to resume, any harness crash is recoverable.
This means the harness can evolve independently as model capabilities evolve, without disturbing in-flight tasks.
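To make the recovery story concrete, here is a minimal sketch of wake-based resume, assuming a toy in-memory event store. `EventLog`, `resume`, and the event shapes are illustrative stand-ins, not Anthropic’s actual API:

```python
# Hypothetical sketch of a stateless orchestrator: all state lives in the
# event log, so a fresh process can resume any session via wake().
from dataclasses import dataclass, field

@dataclass
class EventLog:
    # session_id -> append-only list of events (stand-in for a durable store)
    events: dict = field(default_factory=dict)

    def emit(self, session_id: str, event: dict) -> None:
        self.events.setdefault(session_id, []).append(event)

    def wake(self, session_id: str) -> list:
        # Recovery entry point: replay everything recorded so far
        return list(self.events.get(session_id, []))

def resume(log: EventLog, session_id: str) -> str:
    # A crashed harness restarts here: it rebuilds conversation state from
    # the log alone; no in-process memory survives, and none is needed.
    history = log.wake(session_id)
    return f"resumed with {len(history)} events"

log = EventLog()
log.emit("s1", {"type": "user_message", "text": "hello"})
log.emit("s1", {"type": "tool_result", "text": "ok"})
print(resume(log, "s1"))  # → resumed with 2 events
```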
Hands (the execution sandbox layer): replaceable execution sandboxes
This layer holds the execution environments the orchestrator calls into — Python REPLs, shells, HTTP clients, even remote containers — i.e. the tool executor / execution sandbox. The contract is brutally simple:
execute(name, input) -> string
Just that one interface. The orchestrator doesn’t care whether a sandbox is a local process or a remote container; if a sandbox crashes, it is treated as an ordinary tool error, and the model decides whether to retry on a fresh sandbox. This is the “cattle, not pets” philosophy applied to AI engineering.
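The contract is small enough to state as a Python `Protocol`. This is a sketch under assumed names (`Sandbox` and `EchoSandbox` are illustrative, not part of Managed Agents):

```python
# Sketch of the single-method contract between orchestrator and sandboxes.
# Any sandbox (local subprocess, remote container) satisfies it, so the
# orchestrator stays ignorant of execution details.
from typing import Any, Protocol

class Sandbox(Protocol):
    def execute(self, name: str, input: dict[str, Any]) -> str: ...

class EchoSandbox:
    """A trivially disposable sandbox: crashes surface as ERROR strings."""
    def execute(self, name: str, input: dict[str, Any]) -> str:
        try:
            if name != "echo":
                raise KeyError(name)
            return str(input.get("text", ""))
        except Exception as e:
            # A sandbox failure is just a tool error; the model decides
            # whether to retry on a fresh sandbox.
            return f"ERROR: {type(e).__name__}: {e}"

box: Sandbox = EchoSandbox()
print(box.execute("echo", {"text": "hi"}))  # → hi
print(box.execute("nope", {}))              # → ERROR: KeyError: 'nope'
```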
Session (the durable event log): externalized memory
This layer is an append-only, durable event log. It is not the model’s context window. This distinction matters enormously. When a task outgrows the context window, the harness can use getEvents(start, end) to slice history on demand, and filter, summarize, or transform it before feeding back to the model — all without changing the underlying interface.
The event log also plays a key role in credential isolation: when execute calls are logged, the Vault redacts first, so raw tokens never enter the log — and never enter the model’s context window.
Performance gains
This decoupling yields measurable wins:
- Median time-to-first-token (TTFT) down ~60%
- P95 latency improved by more than 90%
The reason: the old architecture had to provision a container before inference could begin. After decoupling, inference can start as soon as the event log is readable, with sandboxes provisioned lazily on demand.
Microsoft’s Answer: Foundry Agent Service and Hosted Agents
Microsoft gives the enterprise-grade infrastructure answer in Microsoft Foundry.
Foundry Agent Service offers three Agent types:
| Type | Requires Code? | Hosting | Best For |
|---|---|---|---|
| Prompt Agent | No | Fully managed | Rapid prototyping |
| Workflow Agent | No (optional YAML) | Fully managed | Multi-step automation |
| Hosted Agent | Yes | Containerized hosting | Fully custom logic |
This post focuses on Hosted Agents. They let developers package their own Agent code (LangGraph, Microsoft Agent Framework, or fully custom) as a container image and deploy it on Microsoft’s fully managed, pay-per-use infrastructure.
Hosting Adapter: the key abstraction
The core abstraction in Hosted Agents is the Hosting Adapter. It does three things:
- Local testing: starts an HTTP server at localhost:8088; no containerization needed for local runs.
- Protocol translation: automatically converts between Foundry’s Responses API format and Agent Framework’s native data structures.
- Observability: plugs into OpenTelemetry and exports traces, metrics, and logs to Azure Monitor.
Microsoft Agent Framework: the model-agnostic orchestration layer
Microsoft Agent Framework (9.7k GitHub stars, now generally available) is a multi-language, multi-provider Agent orchestration framework that supports:
- Azure OpenAI, OpenAI, GitHub Copilot
- Anthropic Claude
- AWS Bedrock, Ollama
- Protocol standards like A2A, AG-UI, MCP
This matters a lot: Microsoft’s own Agent framework natively supports Anthropic’s Claude models, providing an official path for cross-ecosystem integration.
This Project: Two Philosophies Shake Hands in Code
Now let’s see how this real project fuses the two architectural philosophies.
Project layout
HostedAgentDemo/
├── main.py # reasoning orchestrator (a.k.a. Brain): main agent loop
├── agent.yaml # Hosted Agent declaration
├── azure.yaml # azd deployment config
├── Dockerfile # containerization
├── harness/
│ ├── session.py # durable event log (a.k.a. Session)
│ ├── sandbox.py # execution sandbox pool (a.k.a. Hands)
│ └── vault.py # credential vault
└── requirements.txt
Reasoning orchestrator (a.k.a. Brain): FoundryChatClient + Agent Framework
# main.py (excerpt)
import asyncio

from agent_framework import Agent
from agent_framework.foundry import FoundryChatClient
from agent_framework_foundry_hosting import ResponsesHostServer
from azure.identity.aio import DefaultAzureCredential  # async variant, required for `async with`

async def main() -> None:
    async with DefaultAzureCredential() as credential:
        client = FoundryChatClient(
            project_endpoint=PROJECT_ENDPOINT,
            model=MODEL_DEPLOYMENT_NAME,
            credential=credential,
            allow_preview=True,
        )
        agent = Agent(
            client,
            instructions=INSTRUCTIONS,
            name="ManagedStyleAgent",
            tools=[execute, list_tools, get_events, emit_note],
        )
        server = ResponsesHostServer(agent)
        await server.run_async()

asyncio.run(main())
What’s happening here?
- FoundryChatClient: the Foundry model client from Microsoft Agent Framework; talks to a model deployed on Microsoft Foundry.
- Agent: the stateless reasoning orchestrator (what Anthropic calls the “Brain”), with a fixed toolset of four: execute, list_tools, get_events, emit_note.
- ResponsesHostServer: the Hosting Adapter; exposes the Agent as an HTTP service compatible with Foundry’s Responses API.
The orchestrator’s toolset follows Anthropic Managed Agents’ minimalism strictly — every capability funnels through the single gateway execute(name, input_json); the reasoning layer knows nothing about concrete sandbox implementations.
Execution sandbox layer (a.k.a. Hands): a cattle-style sandbox pool
# harness/sandbox.py (core logic)
def execute(self, name: str, input: dict[str, Any]) -> str:
    """The one and only contract between the orchestrator and the sandboxes."""
    if name not in self._tools:
        return f"ERROR: unknown tool '{name}'. Available: {self.list_tools()}"
    sandbox_id = self.provision(kind=name)
    try:
        out = self._tools[name](input or {}, self._vault)
        out = self._vault.redact(out)  # redact credentials
        return out
    except Exception as e:
        return f"ERROR: sandbox '{sandbox_id}' failed: {type(e).__name__}: {e}"
    finally:
        self.retire(sandbox_id)  # forcibly destroy after every call
Look at the finally block: every execute call destroys its sandbox afterwards, success or failure. That guarantees sandboxes are genuinely stateless units — leftover processes, temp files, in-memory state all vanish with the sandbox.
Built-in sandboxes include:
- python_exec: isolated Python subprocess (15s timeout, no leaked env vars)
- shell_exec: argv-list execution (no shell metacharacter injection)
- http_fetch: auth headers injected via the Vault proxy
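Putting the lifecycle pieces together, here is a self-contained toy version of the cattle pattern (class and tool names are illustrative stand-ins, not the project’s actual code) that shows nothing survives a call:

```python
# Toy "cattle" pool: every call gets a fresh sandbox id, and the sandbox is
# retired in `finally`, success or failure.
import itertools

class CattlePool:
    def __init__(self):
        self._ids = itertools.count(1)
        self.live: set[int] = set()
        self._tools = {"upper": lambda inp: str(inp.get("text", "")).upper()}

    def provision(self, kind: str) -> int:
        sid = next(self._ids)
        self.live.add(sid)
        return sid

    def retire(self, sid: int) -> None:
        self.live.discard(sid)  # processes, temp files, memory all go with it

    def execute(self, name: str, input: dict) -> str:
        if name not in self._tools:
            return f"ERROR: unknown tool '{name}'"
        sid = self.provision(kind=name)
        try:
            return self._tools[name](input or {})
        except Exception as e:
            return f"ERROR: sandbox '{sid}' failed: {type(e).__name__}: {e}"
        finally:
            self.retire(sid)  # forcibly destroyed after every call

pool = CattlePool()
print(pool.execute("upper", {"text": "hi"}))  # → HI
print(pool.live)                              # → set(), nothing survives the call
```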
Durable event log (a.k.a. Session): externalized memory
# harness/session.py (core interface)
class SessionStore:
    def emit_event(self, session_id, type, payload) -> SessionEvent:
        """Append-only — never overwritten, never deleted."""

    def get_events(self, session_id, start=0, end=None) -> list[SessionEvent]:
        """Positional slice. The harness can transform before passing to the model."""

    def wake(self, session_id) -> list[SessionEvent]:
        """Recovery entry point after harness crash."""
The event log is a .jsonl append-only file — one JSON event per line. In production you can drop in Azure Cosmos DB, Event Hub, or any durable store; the interface doesn’t change.
Vault: credentials never touch the model
# harness/vault.py
class CredentialVault:
    def build_auth_headers(self, logical_name: str) -> dict[str, str]:
        token = self.resolve(logical_name)
        return {"Authorization": f"Bearer {token}"}

    def redact(self, value: Any) -> Any:
        """Replace every known secret in logs and tool return values."""
        s = str(value)
        for secret in self._secrets.values():
            if secret and secret in s:
                s = s.replace(secret, "***REDACTED***")
        return s
The model references credentials by logical name: execute("http_fetch", {"url": "...", "credential": "github"}) — it only knows the logical name "github". The real token is injected by the Vault inside the sandbox, and tool return values are redacted before being written to the event log.
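To see the redaction guarantee end to end, here is a runnable stand-in for the vault (simplified constructor, hard-coded fake token): the raw token exists only inside the sandbox, and anything headed for the event log passes through `redact` first.

```python
# Illustrative stand-in for harness/vault.py, not the project's exact class:
# the logical name "github" is all the model ever sees.
class CredentialVault:
    def __init__(self, secrets: dict[str, str]):
        self._secrets = secrets

    def build_auth_headers(self, logical_name: str) -> dict[str, str]:
        # Injected inside the sandbox; never returned to the model
        return {"Authorization": f"Bearer {self._secrets[logical_name]}"}

    def redact(self, value) -> str:
        s = str(value)
        for secret in self._secrets.values():
            if secret:
                s = s.replace(secret, "***REDACTED***")
        return s

vault = CredentialVault({"github": "ghp_faketoken123"})  # fake token for the demo
raw = f"fetched with {vault.build_auth_headers('github')}"
print(vault.redact(raw))
# → fetched with {'Authorization': 'Bearer ***REDACTED***'}
```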
Deploying: From Local to Azure in One Command
# 1. Install the azd Agent extension
azd ext install azure.ai.agents
# 2. Test locally (no container needed)
python main.py
# → Managed-style Agent running on http://localhost:8088
# 3. Deploy to Azure (Provision + Build + Deploy)
azd up
Architectures Compared: Two Ecosystems in Philosophical Resonance
| Dimension | Anthropic Managed Agents | Microsoft Foundry Hosted Agents |
|---|---|---|
| Core abstraction | Reasoning orchestrator / execution sandbox / durable event log (Anthropic: Brain / Hands / Session) | Hosting Adapter + Agent Framework |
| Sandbox strategy | Cattle (destroyed after use) | Container (managed lifecycle) |
| Credential security | Vault proxy injection, invisible to the model | Managed Identity + RBAC |
| Context management | External event log, sliced on demand | Responses API session management |
| Observability | Event log + custom | OpenTelemetry → Azure Monitor |
| Scaling | Many orchestrators × many sandboxes, concurrent | minReplicas / maxReplicas |
| Cross-model support | Claude model family | Many providers (Claude included) |
The core philosophy of both architectures aligns tightly: decouple reasoning (the orchestrator), tool execution (the sandbox layer), and memory (the event log) so each layer can evolve independently. The difference is emphasis:
- Anthropic prioritizes interface stability — so today’s infrastructure can run tomorrow’s stronger models.
- Microsoft prioritizes enterprise-grade operations — so agents get production-grade security, scaling, and observability.
That’s the value of this project: it proves the two philosophies can live together in one codebase.
Running it
# Clone and configure
git clone <your-repo>
cd HostedAgentDemo
cp .env.example .env
# Edit .env with FOUNDRY_PROJECT_ENDPOINT and MODEL_DEPLOYMENT_NAME
# Install dependencies
pip install -r requirements.txt
# Run locally
python main.py
# Test a conversation
curl http://localhost:8088/responses \
-H "Content-Type: application/json" \
-d '{"input": [{"role": "user", "content": "List the tools you can use"}]}'
# Deploy to Azure
azd ext install azure.ai.agents
azd up
Summary
Back in 2016 the industry was still arguing whether microservices were over-engineering. Today nobody doubts the value of service decoupling at scale.
I believe 2025–2026 is the “microservices moment” for Agent engineering — people are starting to realize that an Agent that couples reasoning, tool execution, and state memory inside a single monolithic container simply cannot keep pace with model evolution.
Anthropic’s Managed Agents supplies the architectural philosophy; Microsoft’s Foundry Hosted Agents supplies the enterprise infrastructure; and this open-source project shows that they are not an either/or choice — they are complementary, and they make each other better.
References
- Sample Code https://github.com/microsoft/Agent-Framework-Samples/tree/main/09.Cases/maf_harness_managed_hosted_agent
- Anthropic Engineering Blog. Managed Agents https://www.anthropic.com/engineering/managed-agents
- Microsoft Learn – Hosted agents in Foundry Agent Service (preview) https://learn.microsoft.com/en-us/azure/foundry/agents/concepts/hosted-agents?view=foundry
- Microsoft Foundry Samples. agent-framework Python hosted-agent samples https://github.com/microsoft-foundry/foundry-samples/tree/main/samples/python/hosted-agents/agent-framework
- Microsoft Agent Framework https://github.com/microsoft/agent-framework
- Microsoft Agent Framework Sample https://github.com/microsoft/agent-framework-samples
- Microsoft Learn – What is Microsoft Foundry Agent Service https://learn.microsoft.com/en-us/azure/foundry/agents/overview