Microsoft Developer Community Blog

When Anthropic’s Managed Agents Meet Microsoft Hosted Agents

kinfey, Microsoft
Apr 24, 2026

Let’s start from a real engineering pain point.

In their engineering blog Managed Agents, Anthropic describes a sobering observation: while building the agent scaffolding (the “harness”) for Claude Sonnet 4.5, they noticed the model suffered from “context anxiety,” so they added context-reset logic into the harness. But when the same harness ran on the more capable Claude Opus 4.5, those resets became dead weight — the stronger model no longer needed them, yet the harness was actively holding it back.

This is the fundamental dilemma of the harness: it encodes assumptions about the current model’s capabilities, and those assumptions rot quickly as models evolve.

That’s not a minor concern. In an era when AI capabilities shift qualitatively every few months, any infrastructure tightly coupled to a specific model’s abilities becomes a bottleneck on engineering progress.

Anthropic’s Answer: A Three-Part Decoupled Architecture

Anthropic’s answer borrows from a problem operating systems solved decades ago: how do you provide stable abstractions for programs that haven’t been imagined yet?

The answer is virtualization. Just as the OS virtualizes physical hardware into stable abstractions — processes, files, sockets — Managed Agents virtualizes the Agent runtime into three independent interface layers. For readability this post follows Anthropic’s own metaphors — Brain / Hands / Session — which map to the more conventional engineering terms reasoning orchestrator / execution sandbox / durable event log:

Note: Brain, Hands, and Session are not standard industry terminology — they are metaphors Anthropic uses in its engineering blog. In more conventional engineering vocabulary they correspond, respectively, to the agent reasoning loop / orchestrator, the tool executor / execution sandbox, and the durable event log / state store. The rest of this post uses the two styles interchangeably.

Brain (the reasoning orchestrator): a stateless reasoning loop

This layer is the harness itself — it calls the model and routes tool calls; think of it as the agent’s reasoning orchestrator. Key design point: it must be stateless. All its state comes from the event log; as long as it can call wake(sessionId) to resume, any harness crash is recoverable.

This means the harness can evolve independently as model capabilities evolve, without disturbing in-flight tasks.
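A minimal sketch of that crash-recovery idea, assuming an in-memory stand-in for the durable log (method names follow the post's interface; the "unfinished tool call" bookkeeping is one plausible recovery strategy, not Anthropic's actual implementation):

```python
class EventLog:
    """Stand-in for a durable event log; production would use persistent storage."""
    def __init__(self):
        self._events = {}  # session_id -> list of events

    def emit_event(self, session_id, type, payload):
        event = {"type": type, "payload": payload}
        self._events.setdefault(session_id, []).append(event)
        return event

    def wake(self, session_id):
        """Recovery entry point: full history, so a fresh harness can resume."""
        return list(self._events.get(session_id, []))

def resume(log, session_id):
    """A stateless harness rebuilds everything it needs from the log alone."""
    events = log.wake(session_id)
    # Find tool calls that never received a result: these must be re-dispatched.
    return [e for e in events
            if e["type"] == "tool_call" and not any(
                r["type"] == "tool_result"
                and r["payload"]["call_id"] == e["payload"]["call_id"]
                for r in events)]

log = EventLog()
log.emit_event("s1", "tool_call", {"call_id": 1, "name": "python_exec"})
log.emit_event("s1", "tool_result", {"call_id": 1, "output": "ok"})
log.emit_event("s1", "tool_call", {"call_id": 2, "name": "http_fetch"})
# Harness crashes here; a brand-new process picks up where it left off:
print(resume(log, "s1"))  # the one unfinished tool call (call_id 2)
```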

Hands (the execution sandbox layer): replaceable execution sandboxes

This layer holds the execution environments the orchestrator calls into — Python REPLs, shells, HTTP clients, even remote containers — i.e. the tool executor / execution sandbox. The contract is brutally simple:

execute(name, input) -> string

Just that one interface. The orchestrator doesn’t care whether a sandbox is a local process or a remote container; if a sandbox crashes, it is treated as an ordinary tool error — the model decides whether to retry on a fresh sandbox. This is the “cattle, not pets” philosophy applied to AI engineering.
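The orchestrator-side view of that contract fits in a few lines. A hedged sketch, with two hypothetical sandboxes standing in for real processes or containers:

```python
def flaky_echo(inp: dict) -> str:
    raise RuntimeError("container died")  # simulates a sandbox crash

def echo(inp: dict) -> str:
    return inp["text"]

# Hypothetical sandbox registry; real entries would be processes or containers.
SANDBOXES = {"flaky_echo": flaky_echo, "echo": echo}

def execute(name: str, input: dict) -> str:
    """The single contract: a crashed sandbox becomes an ordinary error string,
    so the model can simply decide to retry on a fresh sandbox."""
    try:
        return SANDBOXES[name](input)
    except Exception as e:
        return f"ERROR: sandbox '{name}' failed: {type(e).__name__}: {e}"

print(execute("flaky_echo", {"text": "hi"}))  # an error string, not a crash
print(execute("echo", {"text": "hi"}))        # → hi
```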

Session (the durable event log): externalized memory

This layer is an append-only, durable event log. It is not the model’s context window. This distinction matters enormously. When a task outgrows the context window, the harness can use getEvents(start, end) to slice history on demand, and filter, summarize, or transform it before feeding back to the model — all without changing the underlying interface.

The event log also plays a key role in credential isolation: when execute calls are logged, the Vault redacts first, so raw tokens never enter the log — and never enter the model’s context window.
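A sketch of what that on-demand slicing enables. The trivial summarizer below is a stand-in of my own (a real harness might ask the model to summarize); only the slice-then-transform shape comes from the post:

```python
def get_events(events: list, start: int = 0, end=None) -> list:
    """Positional slice over the externalized history."""
    return events[start:end]

def compact(events: list, keep_last: int = 2) -> list:
    """One possible transform before feeding the model: collapse old events
    into a summary marker, keep recent ones verbatim."""
    old, recent = events[:-keep_last], events[-keep_last:]
    if not old:
        return recent
    summary = {"type": "summary", "payload": f"{len(old)} earlier events elided"}
    return [summary] + recent

history = [{"type": "tool_call", "payload": i} for i in range(10)]
context = compact(get_events(history))
print(len(context))  # → 3 (one summary marker + the two most recent events)
```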

Performance gains

This decoupling yields measurable wins:

  • Median time-to-first-token (TTFT) down ~60%
  • P95 latency improved by more than 90%

The reason: the old architecture had to provision a container before inference could begin. After decoupling, inference can start as soon as the event log is readable, with sandboxes provisioned lazily on demand.
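The latency effect of moving provisioning off the critical path can be simulated in a few lines; the sleep durations below are illustrative, not Anthropic's measured numbers:

```python
import asyncio
import time

async def provision_sandbox() -> str:
    await asyncio.sleep(0.2)   # simulated container startup cost
    return "sandbox-1"

async def first_token() -> str:
    await asyncio.sleep(0.05)  # simulated model TTFT once the event log is readable
    return "token"

async def eager() -> str:
    """Old architecture: a container must exist before inference begins."""
    await provision_sandbox()
    return await first_token()

async def lazy() -> str:
    """Decoupled: inference starts immediately; a sandbox would be
    provisioned only if and when the model actually calls execute()."""
    return await first_token()

start = time.perf_counter(); asyncio.run(eager()); eager_ttft = time.perf_counter() - start
start = time.perf_counter(); asyncio.run(lazy()); lazy_ttft = time.perf_counter() - start
print(f"eager TTFT ≈ {eager_ttft:.2f}s, lazy TTFT ≈ {lazy_ttft:.2f}s")
```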

Microsoft’s Answer: Foundry Agent Service and Hosted Agents

Microsoft gives the enterprise-grade infrastructure answer in Microsoft Foundry.

Foundry Agent Service offers three Agent types:

| Type | Requires Code? | Hosting | Best For |
| --- | --- | --- | --- |
| Prompt Agent | No | Fully managed | Rapid prototyping |
| Workflow Agent | No (optional YAML) | Fully managed | Multi-step automation |
| Hosted Agent | Yes | Containerized hosting | Fully custom logic |

This post focuses on Hosted Agents. They let developers package their own Agent code (LangGraph, Microsoft Agent Framework, or fully custom) as a container image and deploy it on Microsoft’s fully managed, pay-per-use infrastructure.

Hosting Adapter: the key abstraction

The core abstraction in Hosted Agents is the Hosting Adapter. It does three things:

  1. Local testing: starts an HTTP server at localhost:8088; no containerization needed for local runs.
  2. Protocol translation: automatically converts between Foundry’s Responses API format and Agent Framework’s native data structures.
  3. Observability: plugs into OpenTelemetry and exports traces, metrics, and logs to Azure Monitor.

Microsoft Agent Framework: the model-agnostic orchestration layer

Microsoft Agent Framework (9.7k GitHub stars, now generally available) is a multi-language, multi-provider Agent orchestration framework that supports:

  • Azure OpenAI, OpenAI, GitHub Copilot
  • Anthropic Claude
  • AWS Bedrock, Ollama
  • Protocol standards like A2A, AG-UI, MCP

This matters a lot: Microsoft’s own Agent framework natively supports Anthropic’s Claude models, providing an official path for cross-ecosystem integration.

This Project: Two Philosophies Shake Hands in Code

Now let’s see how this real project fuses the two architectural philosophies.

Project layout

HostedAgentDemo/
├── main.py              # reasoning orchestrator (a.k.a. Brain): main agent loop
├── agent.yaml           # Hosted Agent declaration
├── azure.yaml           # azd deployment config
├── Dockerfile           # containerization
├── harness/
│   ├── session.py       # durable event log (a.k.a. Session)
│   ├── sandbox.py       # execution sandbox pool (a.k.a. Hands)
│   └── vault.py         # credential vault
└── requirements.txt

Reasoning orchestrator (a.k.a. Brain): FoundryChatClient + Agent Framework

# main.py (excerpt)
from azure.identity.aio import DefaultAzureCredential

from agent_framework import Agent
from agent_framework.foundry import FoundryChatClient
from agent_framework_foundry_hosting import ResponsesHostServer

async with DefaultAzureCredential() as credential:
    client = FoundryChatClient(
        project_endpoint=PROJECT_ENDPOINT,
        model=MODEL_DEPLOYMENT_NAME,
        credential=credential,
        allow_preview=True,
    )
    agent = Agent(
        client,
        instructions=INSTRUCTIONS,
        name="ManagedStyleAgent",
        tools=[execute, list_tools, get_events, emit_note],
    )
    server = ResponsesHostServer(agent)
    await server.run_async()

What’s happening here?

  • FoundryChatClient: the Foundry model client from Microsoft Agent Framework; talks to a model deployed on Microsoft Foundry.
  • Agent: the stateless reasoning orchestrator (what Anthropic calls the “Brain”), with a fixed toolset of four: execute, list_tools, get_events, emit_note.
  • ResponsesHostServer: the Hosting Adapter; exposes the Agent as an HTTP service compatible with Foundry’s Responses API.

The orchestrator’s toolset follows Anthropic Managed Agents’ minimalism strictly — every capability funnels through the single gateway execute(name, input_json); the reasoning layer knows nothing about concrete sandbox implementations.

Execution sandbox layer (a.k.a. Hands): a cattle-style sandbox pool

# harness/sandbox.py (core logic)
def execute(self, name: str, input: dict[str, Any]) -> str:
    """The one and only contract between the orchestrator and the sandboxes."""
    if name not in self._tools:
        return f"ERROR: unknown tool '{name}'. Available: {self.list_tools()}"

    sandbox_id = self.provision(kind=name)
    try:
        out = self._tools[name](input or {}, self._vault)
        out = self._vault.redact(out)  # redact credentials
        return out
    except Exception as e:
        return f"ERROR: sandbox '{sandbox_id}' failed: {type(e).__name__}: {e}"
    finally:
        self.retire(sandbox_id)  # forcibly destroy after every call

Look at the finally block: every execute call destroys its sandbox afterwards, success or failure. That guarantees sandboxes are genuinely stateless units — leftover processes, temp files, in-memory state all vanish with the sandbox.

Built-in sandboxes include:

  • python_exec: isolated Python subprocess (15s timeout, no leaked env vars)
  • shell_exec: argv-list execution (no shell metacharacter injection)
  • http_fetch: auth headers injected via the Vault proxy
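The argv-list point deserves a sketch: passing a list to `subprocess.run` (with the default `shell=False`) means metacharacters in arguments are never interpreted by a shell. This is a simplified stand-in for the repo's `shell_exec`, not its actual code:

```python
import subprocess

def shell_exec(input: dict, timeout: float = 15.0) -> str:
    """Run a command given as an argv list; '; rm -rf /' stays a literal argument."""
    argv = input["argv"]
    if not isinstance(argv, list):
        return "ERROR: argv must be a list, not a raw command string"
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout, env={})  # no leaked env vars
        return proc.stdout or proc.stderr
    except subprocess.TimeoutExpired:
        return f"ERROR: timed out after {timeout}s"

print(shell_exec({"argv": ["echo", "hello; rm -rf /"]}))  # → hello; rm -rf /
```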

Durable event log (a.k.a. Session): externalized memory

# harness/session.py (core interface)
class SessionStore:
    def emit_event(self, session_id, type, payload) -> SessionEvent:
        """Append-only — never overwritten, never deleted."""

    def get_events(self, session_id, start=0, end=None) -> list[SessionEvent]:
        """Positional slice. The harness can transform before passing to the model."""

    def wake(self, session_id) -> list[SessionEvent]:
        """Recovery entry point after harness crash."""

The event log is a .jsonl append-only file — one JSON event per line. In production you can drop in Azure Cosmos DB, Event Hub, or any durable store; the interface doesn’t change.
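As a hedged illustration of that swap-the-store-keep-the-interface point, here is a minimal `.jsonl`-backed implementation of the same three methods (my own sketch, not the repo's `session.py`):

```python
import json
import os
import tempfile

class JsonlSessionStore:
    """Append-only event log: one JSON event per line of a .jsonl file.
    Swapping in Cosmos DB or Event Hub changes these bodies, not the interface."""
    def __init__(self, path: str):
        self.path = path

    def emit_event(self, session_id, type, payload):
        event = {"session_id": session_id, "type": type, "payload": payload}
        with open(self.path, "a") as f:
            f.write(json.dumps(event) + "\n")  # append-only: never overwrite
        return event

    def get_events(self, session_id, start=0, end=None):
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            events = [json.loads(line) for line in f]
        mine = [e for e in events if e["session_id"] == session_id]
        return mine[start:end]

    def wake(self, session_id):
        """Recovery entry point after a harness crash."""
        return self.get_events(session_id)

store = JsonlSessionStore(os.path.join(tempfile.mkdtemp(), "session.jsonl"))
store.emit_event("s1", "note", "step one")
store.emit_event("s1", "note", "step two")
print(len(store.wake("s1")))  # → 2
```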

Vault: credentials never touch the model

# harness/vault.py
class CredentialVault:
    def build_auth_headers(self, logical_name: str) -> dict[str, str]:
        token = self.resolve(logical_name)
        return {"Authorization": f"Bearer {token}"}

    def redact(self, value: Any) -> Any:
        """Replace every known secret in logs and tool return values."""
        s = str(value)
        for secret in self._secrets.values():
            if secret and secret in s:
                s = s.replace(secret, "***REDACTED***")
        return s

The model references credentials by logical name: execute("http_fetch", {"url": "...", "credential": "github"}) — it only knows the logical name "github". The real token is injected by the Vault inside the sandbox, and tool return values are redacted before being written to the event log.
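Putting the two halves together, the full round trip looks roughly like this (the `http_fetch` handler is a hypothetical stub that skips the actual network call):

```python
class CredentialVault:
    def __init__(self, secrets: dict[str, str]):
        self._secrets = secrets  # logical name -> real token

    def build_auth_headers(self, logical_name: str) -> dict[str, str]:
        return {"Authorization": f"Bearer {self._secrets[logical_name]}"}

    def redact(self, value) -> str:
        s = str(value)
        for secret in self._secrets.values():
            if secret and secret in s:
                s = s.replace(secret, "***REDACTED***")
        return s

def http_fetch(input: dict, vault: CredentialVault) -> str:
    """Hypothetical sandbox: the raw token exists only inside this function."""
    headers = vault.build_auth_headers(input["credential"])
    # ... perform the real request with `headers` here ...
    return f"fetched {input['url']} with {headers['Authorization']}"

vault = CredentialVault({"github": "ghp_secret123"})
raw = http_fetch({"url": "https://api.github.com", "credential": "github"}, vault)
logged = vault.redact(raw)  # what actually reaches the event log and the model
print(logged)  # → fetched https://api.github.com with Bearer ***REDACTED***
```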

Deploying: From Local to Azure in One Command

# 1. Install the azd Agent extension
azd ext install azure.ai.agents

# 2. Test locally (no container needed)
python main.py
# → Managed-style Agent running on http://localhost:8088

# 3. Deploy to Azure (Provision + Build + Deploy)
azd up

Architectures Compared: Two Ecosystems in Philosophical Resonance

| Dimension | Anthropic Managed Agents | Microsoft Foundry Hosted Agents |
| --- | --- | --- |
| Core abstraction | Reasoning orchestrator / execution sandbox / durable event log (Anthropic: Brain / Hands / Session) | Hosting Adapter + Agent Framework |
| Sandbox strategy | Cattle (destroyed after use) | Container (managed lifecycle) |
| Credential security | Vault proxy injection, invisible to the model | Managed Identity + RBAC |
| Context management | External event log, sliced on demand | Responses API session management |
| Observability | Event log + custom | OpenTelemetry → Azure Monitor |
| Scaling | Many orchestrators × many sandboxes, concurrent | minReplicas / maxReplicas |
| Cross-model support | Claude model family | Many providers (Claude included) |

The core philosophy of both architectures aligns tightly: decouple reasoning (the orchestrator), tool execution (the sandbox layer), and memory (the event log) so each layer can evolve independently. The difference is emphasis:

  • Anthropic prioritizes interface stability — so today’s infrastructure can run tomorrow’s stronger models.
  • Microsoft prioritizes enterprise-grade operations — so agents get production-grade security, scaling, and observability.

That’s the value of this project: it proves the two philosophies can live together in one codebase.

Running it

# Clone and configure
git clone <your-repo>
cd HostedAgentDemo
cp .env.example .env
# Edit .env with FOUNDRY_PROJECT_ENDPOINT and MODEL_DEPLOYMENT_NAME

# Install dependencies
pip install -r requirements.txt

# Run locally
python main.py

# Test a conversation
curl http://localhost:8088/responses \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "List the tools you can use"}]}'

# Deploy to Azure
azd ext install azure.ai.agents
azd up

Summary

Back in 2016 the industry was still arguing whether microservices were over-engineering. Today nobody doubts the value of service decoupling at scale.

I believe 2025–2026 is the “microservices moment” for Agent engineering — people are starting to realize that an Agent that couples reasoning, tool execution, and state memory inside a single monolithic container simply cannot keep pace with model evolution.

Anthropic’s Managed Agents supplies the architectural philosophy; Microsoft’s Foundry Hosted Agents supplies the enterprise infrastructure; and this open-source project shows that they are not an either/or choice — they are complementary, and they make each other better.


Published Apr 24, 2026
Version 1.0