Blog Post

Linux and Open Source Blog
8 MIN READ

Agent Governance Toolkit: Architecture Deep Dive, Policy Engines, Trust, and SRE for AI Agents

mosiddi's avatar
mosiddi
Icon for Microsoft rankMicrosoft
Apr 09, 2026

Last week we announced the Agent Governance Toolkit on the Microsoft Open Source Blog, an open-source project that brings runtime security governance to autonomous AI agents. In that announcement, we covered the why: AI agents are making autonomous decisions in production, and the security patterns that kept systems safe for decades need to be applied to this new class of workload.

 

In this post, we'll go deeper into the how: the architecture, the implementation details, and what it takes to run governed agents in production.

The Problem: Production Infrastructure Meets Autonomous Agents

If you manage production infrastructure, you already know the playbook: least privilege, mandatory access controls, process isolation, audit logging, and circuit breakers for cascading failures. These patterns have kept production systems safe for decades.

Now imagine a new class of workload arriving on your infrastructure, AI agents that autonomously execute code, call APIs, read databases, and spawn sub-processes. They reason about what to do, select tools, and act in loops. And in many current deployments, they do all of this without the security controls you'd demand of any other production workload.

That gap is what led us to build the Agent Governance Toolkit: an open-source project, that applies proven security concepts from operating systems, service meshes, and SRE to the emerging world of autonomous AI agents.

To frame this in familiar terms: most AI agent frameworks today are like running every process as root, no access controls, no isolation, no audit trail. The Agent Governance Toolkit is the kernel, the service mesh, and the SRE platform for AI agents.

When an agent calls a tool, say, `DELETE FROM users WHERE created_at < NOW()`, there is typically no policy layer checking whether that action is within scope. There is no identity verification when one agent communicates with another. There is no resource limit preventing an agent from making 10,000 API calls in a minute. And there is no circuit breaker to contain cascading failures when things go wrong.

OWASP Agentic Security Initiative

In December 2025, OWASP published the Agentic AI Top 10: the first formal taxonomy of risks specific to autonomous AI agents. The list reads like a security engineer's nightmare: goal hijacking, tool misuse, identity abuse, memory poisoning, cascading failures, rogue agents, and more.

If you've ever hardened a production server, these risks will feel both familiar and urgent. The Agent Governance Toolkit is designed to help address all 10 of these risks through deterministic policy enforcement, cryptographic identity, execution isolation, and reliability engineering patterns.

Note: The OWASP Agentic Security Initiative has since adopted the ASI 2026 taxonomy (ASI01–ASI10). The toolkit's copilot-governance package now uses these identifiers with backward compatibility for the original AT numbering.

Architecture: Nine Packages, One Governance Stack

The toolkit is structured as a v3.0.0 Public Preview monorepo with nine independently installable packages:

Package

What It Does

Agent OS

Stateless policy engine, intercepts agent actions before execution with configurable pattern matching and semantic intent classification

Agent Mesh

Cryptographic identity (DIDs with Ed25519), Inter-Agent Trust Protocol (IATP), and trust-gated communication between agents

Agent Hypervisor

Execution rings inspired by CPU privilege levels, saga orchestration for multi-step transactions, and shared session management

Agent Runtime

Runtime supervision with kill switches, dynamic resource allocation, and execution lifecycle management

Agent SRE

SLOs, error budgets, circuit breakers, chaos engineering, and progressive delivery, production reliability practices adapted for AI agents

Agent Compliance

Automated governance verification with compliance grading and regulatory framework mapping (EU AI Act, NIST AI RMF, HIPAA, SOC 2)

Agent Lightning

Reinforcement learning training governance with policy-enforced runners and reward shaping

Agent Marketplace

Plugin lifecycle management with Ed25519 signing, trust-tiered capability gating, and SBOM generation

Integrations

20+ framework adapters for LangChain, CrewAI, AutoGen, Semantic Kernel, Google ADK, Microsoft Agent Framework, OpenAI Agents SDK, and more

Agent OS: The Policy Engine

Agent OS intercepts agent tool calls before they execute:

from agent_os import StatelessKernel, ExecutionContext, Policy

kernel = StatelessKernel()
ctx = ExecutionContext(
    agent_id="analyst-1",
    policies=[
        Policy.read_only(),                    # No write operations
        Policy.rate_limit(100, "1m"),          # Max 100 calls/minute
        Policy.require_approval(
            actions=["delete_*", "write_production_*"],
            min_approvals=2,
            approval_timeout_minutes=30,
        ),
    ],
)

result = await kernel.execute(
    action="delete_user_record",
    params={"user_id": 12345},
    context=ctx,
)

The policy engine works in two layers: configurable pattern matching (with sample rule sets for SQL injection, privilege escalation, and prompt injection that users customize for their environment) and a semantic intent classifier that helps detect dangerous goals regardless of phrasing. When an action is classified as `DESTRUCTIVE_DATA`, `DATA_EXFILTRATION`, or `PRIVILEGE_ESCALATION`, the engine blocks it, routes it for human approval, or downgrades the agent's trust level, depending on the configured policy.

Important: All policy rules, detection patterns, and sensitivity thresholds are externalized to YAML configuration files. The toolkit ships with sample configurations in `examples/policies/` that must be reviewed and customized before production deployment. No built-in rule set should be considered exhaustive. Policy languages supported: YAML, OPA Rego, and Cedar.

The kernel is stateless by design, each request carries its own context. This means you can deploy it behind a load balancer, as a sidecar container in Kubernetes, or in a serverless function, with no shared state to manage. On AKS or any Kubernetes cluster, it fits naturally into existing deployment patterns. Helm charts are available for agent-os, agent-mesh, and agent-sre.

Agent Mesh: Zero-Trust Identity for Agents

In service mesh architectures, services prove their identity via mTLS certificates before communicating. AgentMesh applies the same principle to AI agents using decentralized identifiers (DIDs) with Ed25519 cryptography and the Inter-Agent Trust Protocol (IATP):

from agentmesh import AgentIdentity, TrustBridge

identity = AgentIdentity.create(
    name="data-analyst",
    sponsor="alice@company.com",          # Human accountability
    capabilities=["read:data", "write:reports"],
)
# identity.did -> "did:mesh:data-analyst:a7f3b2..."

bridge = TrustBridge()
verification = await bridge.verify_peer(
    peer_id="did:mesh:other-agent",
    required_trust_score=700,  # Must score >= 700/1000
)

A critical feature is trust decay: an agent's trust score decreases over time without positive signals. An agent trusted last week but silent since then gradually becomes untrusted, modeling the reality that trust requires ongoing demonstration, not a one-time grant.

Delegation chains enforce scope narrowing: a parent agent with read+write permissions can delegate only read access to a child agent, never escalate.

Agent Hypervisor: Execution Rings

CPU architectures use privilege rings (Ring 0 for kernel, Ring 3 for userspace) to isolate workloads. The Agent Hypervisor applies this model to AI agents:

Ring

Trust Level

Capabilities

Ring 0 (Kernel)

Score ≥ 900

Full system access, can modify policies

Ring 1 (Supervisor)

Score ≥ 700

Cross-agent coordination, elevated tool access

Ring 2 (User)

Score ≥ 400

Standard tool access within assigned scope

Ring 3 (Untrusted)

Score < 400

Read-only, sandboxed execution only

New and untrusted agents start in Ring 3 and earn their way up, exactly the principle of least privilege that production engineers apply to every other workload.

Each ring enforces per-agent resource limits: maximum execution time, memory caps, CPU throttling, and request rate limits. If a Ring 2 agent attempts a Ring 1 operation, it gets blocked, just like a userspace process trying to access kernel memory.

These ring definitions and their associated trust score thresholds are fully configurable via policy. Organizations can define custom ring structures, adjust the number of rings, set different trust score thresholds for transitions, and configure per-ring resource limits to match their security requirements.

The hypervisor also provides saga orchestration for multi-step operations. When an agent executes a sequence, draft email → send → update CRM, and the final step fails, compensating actions fire in reverse. Borrowed from distributed transaction patterns, this ensures multi-agent workflows maintain consistency even when individual steps fail.

Agent SRE: SLOs and Circuit Breakers for Agents

If you practice SRE, you measure services by SLOs and manage risk through error budgets. Agent SRE extends this to AI agents:

When an agent's safety SLI drops below 99 percent, meaning more than 1 percent of its actions violate policy, the system automatically restricts the agent's capabilities until it recovers. This is the same error-budget model that SRE teams use for production services, applied to agent behavior.

We also built nine chaos engineering fault injection templates: network delays, LLM provider failures, tool timeouts, trust score manipulation, memory corruption, and concurrent access races. Because the only way to know if your agent system is resilient is to break it intentionally.

Agent SRE integrates with your existing observability stack through adapters for Datadog, PagerDuty, Prometheus, OpenTelemetry, Langfuse, LangSmith, Arize, MLflow, and more. Message broker adapters support Kafka, Redis, NATS, Azure Service Bus, AWS SQS, and RabbitMQ.

Compliance and Observability

If your organization already maps to CIS Benchmarks, NIST AI RMF, or other frameworks for infrastructure compliance, the OWASP Agentic Top 10 is the equivalent standard for AI agent workloads. The toolkit's agent-compliance package provides automated governance grading against these frameworks.

The toolkit is framework-agnostic, with 20+ adapters that hook into each framework's native extension points, so adding governance to an existing agent is typically a few lines of configuration, not a rewrite.

The toolkit exports metrics to any OpenTelemetry-compatible platform, Prometheus, Grafana, Datadog, Arize, or Langfuse. If you're already running an observability stack for your infrastructure, agent governance metrics flow through the same pipeline.

Key metrics include: policy decisions per second, trust score distributions, ring transitions, SLO burn rates, circuit breaker state, and governance workflow latency.

Getting Started

# Install all packages
pip install agent-governance-toolkit[full]

# Or individual packages
pip install agent-os-kernel agent-mesh agent-sre

The toolkit is available across language ecosystems: Python, TypeScript (`@microsoft/agentmesh-sdk` on npm), Rust, Go, and .NET (`Microsoft.AgentGovernance` on NuGet).

Azure Integrations

While the toolkit is platform-agnostic, we've included integrations that help enable the fastest path to production, on Azure:

Azure Kubernetes Service (AKS): Deploy the policy engine as a sidecar container alongside your agents. Helm charts provide production-ready manifests for agent-os, agent-mesh, and agent-sre.

Azure AI Foundry Agent Service: Use the built-in middleware integration for agents deployed through Azure AI Foundry.

OpenClaw Sidecar: One compelling deployment scenario is running OpenClaw, the open-source autonomous agent, inside a container with the Agent Governance Toolkit deployed as a sidecar. This gives you policy enforcement, identity verification, and SLO monitoring over OpenClaw's autonomous operations. On Azure Kubernetes Service (AKS), the deployment is a standard pod with two containers: OpenClaw as the primary workload and the governance toolkit as the sidecar, communicating over localhost. We have a reference architecture and Helm chart available in the repository.

The same sidecar pattern works with any containerized agent, OpenClaw is a particularly compelling example because of the interest in autonomous agent safety.

Tutorials and Resources

34+ step-by-step tutorials covering policy engines, trust, compliance, MCP security, observability, and cross-platform SDK usage are available in the repository.

git clone https://github.com/microsoft/agent-governance-toolkit
cd agent-governance-toolkit
pip install -e "packages/agent-os[dev]" -e "packages/agent-mesh[dev]" -e "packages/agent-sre[dev]"

# Run the demo
python -m agent_os.demo

What's Next

AI agents are becoming autonomous decision-makers in production infrastructure, executing code, managing databases, and orchestrating services. The security patterns that kept production systems safe for decades, least privilege, mandatory access controls, process isolation, audit logging, are exactly what these new workloads need. We built them. They're open source.

We're building this in the open because agent security is too important for any single organization to solve alone:

  • Security research: Adversarial testing, red-team results, and vulnerability reports strengthen the toolkit for everyone.
  • Community contributions: Framework adapters, detection rules, and compliance mappings from the community expand coverage across ecosystems.

We are committed to open governance. We're releasing this project under Microsoft today, and we aspire to move it into a foundation home, such as the AI and Data Foundation (AAIF), where it can benefit from cross-industry stewardship. We're actively engaging with foundation partners on this path.

The Agent Governance Toolkit is open source under the MIT license. Contributions welcome at github.com/microsoft/agent-governance-toolkit.

Updated Apr 09, 2026
Version 1.0
No CommentsBe the first to comment