Linux and Open Source Blog

Shift-Left Governance for AI Agents: How the Agent Governance Toolkit Helps You Catch Violations

mosiddi (Microsoft)
May 01, 2026

In part one of this series, we covered AGT’s runtime governance: the policy engine, zero-trust identity, execution sandboxing, and the OWASP Agentic AI risk mapping.

That post focused on what happens when an agent acts: policy evaluation at the moment a tool call fires, trust scoring when agents communicate, audit logging when decisions are made. Runtime governance is essential. But it is the last line of defense.

After that post went live, a pattern emerged in conversations with teams adopting AGT. The same question kept coming up: runtime checks are useful, but what about everything before production? We realized runtime governance was only half the story. So we went back and built tooling for every stage of your software development lifecycle, from the moment a developer saves a file to the moment an artifact ships to users.

Why Runtime Governance Is Not Enough

AI agents are a new class of workload. They reason about what to do, select tools, call APIs, read databases, and spawn sub-processes, often in loops that run without direct human oversight. The OWASP Agentic AI Top 10 (published December 2025) identifies risks like excessive agency, insecure tool use, privilege escalation, and supply chain compromise. These risks span the entire lifecycle, not just runtime.

Consider a few scenarios that runtime governance alone cannot prevent:

  • A developer commits a policy YAML file with a typo that silently disables all deny rules. The agent runs unprotected until someone notices.
  • A dependency update introduces a package with a known critical CVE. The agent starts using a vulnerable library before any security team reviews it.
  • A contributor adds a raw cryptographic import to an application module, bypassing the security-audited signing library. The code compiles and ships.
  • A GitHub Actions workflow uses an expression injection pattern that allows an attacker to execute arbitrary code in CI.
  • A release ships without a Software Bill of Materials (SBOM), making it impossible to trace which components are affected when the next Log4j-style vulnerability drops.

Each of these is a governance failure, but none of them happens at runtime. They happen at commit time, at PR review time, at build time, or at release time. A comprehensive governance strategy needs coverage at every stage.

Four Stages of Pre-Runtime Governance

Governance violations can enter a codebase at four distinct stages of the development lifecycle. Each stage has a different class of risk, and each needs a different kind of check:

| Stage | When It Runs | What It Catches | AGT Tooling |
| --- | --- | --- | --- |
| Commit-time | Before code leaves the developer machine | Malformed policies, schema violations, secrets, stub code, unauthorized crypto | Pre-commit hooks, quality gates |
| PR-time | When a pull request is opened or updated | Vulnerable dependencies, missing attestation, secrets in history, unpinned versions | GitHub Actions (attestation, dependency review, secret scanning, supply chain checks) |
| CI/Build-time | On every push and pull request to main | Compliance violations, binary security issues, dependency confusion, workflow injection | Governance Verify action, Security Scan action, CodeQL, BinSkim, policy validation |
| Release-time | Before artifacts are published | Missing provenance, unsigned artifacts, incomplete SBOMs | SBOM generation, Sigstore signing, build attestation, OpenSSF Scorecard |

Just as with bugs, the earlier you catch a governance violation, the cheaper it is to fix. A malformed policy file caught at commit time costs zero CI minutes. A secret caught in PR review never reaches the default branch. A dependency confusion attack blocked in CI never reaches production. An unsigned artifact blocked at release time never reaches users.

Stage 1: Commit-Time Governance with Pre-Commit Hooks

The fastest governance feedback loop is local. Within the AGT project, we’ve implemented three pre-commit hooks that run automatically on every git commit against the staged files, validating governance artifacts before they ever leave the developer's machine.

Built-In Hooks

The toolkit's .pre-commit-hooks.yaml defines three hooks that any repository can adopt:

| Hook ID | What It Validates | File Pattern |
| --- | --- | --- |
| validate-policy | YAML/JSON policy files against the AGT policy schema, checking for required fields, valid operators, and structural correctness | Files matching *polic*.yaml, *polic*.yml, *polic*.json |
| validate-plugin-manifest | Plugin manifest files for required fields and schema compliance | Files matching plugin.json, plugin.yaml, plugin.yml |
| evaluate-plugin-policy | Plugin manifests against a governance policy file, evaluating whether the plugin would be allowed under the organization's rules | Files matching plugin.json, plugin.yaml, plugin.yml |

To adopt these hooks, add AGT as a pre-commit hook source:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/microsoft/agent-governance-toolkit
    rev: main  # pin to a release tag in production
    hooks:
      - id: validate-policy
      - id: validate-plugin-manifest
      - id: evaluate-plugin-policy
        args: ['--policy', 'policies/marketplace-policy.yaml']

Then install and run:

pip install pre-commit
pre-commit install
pre-commit run --all-files
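
To give a feel for what a schema check like validate-policy does, here is a minimal sketch in Python. It is not AGT's implementation: the field names (name, rules, action, condition, operator), the operator set, and the function validate_policy are all hypothetical stand-ins for whatever the real policy schema defines. The point is that a misspelled value is reported as an error rather than silently ignored.

```python
# Hypothetical policy shape: the real AGT schema may use different names.
ALLOWED_ACTIONS = {"allow", "deny"}
ALLOWED_OPERATORS = {"eq", "ne", "in"}


def validate_policy(policy: dict) -> list[str]:
    """Return human-readable errors; an empty list means the policy is valid."""
    errors = []
    if not isinstance(policy.get("name"), str) or not policy["name"]:
        errors.append("policy: missing required field 'name'")
    rules = policy.get("rules")
    if not isinstance(rules, list) or not rules:
        errors.append("policy: 'rules' must be a non-empty list")
        return errors
    for i, rule in enumerate(rules):
        if not isinstance(rule, dict):
            errors.append(f"rules[{i}]: must be a mapping")
            continue
        # A typo such as "dney" is rejected here instead of disabling the rule.
        if rule.get("action") not in ALLOWED_ACTIONS:
            errors.append(f"rules[{i}]: unknown action {rule.get('action')!r}")
        condition = rule.get("condition")
        if not isinstance(condition, dict):
            errors.append(f"rules[{i}]: missing 'condition' mapping")
        elif condition.get("operator") not in ALLOWED_OPERATORS:
            errors.append(f"rules[{i}]: unknown operator {condition.get('operator')!r}")
    return errors
```

A hook built this way would print the returned errors and exit non-zero when the list is not empty, which is what makes pre-commit block the commit.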

Extended Quality Gates

Beyond schema validation, we built a pre-commit rollout template (see the full example in the repository) with additional governance-specific quality gates designed to help prevent common security anti-patterns from entering the codebase:

  • Policy validation (agt-validate): Runs the full AGT policy CLI in strict mode, catching not just schema errors but semantic issues like conflicting rules.
  • Health check (agt-doctor): Runs on pre-push (before code leaves the machine entirely), performing a broader health check of the governance configuration.
  • Plugin metadata check (agency-json-required): Ensures every plugin directory contains the required agency.json metadata file.
  • Stub detection (no-stubs): Blocks TODO, FIXME, HACK, and raise NotImplementedError markers in staged production code. Test files are excluded.
  • Unauthorized crypto detection (no-custom-crypto): Blocks raw cryptographic imports (hashlib, hmac, crypto.subtle, System.Security.Cryptography, ring, ed25519-dalek) outside designated security modules. This helps ensure all cryptographic operations go through the audited AGT signing libraries.
  • Secret scanning (detect-secrets): Integrates Yelp's detect-secrets for pattern-based secret detection on every commit.
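
The stub and crypto gates above are essentially pattern scans over staged files. The sketch below shows one way such a scan could look for Python sources. The marker list and module names come from the descriptions above; the src/security/ prefix, the test-file heuristic, and the function name check_file are assumptions made for illustration, not the toolkit's actual rules.

```python
import re

# Stub markers named in the no-stubs gate description.
STUB_PATTERN = re.compile(r"\b(TODO|FIXME|HACK)\b|raise\s+NotImplementedError")
# Only the Python modules from the no-custom-crypto list are covered here.
CRYPTO_IMPORT = re.compile(r"^\s*(?:import|from)\s+(hashlib|hmac)\b", re.MULTILINE)

# Hypothetical location of the designated security modules.
ALLOWED_CRYPTO_PREFIXES = ("src/security/",)


def check_file(path: str, source: str) -> list[str]:
    """Return findings for one staged file, given its repo-relative path."""
    findings = []
    filename = path.rsplit("/", 1)[-1]
    is_test = "/tests/" in f"/{path}" or filename.startswith("test_")
    if not is_test:  # test files are excluded from stub detection
        for lineno, line in enumerate(source.splitlines(), start=1):
            if STUB_PATTERN.search(line):
                findings.append(f"{path}:{lineno}: stub marker")
    if not path.startswith(ALLOWED_CRYPTO_PREFIXES):
        for match in CRYPTO_IMPORT.finditer(source):
            findings.append(f"{path}: unauthorized crypto import '{match.group(1)}'")
    return findings
```

A regex scan like this is deliberately blunt: it will also flag the word TODO inside a string literal. For a commit-time gate that trade-off is usually acceptable, because the check is fast and false positives are cheap to fix locally.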

Phased Rollout for Teams

Adopting pre-commit hooks across a team requires a thoughtful rollout. The AGT documentation includes a phased adoption guide:

  1. Week 1: Install hooks in permissive mode. Hooks warn on violations but do not block the commit. This lets developers see what would be caught without disrupting workflow.
  2. Week 2: Switch to strict mode for policy validation only. Policy files must pass schema validation to be committed.
  3. Week 3: Enable all hooks as blocking. Stubs, unauthorized crypto, and secrets are now blocked at commit time.
  4. Week 4: Graduate to full blocking mode and remove the permissive fallback.

This approach helps teams build confidence in the governance tooling before it becomes a hard gate.
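
The difference between permissive and strict mode comes down to the exit code a hook returns, because pre-commit blocks a commit only when a hook exits non-zero. A minimal sketch, assuming a hypothetical hook_exit_code helper and mode names that mirror the rollout phases above:

```python
def hook_exit_code(violations: list[str], mode: str) -> int:
    """Print each violation and return the exit code for the given mode."""
    if mode not in ("permissive", "strict"):
        raise ValueError(f"unknown mode: {mode}")
    label = "WARNING" if mode == "permissive" else "ERROR"
    for violation in violations:
        print(f"{label}: {violation}")
    # Permissive mode always succeeds, so the commit goes through with warnings.
    return 1 if violations and mode == "strict" else 0
```

Switching a team from week 1 to week 2 is then a configuration change rather than a code change: the same checks run, and only the reported exit code differs.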

Stage 2: PR-Time Gates

Pre-commit hooks catch issues on the developer's machine, but they can be bypassed (git commit --no-verify, force pushes, direct edits in the GitHub web UI, or hooks that were never installed). PR-time gates provide the second layer of defense, running in GitHub Actions on every pull request before merge is allowed.

Governance Attestation

The Governance Attestation action validates that PR authors have completed a structured attestation checklist before their code can merge. The default checklist covers seven sections:

  1. Security review
  2. Privacy review
  3. Legal review
  4. Responsible AI review
  5. Accessibility review
  6. Release Readiness / Safe Deployment
  7. Org-specific Launch Gates

The action is fully configurable. Organizations can customize the required sections, set a minimum PR body length, and choose their own attestation format. Outputs include the validation status, a list of errors for missing sections, and a JSON mapping of sections to checkbox counts.

Here is an example workflow:

# .github/workflows/pr-governance.yml
name: PR Governance
on:
  pull_request:
    types: [opened, edited, synchronize]

jobs:
  attestation:
    runs-on: ubuntu-latest
    steps:
      - uses: microsoft/agent-governance-toolkit/action/governance-attestation@main  # pin to a release tag or commit SHA in production
        with:
          required-sections: |
            1) Security review
            2) Privacy review
            3) Responsible AI review
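
The kind of validation the attestation action performs can be approximated in a few lines of Python. This sketch assumes a particular attestation format that the post does not specify: each required section is a Markdown heading in the PR body, followed by task-list checkboxes, and only checked boxes are counted. The function name validate_attestation and the shape of the returned dictionary are likewise illustrative.

```python
import re

# A checked Markdown task-list item, e.g. "- [x] Threat model updated".
CHECKED_BOX = re.compile(r"^\s*[-*]\s*\[[xX]\]", re.MULTILINE)


def validate_attestation(body: str, required_sections: list[str], min_length: int = 0) -> dict:
    """Check a PR body for required sections and count checked boxes in each."""
    errors = []
    if len(body) < min_length:
        errors.append(f"PR body shorter than {min_length} characters")
    counts = {}
    for section in required_sections:
        # Capture everything between this heading and the next heading (or the end).
        pattern = re.compile(
            r"^#+\s*" + re.escape(section) + r"[ \t]*$(.*?)(?=^#+\s|\Z)",
            re.MULTILINE | re.DOTALL,
        )
        match = pattern.search(body)
        if match is None:
            errors.append(f"missing section: {section}")
            continue
        counts[section] = len(CHECKED_BOX.findall(match.group(1)))
    return {"valid": not errors, "errors": errors, "sections": counts}
```

The returned dictionary mirrors the three outputs described above: a status, a list of errors for missing sections, and a mapping of sections to checkbox counts.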

Dependency Review

The dependency review workflow helps block PRs that introduce dependencies with known CVEs or disallowed licenses. It uses the GitHub dependency-review-action with a curated license allowlist:

- uses: actions/dependency-review-action@v4
  with:
    fail-on-severity: moderate
    comment-summary-in-pr: always
    allow-licenses: >
      MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC,
      PSF-2.0, Python-2.0, 0BSD, Unlicense, CC0-1.0,
      CC-BY-4.0, Zlib, BSL-1.0, MPL-2.0

This runs on every PR that touches dependency manifests (package.json, Cargo.toml, pyproject.toml, requirements.txt). With fail-on-severity set to moderate, a dependency with a known vulnerability of moderate or higher severity fails the check, as does a dependency whose license is not on the allowlist.
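
The decision rule behind that configuration is simple enough to sketch. The code below is not how dependency-review-action is implemented; it only illustrates the threshold and allowlist logic, using a hypothetical dependency record with name, license, and vulnerabilities fields and a shortened copy of the allowlist.

```python
SEVERITY_ORDER = ["low", "moderate", "high", "critical"]
# Shortened copy of the allowlist from the workflow snippet above.
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC", "MPL-2.0"}


def review_dependency(dep: dict, fail_on: str = "moderate") -> list[str]:
    """Return the reasons a dependency would be blocked; empty means it passes."""
    threshold = SEVERITY_ORDER.index(fail_on)
    problems = []
    for severity in dep.get("vulnerabilities", []):
        if SEVERITY_ORDER.index(severity) >= threshold:
            problems.append(f"{dep['name']}: {severity} vulnerability")
    if dep.get("license") not in ALLOWED_LICENSES:
        problems.append(f"{dep['name']}: license {dep.get('license')!r} not on allowlist")
    return problems
```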

Secret Scanning

The secret scanning workflow runs on every PR to the main branch and on a weekly schedule. It combines two complementary approaches:

  • Gitleaks: Pattern-based secret detection across the full git history, catching API keys, tokens, and credentials that may have been committed at any point.
  • High-entropy string scanning: Regex-based detection of common secret patterns including GitHub tokens (ghp_, gho_), AWS access keys (AKIA), Slack tokens (xox), and base64-encoded strings with high entropy.
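
The second approach combines known token prefixes with Shannon entropy, which measures how evenly the characters in a string are distributed. The sketch below illustrates the idea. The exact regular expressions and the 4.5 bits-per-character threshold are illustrative assumptions, not the values used in the AGT workflow.

```python
import math
import re
from collections import Counter

# Illustrative patterns for the token families named above.
TOKEN_PATTERNS = {
    "github-token": re.compile(r"\bgh[po]_[A-Za-z0-9]{36}\b"),
    "aws-access-key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "slack-token": re.compile(r"\bxox[a-z]-[A-Za-z0-9-]{10,}"),
}
BASE64_CANDIDATE = re.compile(r"[A-Za-z0-9+/=]{32,}")


def shannon_entropy(text: str) -> float:
    """Entropy in bits per character; higher means more random-looking."""
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def scan_line(line: str, entropy_threshold: float = 4.5) -> list[str]:
    """Return the names of the secret patterns that match a single line."""
    hits = [name for name, pattern in TOKEN_PATTERNS.items() if pattern.search(line)]
    for candidate in BASE64_CANDIDATE.findall(line):
        if shannon_entropy(candidate) >= entropy_threshold:
            hits.append("high-entropy-string")
            break
    return hits
```

The entropy check is what catches secrets without a recognizable prefix. A long run of a repeated character scores 0 bits and is ignored, while a random base64 string approaches 6 bits per character.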

Supply Chain Integrity

A dedicated supply chain check workflow triggers when dependency manifest files change. It enforces two rules that help prevent supply chain attacks:

  • Exact version pinning: No ^ or ~ version ranges in package.json files. This prevents unexpected minor/patch version updates that could introduce compromised code.
  • Lockfile presence: Every package directory with dependencies must have a corresponding lockfile (package-lock.json, pnpm-lock.yaml, or yarn.lock). Lockfiles help ensure reproducible builds with verified integrity hashes.
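
Both rules are easy to check mechanically. Here is a minimal sketch, assuming only the dependencies and devDependencies sections of package.json are inspected; the function names are hypothetical and the real workflow may cover more cases.

```python
import json
from pathlib import Path

LOCKFILES = ("package-lock.json", "pnpm-lock.yaml", "yarn.lock")
DEP_SECTIONS = ("dependencies", "devDependencies")


def unpinned_dependencies(manifest: dict) -> list[str]:
    """List dependencies whose version spec starts with a ^ or ~ range operator."""
    found = []
    for section in DEP_SECTIONS:
        for name, spec in manifest.get(section, {}).items():
            if spec.startswith(("^", "~")):
                found.append(f"{section}.{name}: {spec}")
    return found


def check_package_dir(package_dir: Path) -> list[str]:
    """Apply both rules to one directory that contains a package.json."""
    manifest = json.loads((package_dir / "package.json").read_text(encoding="utf-8"))
    problems = unpinned_dependencies(manifest)
    has_deps = any(manifest.get(section) for section in DEP_SECTIONS)
    if has_deps and not any((package_dir / name).exists() for name in LOCKFILES):
        problems.append("no lockfile found")
    return problems
```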

Quality Gates

The quality gates workflow mirrors the pre-commit hooks at the PR level, providing defense in depth. It runs four checks on every pull request:

| Gate | Purpose |
| --- | --- |
| No Stubs/TODOs | Blocks TODO, FIXME, HACK markers in production code (test files excluded) |
| No Unauthorized Crypto | Blocks raw cryptographic imports outside designated security modules |
| Security Audit Required | Changes to security-sensitive paths require accompanying audit documentation |
| Dependency Audit Trail | Vendored patches must have an audit trail explaining the patch and its provenance |

These gates catch anything that bypasses pre-commit hooks: force-pushed commits, direct GitHub web edits, commits from contributors who have not installed the hooks.

Stage 3: CI/Build-Time Governance

Once a PR passes the gate workflows, the main CI pipeline and specialized workflows perform deeper, more computationally intensive analysis.

The Governance Verify Action

The Governance Verify action is the primary CI-time governance check. It is a GitHub Actions composite action that installs the toolkit and runs the compliance CLI against your repository. It supports four modes:

| Command | What It Does |
| --- | --- |
| governance-verify | Runs the full compliance verification suite, checking governance controls and reporting how many pass |
| marketplace-verify | Validates a plugin manifest against marketplace requirements (required fields, signing, metadata) |
| policy-evaluate | Evaluates a specific policy file against a JSON context, returning the allow/deny decision with the matched rule |
| all | Runs governance-verify, then marketplace-verify and policy-evaluate if the corresponding paths are provided |

Here is an example:

# .github/workflows/governance-ci.yml
name: Governance CI
on: [push, pull_request]

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: microsoft/agent-governance-toolkit/action@main  # pin to a release tag or commit SHA in production
        with:
          command: all
          policy-path: policies/
          manifest-path: plugin.json
          output-format: json
          fail-on-warning: 'true'

The action outputs structured data including controls-passed, controls-total, violations count, and full command output in JSON format. This makes it straightforward to integrate with dashboards, Slack notifications, or downstream decision logic.
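
To make the policy-evaluate mode more concrete, here is a small first-match rule evaluator. It is a sketch of the general technique only. The rule fields (name, action, condition with field, operator, and value), the default_action key, and the operator names are hypothetical and should not be read as the AGT policy format.

```python
import operator

# Hypothetical operator names mapped to plain Python comparisons.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "in": lambda value, allowed: value in allowed,
}


def evaluate(policy: dict, context: dict) -> dict:
    """Return the decision of the first rule whose condition matches the context."""
    for rule in policy["rules"]:
        condition = rule["condition"]
        value = context.get(condition["field"])
        if OPERATORS[condition["operator"]](value, condition["value"]):
            return {"decision": rule["action"], "matched_rule": rule["name"]}
    # No rule matched: fall back to the default, denying if none is configured.
    return {"decision": policy.get("default_action", "deny"), "matched_rule": None}
```

Returning the matched rule alongside the decision is what makes the result auditable: a reviewer can see not only that a tool call was denied, but which rule denied it.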

The Security Scan Action

A separate security scan action scans directories for secrets, CVEs, and dangerous code patterns. Unlike the PR-time secret scanning (which focuses on git history), this action performs deep content analysis of the current codebase:

- uses: microsoft/agent-governance-toolkit/action/security-scan@main  # pin to a release tag or commit SHA in production
  with:
    paths: 'plugins/ scripts/'
    min-severity: high
    exemptions-file: .security-exemptions.json

The action supports configurable severity thresholds (critical, high, medium, low), an exemptions file for acknowledged findings, and structured JSON output with findings-count, blocking-count, and detailed findings.
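
The interplay between the severity threshold and the exemptions file can be sketched as follows. The output keys findings-count and blocking-count are taken from the description above. The finding records, the idea that exemptions are matched by a finding id, and the choice to leave exempted findings out of findings-count are assumptions for illustration.

```python
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def summarize_findings(findings: list[dict], min_severity: str = "high",
                       exempt_ids: frozenset = frozenset()) -> dict:
    """Drop exempted findings, then count those at or above the threshold."""
    threshold = SEVERITY_RANK[min_severity]
    active = [f for f in findings if f["id"] not in exempt_ids]
    blocking = [f for f in active if SEVERITY_RANK[f["severity"]] >= threshold]
    return {
        "findings-count": len(active),
        "blocking-count": len(blocking),
        "findings": active,
    }
```

A CI step would then fail the job when blocking-count is greater than zero, while lower-severity findings remain visible in the report without stopping the build.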

Policy Validation Workflow

A dedicated policy validation workflow triggers whenever YAML files or the policy engine source code changes. It performs two jobs in sequence:

  1. Validate policies: Discovers all policy files matching the *policy* naming convention, then validates each file using the AGT policy CLI.
  2. Test policies: Runs the policy CLI unit tests to verify that policy evaluation behavior is correct after the changes.

This ensures that policy file edits do not break the policy engine and that policy semantics are preserved.

CodeQL and Static Analysis

AGT uses GitHub's CodeQL for semantic static analysis of Python and TypeScript code. The CodeQL workflow runs on pushes and PRs, performing deep dataflow analysis that goes beyond pattern matching. Results are uploaded as SARIF to GitHub's Security tab, providing a centralized view of code quality issues.

Dependency Confusion Scanning

A dedicated CI job runs a dependency confusion scanner on every build. This is a targeted defense against a specific supply chain attack vector where an attacker registers a public package with the same name as an internal package. The scanner checks that:

  • Internal package names do not collide with public PyPI or npm packages
  • Notebook pip install commands only reference packages that are registered and expected
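
Both checks reduce to comparing normalized package names against a known set. The sketch below uses the PEP 503 normalization rule for PyPI names and takes the set of public names as an argument rather than querying a registry, so it runs offline. The function names and the simple notebook parsing are assumptions; a real scanner would need to handle npm naming rules and pip options that take a value (such as -r or --index-url).

```python
import re

PIP_INSTALL = re.compile(r"^\s*[!%]pip\s+install\s+(.+)$")


def normalize(name: str) -> str:
    """PEP 503 normalization: runs of '-', '_' and '.' compare as one '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_collisions(internal: set[str], public: set[str]) -> set[str]:
    """Internal package names that also exist (after normalization) publicly."""
    public_normalized = {normalize(name) for name in public}
    return {name for name in internal if normalize(name) in public_normalized}


def unexpected_notebook_installs(cell_lines: list[str], expected: set[str]) -> list[str]:
    """Packages installed from notebook cells that are not on the expected list."""
    expected_normalized = {normalize(name) for name in expected}
    unexpected = []
    for line in cell_lines:
        match = PIP_INSTALL.match(line)
        if not match:
            continue
        for token in match.group(1).split():
            if token.startswith("-"):
                continue  # skip simple flags such as -q or --upgrade
            # Strip version specifiers, extras, and environment markers.
            name = re.split(r"[=<>!~\[;]", token, maxsplit=1)[0]
            if normalize(name) not in expected_normalized:
                unexpected.append(name)
    return unexpected
```

Normalizing before comparing matters because an attacker can register acme.utils to shadow an internal package called Acme_Utils: package indexes treat the two names as identical.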

Workflow Security Auditing

When GitHub Actions workflow files change, a workflow security job scans for common CI/CD security issues:

  • Expression injection: Detects patterns like ${{ github.event.pull_request.title }} used directly in run: blocks, which can allow arbitrary code execution.
  • Overly permissive permissions: Flags workflows that request more permissions than necessary.
  • Unpinned action references: Detects actions referenced by branch name instead of commit SHA, which is a supply chain risk.
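
Two of these checks, expression injection and unpinned actions, can be approximated with a line-based scan of the workflow text. The sketch below is a heuristic written for illustration, not the AGT scanner: it tracks run: blocks by indentation instead of parsing YAML, and it treats only github.event.* and github.head_ref expressions as untrusted.

```python
import re

UNTRUSTED_EXPR = re.compile(r"\$\{\{\s*github\.(?:event\.|head_ref)[^}]*\}\}")
RUN_LINE = re.compile(r"^(\s*)(-\s*)?run:(.*)$")
USES_LINE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)")
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def audit_workflow(text: str) -> list[str]:
    """Flag untrusted expressions inside run: blocks and unpinned action refs."""
    findings = []
    run_col = None  # column of the active run: key, or None outside a run block
    for lineno, line in enumerate(text.splitlines(), start=1):
        indent = len(line) - len(line.lstrip())
        if run_col is not None and line.strip() and indent <= run_col:
            run_col = None  # a dedent back to the key's column ends the block
        script = None
        run_match = RUN_LINE.match(line)
        if run_match:
            run_col = len(run_match.group(1)) + len(run_match.group(2) or "")
            script = run_match.group(3)
        elif run_col is not None:
            script = line
        if script and UNTRUSTED_EXPR.search(script):
            findings.append(f"line {lineno}: expression injection risk in run block")
        uses_match = USES_LINE.match(line)
        if uses_match and "@" in uses_match.group(1):
            ref = uses_match.group(1).rsplit("@", 1)[1]
            if not FULL_SHA.match(ref):
                findings.append(f"line {lineno}: action not pinned to a commit SHA ({ref})")
    return findings
```

The same expression is not flagged when it appears under env:, because passing untrusted input through an environment variable and quoting it in the script is the usual way to avoid the injection.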

.NET Binary Analysis with BinSkim

For the .NET SDK (Microsoft.AgentGovernance), the CI pipeline runs Microsoft BinSkim binary security analysis on compiled assemblies. BinSkim checks for security-relevant compiler and linker settings in compiled binaries, such as DEP (Data Execution Prevention), ASLR (Address Space Layout Randomization), and stack protection. Results are uploaded as SARIF to GitHub code scanning alongside the CodeQL results.

The ci-complete Gate Pattern

With many CI jobs that run conditionally based on path filters, AGT uses a pattern called ci-complete: a single gate job that is configured as the sole required status check in branch protection. This job runs unconditionally (if: always()), depends on all other CI jobs, and checks that none of them failed. Jobs that were skipped because no relevant files changed are acceptable. This pattern lets branch protection work correctly with conditional CI jobs, avoiding the common problem where a required check that was skipped never reports a passing result and the pull request cannot merge.
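
The decision the gate job makes can be written down in a few lines. GitHub Actions exposes the outcome of upstream jobs through the needs context, where each job's result is one of success, failure, cancelled, or skipped. The sketch below assumes that context has been serialized to JSON (for example with toJSON(needs)) and parsed into a dictionary; the function name failed_jobs is hypothetical.

```python
def failed_jobs(needs: dict) -> list[str]:
    """Return the ids of upstream jobs that neither succeeded nor were skipped.

    `needs` maps a job id to a dict that contains at least a "result" key.
    """
    acceptable = ("success", "skipped")
    return sorted(job for job, info in needs.items() if info["result"] not in acceptable)
```

The gate job would exit non-zero when this list is not empty. Treating cancelled as a failure is a deliberate choice here: a cancelled job has not proven anything, so it should not let a merge through.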

Language-Specific Compile-Time Enforcement

Beyond the language-agnostic CI checks, each AGT SDK uses its language's native compiler and tooling to enforce governance standards at compile time.

.NET: The Strictest Compile-Time Checks

The .NET SDK (Microsoft.AgentGovernance) enforces compile-time governance through MSBuild properties in Directory.Build.props and Directory.Build.targets, which apply automatically to every project in the SDK:

| Feature | MSBuild Property | Effect |
| --- | --- | --- |
| Nullable reference types | <Nullable>enable</Nullable> | The compiler warns on every possible null dereference, helping prevent NullReferenceException at compile time |
| Warnings as errors | <TreatWarningsAsErrors>true</TreatWarningsAsErrors> | All compiler warnings become build errors for packable projects; no warnings can be shipped to consumers |
| Strong-name signing | <SignAssembly>true</SignAssembly> | Assemblies are signed with a strong-name key (AgentGovernance.snk), enabling identity verification |
| Deterministic builds | <ContinuousIntegrationBuild>true</ContinuousIntegrationBuild> | Identical source code produces bit-for-bit identical binaries in CI, enabling build verification |
| SourceLink | Microsoft.SourceLink.GitHub package | Users can step into AGT source code when debugging, supporting transparency and auditability |
| Symbol packages | <IncludeSymbols>true</IncludeSymbols> | .snupkg symbol packages are published alongside NuGet packages for debugging support |

TypeScript: Strict Compilation and Linting

The TypeScript SDK (@microsoft/agentmesh-sdk) uses strict compiler settings and ESLint for build-time governance:

  • Strict mode ("strict": true in tsconfig.json) enables all strict type-checking options, including noImplicitAny, strictNullChecks, strictFunctionTypes, and strictBindCallApply.
  • Consistent file naming (forceConsistentCasingInFileNames) prevents cross-platform issues where imports work on case-insensitive file systems (Windows, macOS) but fail on case-sensitive ones (Linux CI).
  • Declaration generation (declaration: true with declarationMap: true) produces .d.ts files for consumers, enabling downstream type checking.
  • ESLint with @typescript-eslint provides static analysis during the build process, catching issues beyond what the TypeScript compiler checks.

Python: Type Safety and Fast Linting

Python packages in AGT use typed package markers and static analysis tooling configured in pyproject.toml:

  • py.typed marker: Each package includes a py.typed file, signalling to type checkers (mypy, pyright, Pylance) that the package supports type checking. Consumers get type errors if they misuse the AGT API.
  • mypy: Configured as a dev dependency with project-specific settings in pyproject.toml. Provides static type checking that catches type mismatches before runtime.
  • ruff: A fast Python linter written in Rust, configured in pyproject.toml and enforced in CI. Ruff checks for hundreds of code quality rules at build time.

Stage 4: Release-Time Gates

Before artifacts reach users, the release pipeline adds a final layer of verification. These gates help ensure that what ships is exactly what was built, is signed by the expected publisher, and has a complete inventory of its components.

| Gate | Tool | What It Produces |
| --- | --- | --- |
| SBOM generation | Anchore/Syft | SPDX and CycloneDX software bills of materials listing every component, dependency, and licence |
| Python signing | Sigstore | Cryptographic signature using OpenID Connect identity, verifiable without manual key distribution |
| .NET signing | Release pipeline | Microsoft Authenticode and NuGet signing through the release pipeline |
| Build provenance | actions/attest-build-provenance | SLSA provenance attestation linking the artifact to its source commit and build environment |
| SBOM attestation | actions/attest-sbom | Binds the SBOM to the specific release artifact, creating a verifiable link between the inventory and the binary |

Additionally, the OpenSSF Scorecard runs on a schedule, providing an automated security posture assessment that covers branch protection, dependency management, CI/CD practices, and more. The score is published to the OpenSSF Scorecard website, giving consumers a transparent view of the project's security practices.
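
Returning to the provenance and SBOM attestations in the table above: what "binds" an attestation to an artifact is a digest. Attestations in the in-toto statement format list their subjects with a sha256 digest, so a consumer can check that the file they downloaded is the one the statement describes. The sketch below shows only that digest comparison. It does not verify the signature on the attestation, which is the part that tools such as Sigstore handle, and the function name is hypothetical.

```python
import hashlib


def artifact_matches_statement(artifact: bytes, statement: dict) -> bool:
    """True if the artifact's SHA-256 digest appears among the statement's subjects."""
    digest = hashlib.sha256(artifact).hexdigest()
    return any(
        subject.get("digest", {}).get("sha256") == digest
        for subject in statement.get("subject", [])
    )
```

A digest match on its own proves nothing if the statement could have been forged, so in practice this check only makes sense after the attestation's signature has been verified.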

How It All Fits Together: Defense in Depth

This approach follows a defense-in-depth principle: critical checks such as secret scanning and policy validation run at multiple layers, so that bypassing one layer does not compromise the whole system.

Secret scanning, for example, runs at three levels: detect-secrets at commit time (pre-commit hook), Gitleaks at PR time (secret scanning workflow), and the Security Scan action at CI time (content analysis). A developer who bypasses pre-commit hooks will still be caught by the PR-time gate. A contributor who force-pushes past the PR gate will still be caught by the CI pipeline.

Similarly, policy validation runs at commit time (validate-policy hook), at PR time (quality gates), and at CI time (policy validation workflow). Each layer adds depth: the commit-time hook catches schema errors, while the CI pipeline catches semantic issues and runs regression tests.

The ci-complete gate job ties everything together. By depending on every CI job and serving as the single required status check, it ensures that no code merges to the main branch unless every applicable check has passed.

Getting Started

You can adopt AGT's shift-left governance incrementally. Here are three starting points, from lowest to highest effort:

1. Add the Governance Verify Action (5 minutes)

Add a single GitHub Actions workflow that runs the compliance check on every PR:

# .github/workflows/governance.yml
name: Governance
on: [pull_request]
jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: microsoft/agent-governance-toolkit/action@main  # pin to a release tag or commit SHA in production
        with:
          command: governance-verify

2. Enable Pre-Commit Hooks (15 minutes)

Add a .pre-commit-config.yaml referencing AGT's hooks, install them, and run against all existing files to establish a baseline. Start in permissive mode and graduate to strict over four weeks.

3. Full Pipeline Integration (1-2 hours)

Add the complete set of PR-time gates (attestation, dependency review, secret scanning, supply chain checks, quality gates), configure the Security Scan action for your plugin directories, and enable SBOM generation and signing in your release workflow. The AGT repository itself serves as a reference implementation: every workflow described in this post is running in production at aka.ms/agent-governance-toolkit.

Important Notes

The policy files, workflow configurations, and code samples in this post are illustrative examples. Your organization's governance requirements may differ. Review and customize all configurations before deploying to production. The Agent Governance Toolkit is designed to help organizations implement governance controls for AI agents; it does not guarantee compliance with any specific regulatory framework. Always consult your organization's security and legal teams when defining governance policies.

What Comes Next

Pre-runtime governance is one piece of the puzzle. Combined with the runtime governance capabilities covered in part one of this series (policy engines, zero-trust identity, execution sandboxing, audit logging), it provides coverage across the full lifecycle.

The project continues to grow. Since the initial release, we’ve added a multi-stage policy pipeline (pre_input, pre_tool, post_tool, pre_output stages), approval workflows with human-in-the-loop gates, DLP attribute ratchets for monotonic session state, and OpenTelemetry instrumentation for governance operations. Over 45 step-by-step tutorials are available in the documentation.

Everything described in this post is available today in the public GitHub repository. The full source, documentation, tutorials, and examples are at aka.ms/agent-governance-toolkit, open source under the MIT license. We welcome contributions, feedback, and issue reports from the community.

Published May 01, 2026
Version 1.0