Shift left on infrastructure security by embedding a custom Copilot agent directly in your Terraform workflow
GitHub Copilot is already a powerful coding assistant, but out of the box it knows nothing specific about your project's conventions, security requirements, or operational processes. Custom agents change that. They let you define specialized AI assistants that live inside your repository, carry deep domain expertise, and behave consistently for every developer on your team.
This blog explains what VS Code custom agents are, what they can do, and how to build one from scratch. While the concepts apply broadly to any development workflow, this post focuses specifically on Azure infrastructure teams using Terraform and demonstrates the approach through a practical example: an AI-powered security scanner for Terraform IaC modules.
What are VS Code custom agents?
Starting with VS Code 1.99+, GitHub Copilot supports custom agents: markdown files stored in your repository under .github/agents/. Each file defines a specialized AI assistant with its own:
- Name and description: who this agent is and when to invoke it
- Model selection: which AI model powers it
- Tool permissions: what actions it can take (read files, search, run commands)
- Instructions: a system prompt that defines its expertise, behavior, and constraints
When you open a workspace containing these files, the agents appear as selectable options in the Copilot Chat panel. You can invoke them by selecting from the agent picker or typing @AgentName in chat.
Think of custom agents as specialized team members you define once and every developer gets automatically when they clone the repository - a security reviewer, a code quality enforcer, a documentation generator, a deployment helper - each with deep knowledge of its specific domain.
How do custom agents differ from regular Copilot chat?
| Aspect | Regular Copilot Chat | Custom Agent |
|---|---|---|
| Knowledge scope | General programming knowledge | Domain-specific expertise you define |
| Consistency | Varies by prompt phrasing | Consistent behavior across all users |
| Tool access | Context-dependent | Explicitly defined per agent |
| Invocation | Open chat | Named agent with focused scope |
| Portability | Per-user | Shared via repository |
| Constraints | None by default | You define guardrails (e.g., no file edits) |
A regular Copilot chat session might give different answers about security best practices depending on how you phrase the question. A custom security agent gives consistent, structured findings every time because its behavior is defined in code you control.
Anatomy of a custom agent file:
A custom agent is a single markdown file with two parts:
Part 1: YAML frontmatter (metadata):
- name: MyAgent
- description: "What this agent does and when to invoke it. Use keywords that match how users would naturally ask for help."
- model: Claude Sonnet 4.5 (copilot)
- tools: [read, search, execute]
- argument-hint: "Hint text shown in the chat input when this agent is selected"
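Assembled into a file, the frontmatter sits between `---` delimiters at the top of the markdown file. A minimal sketch (all field values here are placeholders):

```yaml
---
name: MyAgent
description: "What this agent does and when to invoke it. Use keywords that match how users would naturally ask for help."
model: Claude Sonnet 4.5 (copilot)
tools: [read, search, execute]
argument-hint: "Hint text shown in the chat input when this agent is selected"
---
```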
Part 2: Markdown body (instructions):
Everything after the frontmatter is the system prompt - the instructions that shape every response. This is where you define:
- The agent's role and expertise
- What it should and should not do
- How it should structure its output
- Domain-specific knowledge it should apply
The instructions can be as detailed as needed. Unlike a one-off prompt, these instructions are permanent and version-controlled alongside your code.
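As an illustration, the body of a hypothetical read-only review agent might begin like this (the format and the referenced file name are placeholders, not a required structure):

```markdown
You are a senior code reviewer for this repository.

Rules:
- Review only; never modify files.
- Report each finding with severity, file, line number, issue, and a suggested fix.
- Flag naming that violates the conventions documented in CONTRIBUTING.md.
```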
Frontmatter fields explained:
- Name:
The agent's identifier. Appears in the agent picker dropdown and in @mentions. Use a clear, descriptive name without spaces.
- Description:
This is more than a label; Copilot uses the description to determine when to suggest this agent. Include keywords that match the natural language users would type: "security", "scan", "review", "deploy", "validate". The more specific, the better.
- Model:
Which AI model powers the agent. Different models have different strengths:
| Model | Best For |
|---|---|
| Claude Sonnet 4.5 | Code analysis, security review, structured output |
| GPT-4o | General reasoning, broad knowledge |
| o3-mini | Fast responses, simple tasks |
You choose the model that best fits the agent's job.
- Tools:
What the agent can do. Tool selection is a security and capability decision:
| Tool | Capability | Use When |
|---|---|---|
| read | Read files in the workspace | Agent needs to analyze code |
| search | Search across workspace files | Agent needs to find files by name or content |
| execute | Run terminal commands | Agent needs to run scripts or tools |
| editFiles | Create or modify files | Agent should write or change code |
Grant only what the agent needs. A read-only reviewer agent should never have editFiles. An agent that only answers questions needs only read.
- Argument-hint:
The placeholder text in the chat input when this agent is selected. Helps users understand what to type: "Specify a folder to scan or 'all' for entire workspace".
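For example, a read-only reviewer agent would declare only the first two tools in its frontmatter (a minimal sketch):

```yaml
tools: [read, search]   # no execute, no editFiles: the agent can look but not touch
```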
What can custom agents do?
Custom agents work well for any repetitive expert-judgment task. Common examples include:
| Use Case | What the Agent Does |
|---|---|
| Code Review | Reviews code for quality issues, anti-patterns, and naming violations with line-level findings |
| Security Scanning | Checks infrastructure or application code against security baselines (CIS, NIST) with remediation guidance |
| Documentation | Reads source code and generates API references, runbooks, or architecture summaries in your team's format |
| Onboarding | Answers questions about codebase conventions and patterns grounded in the actual repository |
| Deployment / Ops | Guides engineers through deployment or incident response using your actual infrastructure config |
| Testing | Reviews test coverage and suggests missing cases based on code changes |
| Release Management | Prepares release notes and version decisions from changelogs and git history |
Prerequisites to get started:
| Requirement | Details |
|---|---|
| VS Code | Version 1.99 or later |
| GitHub Copilot | Active subscription (Individual, Business, or Enterprise) |
| Copilot Chat extension | Installed and signed in to GitHub |
| Agent mode enabled | VS Code Settings > search "chat agent" |
| A repository | Agents live in .github/agents/; any local folder works |
No additional extensions, frameworks, or infrastructure required. Agents are just markdown files.
Building the IaC security scanner: A step-by-step guide
In general, teams writing Terraform modules for Azure infrastructure need to ensure:
- RBAC roles follow least privilege (no Owner/Contributor assigned broadly)
- Network rules do not allow unrestricted inbound traffic
- Encryption is enforced with TLS 1.2 minimum
- Diagnostic logging is configured for audit trails
- Resource locks protect production resources from accidental deletion
These checks are typically done in CI/CD pipelines, but that creates a slow feedback loop. A custom Copilot agent brings these checks into the IDE, giving developers security feedback while they write code.
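To make that feedback loop concrete, here is a sketch of the kind of misconfiguration such an agent flags as you type. The resource and names are hypothetical:

```terraform
# INSECURE: the agent would flag both settings below
resource "azurerm_storage_account" "example" {
  name                     = "stexampledev001" # hypothetical
  resource_group_name      = "rg-example"
  location                 = "eastus"
  account_tier             = "Standard"
  account_replication_type = "LRS"

  min_tls_version                 = "TLS1_0" # FLAG: below the TLS 1.2 minimum
  allow_nested_items_to_be_public = true     # FLAG: public blob access (CIS 3.7)
}

# HARDENED: what the agent's remediation guidance points toward
# min_tls_version                 = "TLS1_2"
# allow_nested_items_to_be_public = false
```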
Step 1: Create the directory:
Create .github/agents/ in your repository root if it does not already exist.
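From a terminal at the repository root, this is a one-liner (safe to re-run):

```shell
# Create the agents directory if it does not already exist
mkdir -p .github/agents
```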
Step 2: Create the agent file:
- name: IaCSecurityAgent
- description: "Scan Terraform and IaC files for security misconfigurations, insecure defaults, and compliance violations. Detects public endpoints, weak IAM, missing encryption, network exposure, and logging gaps. Use when the user asks to check security, find misconfigurations, request a security review, or harden infrastructure."
- model: Claude Sonnet 4.5 (copilot)
- tools: [read, search, execute]
- argument-hint: "Specify directory to scan (e.g., 'resource-groups'), multiple directories (e.g., 'resource-groups, nsg'), or 'all' for entire workspace"
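Assembled, `.github/agents/IaCSecurityAgent.md` starts with the frontmatter between `---` delimiters, followed by the instruction body. The body shown here is a brief illustrative opening, not the full instructions:

```markdown
---
name: IaCSecurityAgent
description: "Scan Terraform and IaC files for security misconfigurations, insecure defaults, and compliance violations. Detects public endpoints, weak IAM, missing encryption, network exposure, and logging gaps. Use when the user asks to check security, find misconfigurations, request a security review, or harden infrastructure."
model: Claude Sonnet 4.5 (copilot)
tools: [read, search, execute]
argument-hint: "Specify directory to scan (e.g., 'resource-groups'), multiple directories (e.g., 'resource-groups, nsg'), or 'all' for entire workspace"
---

You are an Azure infrastructure security reviewer. Scan the requested
Terraform files, report findings grouped by severity with file and line
references, and do not modify any files unless explicitly asked.
```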
- Why Claude Sonnet 4.5?
This model was chosen for its strong code analysis, its ability to reason about security context (not just pattern-match), and its consistent structured output.
- Why execute?
The agent saves reports by calling a helper PowerShell script. This eliminates a separate user-triggered step.
- Why not editFiles?
When the agent reports findings, it does not fix them unless the user explicitly asks. This keeps the agent in an advisory role and prevents unintended changes.
Step 3: Open VS Code and test:
- Open the Copilot Chat panel (Ctrl+Alt+I)
- Click the agent picker (the @ icon or agent name area)
- Your new agent should appear in the list
- Select it and type: scan resource-groups
Step 4: Iterate on the instructions:
The instructions are just text, so anyone can edit them, commit the changes, and the agent's behavior updates immediately for everyone on the team. Treat agent instructions like code: review them, version them, and improve them over time.
What the agent checks:
The agent's instructions (the markdown body of the file) define six security domains it checks against every Terraform resource:
- Identity and Access Management (IAM):
- Overly permissive RBAC roles (Owner, Contributor at broad scope)
- Missing managed identity configuration (using keys instead)
- Hardcoded credentials or secrets
- Missing validation on role assignment variables
- Network Security:
- Public endpoints on databases, storage, Key Vaults
- Admin ports (22, 3389) open to 0.0.0.0/0
- Missing private endpoints for PaaS services
- NSG rules allowing wildcard source addresses
- Data Protection and Encryption:
- Encryption at rest disabled
- TLS version below 1.2
- HTTPS not enforced
- Secrets stored in plain text in variables
- Logging and Monitoring:
- Missing azurerm_monitor_diagnostic_setting resources
- Log retention below 90 days
- No audit logging on Key Vault, SQL, or AKS
- Container and Workload Security:
- AKS without RBAC enabled
- Local accounts not disabled
- Missing network policy configuration
- Backup and Disaster Recovery:
- Key Vault without purge protection
- Missing soft delete configuration
- No geo-redundancy for critical data
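As one illustration of the backup and recovery checks, a hedged Terraform sketch of a Key Vault that passes them (names and the tenant ID are placeholders):

```terraform
resource "azurerm_key_vault" "example" {
  name                = "kv-example-prod" # hypothetical
  location            = "eastus"
  resource_group_name = "rg-example"
  tenant_id           = "00000000-0000-0000-0000-000000000000" # placeholder
  sku_name            = "standard"

  soft_delete_retention_days = 90   # agent flags retention below policy minimums
  purge_protection_enabled   = true # agent flags this missing or set to false
}
```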
Compliance framework alignment:
Findings are mapped to Azure-relevant controls:
- CIS Azure Foundations Benchmark (e.g., CIS 3.7 for storage public access, CIS 6.1 for NSG rules)
- Azure Security Benchmark v3 (e.g., NS-1 for network segmentation, PA-7 for privileged access, DP-4 for encryption)
- NIST 800-53 (e.g., SC-7 for boundary protection, AC-6 for least privilege)
Choosing the right scanning scope:
The agent supports flexible scoping: a single folder, multiple folders, or auto-discovery across the entire workspace. When a user says "scan all", the agent searches for every .tf file, groups the files by directory, and scans each directory independently.
The structured security scan output:
Every finding follows a consistent format. Here is an example of a security scan result:
[MEDIUM] IAM-002: Missing principal_type default recommendation
- File: user-assigned-identity/variables.tf (Line 45)
- Resource: var.rg_role_assignments.principal_type
- Issue: principal_type is optional with null default. In environments with ABAC policies, role assignments may fail if this is not explicitly set.
- Impact: Role assignments could fail silently or be mis-scoped in ABAC-constrained environments.
- Compliance: Azure Security Benchmark PA-7
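A fix for a finding like this one sets an explicit default on the optional attribute. A sketch against a hypothetical variable definition (requires Terraform 1.3+ for optional attribute defaults):

```terraform
variable "rg_role_assignments" {
  description = "Role assignments to create on the resource group"
  type = map(object({
    principal_id   = string
    role           = string
    # Explicit default avoids silent failures under ABAC-constrained policies
    principal_type = optional(string, "ServicePrincipal")
  }))
  default = {}
}
```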
Results from real scans:
Security Scan:
Scanning three Azure Terraform modules with the custom agent produced the following results:
| Module | CRITICAL | HIGH | MEDIUM | LOW | Key Finding |
|---|---|---|---|---|---|
| resource-groups | 0 | 1 | 3 | 2 | Role assignments allow Owner/Contributor |
| nsg | 0 | 1 | 3 | 2 | Wildcard source addresses and ports not blocked |
| user-assigned-identity | 0 | 1 | 3 | 2 | Managed identity lacks role_assignments field — permissions must be set manually post-creation |
Generated security scan report:
All findings included exact file paths, line numbers, and Terraform code fixes.
The companion quality scanner:
Alongside the security agent, the workspace includes a second agent: a Super-Linter Scanner that runs native static analysis tools:
| Tool | Version | Purpose |
|---|---|---|
| TFLint | v0.53.0 | Naming conventions, unused declarations, provider pinning |
| terraform fmt | v1.9.8 | Code formatting validation |
| yamllint | latest | YAML syntax and style |
| PSScriptAnalyzer | latest | PowerShell best practices |
This agent calls a PowerShell script that produces SARIF output (viewable inline in VS Code via the SARIF Viewer extension) and an HTML report. Tool versions are pinned to match the CI/CD pipeline's super-linter commit, so local results are consistent with what CI would produce.
Why agent-based scanning goes beyond traditional tools:
Traditional static analysis tools like tfsec, Checkov, or tflint work by matching code patterns against a database of rules. They catch what they know about. The AI agent adds a layer of reasoning:
- It can recognize that a variable accepting any role name is dangerous even when no bad value is currently assigned: the vulnerability is the missing validation, not an existing misconfiguration.
- It can correlate findings across files (a storage account in one file, its network rules in another).
- It maps findings to compliance frameworks without you maintaining a rule-to-control mapping table.
- It produces natural language explanations of why something is a problem, not just a rule ID.
This does not replace deterministic tools, but it complements them. Use both.
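The missing-validation point is exactly the kind of reasoning-based finding pattern matchers miss. A hedged sketch of the guardrail the agent might recommend (variable name is hypothetical):

```terraform
variable "role_definition_name" {
  description = "Built-in role to assign"
  type        = string

  # Reject privileged roles even though no bad value is assigned today
  validation {
    condition     = !contains(["Owner", "Contributor"], var.role_definition_name)
    error_message = "Broad roles (Owner, Contributor) are not allowed; use a least-privilege role."
  }
}
```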
Key takeaways:
- Custom VS Code Copilot agents are markdown files in .github/agents/ with no extension development, no deployment, no infrastructure required.
- The YAML frontmatter controls model selection, tool permissions, and how Copilot decides when to suggest the agent.
- The markdown body is your system prompt; treat it like code: version it, review it, and iterate on it.
- Tool permissions are a security decision: grant only what the agent needs.
- Custom agents are portable: anyone who clones the repository gets them automatically.
- Combining AI reasoning with deterministic tools (tflint, terraform fmt) provides coverage neither can achieve alone.
- The agent pattern applies far beyond security scanning: documentation, onboarding, deployment, testing, and compliance.