# Building an Auditable Security Layer for Agentic AI
Most agent failures do not look like breaches. They look like a normal chat, a normal answer, and a normal tool call. Until the next morning, when a single question collapses the whole story: who authorized that action? You think you deployed an agent. In reality, you deployed an unbounded automation pipeline that happens to speak English.

I’m Hazem Ali — Microsoft AI MVP, Distinguished AI & ML Architect, Founder & CEO at Skytells. For over 20 years, I’ve built secure, scalable enterprise AI across cloud and edge, with a focus on agent security and sovereign, governed AI architectures. My work on these systems is widely referenced by practitioners across multiple regions. I was honored to receive an official speaker invitation, under the patronage of H.H. Sheikh Dr. Sultan bin Muhammad Al Qasimi, Member of the UAE Supreme Council and Ruler of Sharjah, to speak at the Sharjah International Conference on Linguistic Intelligence (SICLI), organized by the American University of Sharjah (AUS) and the Emirates Scholar Center for Research and Studies.

This piece is a collaboration with Hammad Atta, Practice Lead – AI Security & Cloud Strategy, together with Dr. Yasir Mehmood, Dr. Muhammad Zeeshan Baig, Dr. Muhammad Aatif, and Dr. Muhammad Aziz Ul Haq. We align on one core idea: agent security is not about making the model behave. It is about building enforceable boundaries around the model and proving every privileged step.

This article is meant to sit next to my earlier Tech Community piece, Zero-Trust Agent Architecture: How To Actually Secure Your Agents, and go one level deeper into the mechanics you can implement on Azure today.

## The Principle: The model is not your boundary

Let me break it down the way I’d explain it in a design review. A boundary is something that still holds when the component on the other side is adversarial, confused, or simply wrong. An LLM is none of those reliably. In an agent, the model is not just a generator.
It becomes a planner and scheduler. It decides when to retrieve, which tool to call, how to shape arguments, and when to loop. That means your real attack surface is not “bad output.” It is the control-flow graph the model is allowed to traverse.

So if your “security” lives inside the prompt, you are putting policy in the same token stream the attacker can influence. That is not a boundary. That is a suggestion. The only stable design is to treat the model like an untrusted proposer and the runtime like the verifier.

Here is the chain I use. Each gate is external to the model and survives manipulation:

- **Context Gate:** Everything that enters the model is treated as executable influence, not “text.”
- **Capability Gate:** Tools are invoked as constrained capabilities, not free-form function calls.
- **Evidence Gate:** Every privileged step produces a verifiable artifact, not a story.
- **Retrieval Control Plane:** What the agent can see is governed by labels and identity, not prompt etiquette.
- **Detection Layer:** Drift and probing become alerts, not surprises.

Now the rare part, the part most people miss: the boundary is not “block or allow.” The boundary is stateful. Once the runtime sees a suspicious signal, the entire session must transition into a degraded capability state, and every downstream gate must enforce that state.

## 1. Treat context as executable influence, and preserve provenance

If you do RAG, your documents are not “supporting info.” They are an input channel. That makes the biggest prompt-injection risk not the user. It is your documents. Microsoft’s Prompt Shields covers user prompt attacks (scanned at the user input intervention point) and document attacks (scanned at the user input and tool response intervention points). When enabled, each request returns annotation results with detected and filtered values that your runtime can translate into a policy decision: block, degrade, or allow.

**Provenance Collapse.**
Most teams concatenate prompt + policy + retrieved chunks into one blob. The moment you do that, you lose the one thing you need for a defensible boundary: you can no longer reliably tell which tokens came from where. That is how “context” becomes “authority.”

For indirect/document attacks, Microsoft guidance recommends delimiting context documents inside the prompt using `"""<documents> ... </documents>"""`. That delimiter is not formatting. It is a provenance marker that improves indirect attack detection through Prompt Shields.

Minimal, practical pattern:

```typescript
// Provenance-preserving prompt construction for indirect/document attack detection
function buildPrompt(system: string, user: string, retrievedDocs: string[]): string {
  const docs = retrievedDocs.map((d) => `- ${d}`).join("\n");
  return [
    system,
    "",
    `User: ${user}`,
    "",
    `""" <documents>\n${docs}\n</documents> """`,
  ].join("\n");
}
```

Then treat Prompt Shields output as a session security event, not a banner:

```typescript
type RiskState = "NORMAL" | "SUSPECT" | "BLOCK";
type FilterPolicy = "BLOCK_ON_FILTERED" | "DEGRADE_ON_FILTERED";

function computeRiskState(
  shields: { detected: boolean; filtered?: boolean },
  labels: string[],
  policy: FilterPolicy = "DEGRADE_ON_FILTERED",
): RiskState {
  // detected => hard stop
  if (shields.detected) return "BLOCK";

  // filtered is an annotation signal: block or degrade by policy
  if (shields.filtered) {
    return policy === "BLOCK_ON_FILTERED" ? "BLOCK" : "SUSPECT";
  }

  // example: sensitivity-based degradation independent of shield hits
  const sensitive = labels.some((l) =>
    ["Confidential", "HighlyConfidential", "Regulated"].includes(l),
  );
  return sensitive ? "SUSPECT" : "NORMAL";
}
```

When the signal is clear, you block and log. When it is suspicious, you do not warn. You downgrade authority.
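That downgrade only means something if the degraded state survives the rest of the session. A minimal sketch of the sticky, per-session risk state, assuming the `RiskState` values above (`SessionRisk` and `severity` are illustrative names, not part of any SDK):

```typescript
// Sticky per-session risk: within a session, risk can only escalate.
// A later "clean" turn must not quietly restore full authority.
type RiskState = "NORMAL" | "SUSPECT" | "BLOCK";

const severity: Record<RiskState, number> = { NORMAL: 0, SUSPECT: 1, BLOCK: 2 };

class SessionRisk {
  private state: RiskState = "NORMAL";

  // Fold a new per-turn signal into the session state, monotonically.
  observe(signal: RiskState): RiskState {
    if (severity[signal] > severity[this.state]) this.state = signal;
    return this.state;
  }

  get current(): RiskState {
    return this.state;
  }
}
```

Every downstream gate reads `current`, not the per-turn signal, so one suspicious document degrades every later tool call until an operator resets the session.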
**QSAF Alignment:**

- Prompt Injection Protection (Domain 1): QSAF-PI-001 (static pattern blacklist), QSAF-PI-002 (dynamic LLM analysis), QSAF-PI-003 (semantic embedding comparison) – all addressed by Prompt Shields and provenance marking.
- Context Manipulation (Domain 2): QSAF-RC-004 (context drift), QSAF-RC-007 (nested prompt injection) – mitigated by stateful risk calculation.

## 2. Tools are capabilities with constraints, not functions

When the model proposes a tool call, your runtime should re-derive what is allowed from identity plus risk state, then enforce it at the gateway.

```typescript
type ToolRequest = {
  tool: string;
  args: unknown;
};

type Capabilities = {
  allowWrite: boolean;
  allowedTools: Set<string>;
};

function deriveCapabilities(risk: RiskState, roles: string[]): Capabilities {
  const baseAllowed = new Set(["search_kb", "get_profile", "summarize"]);
  const isAdmin = roles.includes("Admin");

  if (risk === "SUSPECT") {
    return { allowWrite: false, allowedTools: baseAllowed };
  }
  if (risk === "BLOCK") {
    return { allowWrite: false, allowedTools: new Set() };
  }

  // NORMAL
  const tools = new Set([
    ...baseAllowed,
    ...(isAdmin ? ["update_record", "issue_refund"] : []),
  ]);
  return { allowWrite: isAdmin, allowedTools: tools };
}

function authorizeTool(req: ToolRequest, caps: Capabilities): void {
  if (!caps.allowedTools.has(req.tool)) throw new Error("ToolNotAllowed");
  if (!caps.allowWrite && req.tool.startsWith("update_")) {
    throw new Error("WriteDenied");
  }
}
```

The model can ask. It cannot grant itself permission.

**QSAF Alignment:**

- Plugin Abuse Monitoring (Domain 3): QSAF-PL-001 (whitelist enforcement), QSAF-PL-003 (restrict sensitive plugins), QSAF-PL-006 (rate-limiting) – implemented via capability derivation and gateway policies.
- Behavioral Anomaly Detection (Domain 5): QSAF-BA-006 (plugin execution pattern deviance) – detected by comparing actual calls against derived capabilities.
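That last check, comparing the calls an agent actually made against its derived capabilities, can be sketched as a simple pass over the tool-call log. The capability shape mirrors the one above; `detectDeviance` and `ToolCall` are illustrative names for your own telemetry pipeline, not an Azure API:

```typescript
// Offline/streaming deviance check: any call outside the derived capability
// set (or a write attempted without allowWrite) is an anomaly event.
type Capabilities = { allowWrite: boolean; allowedTools: Set<string> };
type ToolCall = { tool: string; ts: string };

function detectDeviance(calls: ToolCall[], caps: Capabilities): ToolCall[] {
  return calls.filter(
    (c) =>
      !caps.allowedTools.has(c.tool) ||
      (!caps.allowWrite && c.tool.startsWith("update_")),
  );
}
```

Any non-empty result is a signal worth forwarding to your SIEM: the runtime denied (or should have denied) those calls, and the pattern itself is the detection.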
## The Integrity Gate: Hash-chain the authority, not the output

Let me add the part that makes investigations clean. Most teams treat integrity like an audit log problem. That is not enough. Logs explain. Integrity proves.

The hard truth is that agent authority is assembled out of pieces: the system instruction, the user prompt, retrieved chunks, risk annotations, and finally the tool intent. If you do not bind those pieces together cryptographically, an incident review becomes a story-telling session. This is why QSAF has an entire domain for payload integrity and signing, including prompt hash signing, nonce or replay protection, and a hash chain lineage that tracks how a session evolved.

Here is how you map that into what the runtime verifies. You build a canonical “authority envelope” for every privileged hop, compute a digest, and then:

- link it to the previous hop (hash chain)
- include a nonce (replay control)
- sign the digest with Azure Key Vault (Key Vault signs digests, it does not hash your content for you)

```typescript
import crypto from "crypto";

type AuthorityEnvelope = {
  sessionId: string;
  turnId: number;
  policyVersion: string;

  // provenance-preserved components
  systemHash: string;
  userHash: string;
  documentsHash: string; // hash of structured retrieved chunks (not just rendered text)

  shields: {
    detected: boolean;
    filtered: boolean;
  };
  riskState: "NORMAL" | "SUSPECT" | "BLOCK";

  // proposed action (if any)
  tool?: {
    name: string;
    argsHash: string;
  };

  // anti-replay + lineage
  nonce: string;
  prevDigest?: string;
  ts: string;
};

function sha256(bytes: string): string {
  return crypto.createHash("sha256").update(bytes).digest("hex");
}

// Canonicalization matters. JSON.stringify is OK if you control key order.
// For cross-language, use RFC 8785 (JCS) canonical JSON.
function canonicalJson(x: unknown): string {
  return JSON.stringify(x);
}

function buildEnvelope(
  input: Omit<AuthorityEnvelope, "nonce" | "ts">,
): AuthorityEnvelope {
  return {
    ...input,
    nonce: crypto.randomUUID(),
    ts: new Date().toISOString(),
  };
}

function digestEnvelope(env: AuthorityEnvelope): string {
  return sha256(canonicalJson(env));
}
```

Then you call Key Vault to sign that digest (REST sign), and optionally verify later (REST verify).

The rare failure mode this blocks is subtle: authority splicing. Without a hash chain, it is possible for the runtime to correctly validate a tool call, but later be unable to prove which retrieved chunk, which Prompt Shields result, and which policy version were in force when that call was authorized. With the chain, every privileged hop becomes tamper-evident.

This is the point: Prompt Shields tells you “this looks dangerous.” Document delimiters preserve provenance. The integrity gate makes the runtime able to say, later, with evidence: “This is exactly what I accepted as authority.”

**QSAF Alignment:**

- Payload Integrity & Signing (Domain 6): QSAF-PY-001 (prompt hash signing), QSAF-PY-005 (nonce/replay control), QSAF-PY-006 (hash chain lineage) – directly implemented via the envelope and chaining.

## Tools must sit behind a wall that can say “no”

Tool calls are where language becomes authority. If an agent can call APIs that mutate state, your security story is not about the response text. It is about whether the tool call is allowed under explicit policy. This is exactly where Azure API Management belongs: as the tool gateway that enforces authentication and authorization before any tool request reaches your backend. The validate-jwt policy is the canonical enforcement mechanism for validating JWTs at the gateway.

The design goal is simple: the model can request a tool call. The gateway decides if it is permitted.
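Whether a call is permitted can also depend on whether the recorded lineage still verifies. A minimal sketch of re-verifying hash-chain lineage over envelopes like the ones above, assuming hex SHA-256 digests and controlled JSON key order (`Hop` and `verifyChain` are illustrative names):

```typescript
import crypto from "crypto";

// Each hop carries its payload plus the digest of the previous hop.
// A splice, reorder, or retroactive edit breaks the recomputed chain.
type Hop = { payload: Record<string, unknown>; prevDigest?: string };

const digest = (x: unknown): string =>
  crypto.createHash("sha256").update(JSON.stringify(x)).digest("hex");

function verifyChain(hops: Hop[]): boolean {
  let prev: string | undefined;
  for (const hop of hops) {
    if (hop.prevDigest !== prev) return false; // splice or reorder detected
    prev = digest({ ...hop.payload, prevDigest: hop.prevDigest });
  }
  return true;
}
```

Run this before any privileged hop and during incident review: if it returns false, the session’s authority history has been tampered with, and nothing downstream of the break should be trusted.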
A capability token approach keeps it clean:

```xml
<!-- APIM inbound policy sketch -->
<validate-jwt header-name="Authorization" failed-validation-httpcode="401">
  <required-claims>
    <claim name="scp">
      <value>tools.read</value>
    </claim>
  </required-claims>
</validate-jwt>
```

The claim name (scp, roles, or custom claims) depends on your token issuer; the point is enforcing authorization at the gateway, not inside model text. Now you can enforce “read-only mode” by issuing tokens that simply do not carry write scopes. The model can try to call a write tool. It still gets denied by policy.

## Evidence is not logs. Evidence is a signed chain.

Logs help you debug. Evidence helps you prove. So you hash the session envelope and the tool intent, then sign the digest using Azure Key Vault Keys. Key Vault sign creates a signature from a digest, and verify checks a signature against a digest. The documentation is explicit that this is sign-hash, not “sign arbitrary content”: Key Vault does not hash your content for you. You hash locally, then ask Key Vault to sign the hash.

```typescript
import crypto from "crypto";

const sha256 = (x: unknown): string =>
  crypto.createHash("sha256").update(JSON.stringify(x)).digest("hex");

type IntentEnvelope = {
  sessionId: string;
  userId: string;
  promptHash: string;
  documentsHash: string;
  tool: string;
  argsHash: string;
  nonce: string;
  ts: string;
  policyVersion: string;
};

function buildIntent(
  sessionId: string,
  userId: string,
  prompt: string,
  docs: unknown,
  tool: string,
  args: unknown,
  policyVersion: string,
): IntentEnvelope {
  return {
    sessionId,
    userId,
    promptHash: sha256(prompt),
    documentsHash: sha256(docs),
    tool,
    argsHash: sha256(args),
    nonce: crypto.randomUUID(),
    ts: new Date().toISOString(),
    policyVersion,
  };
}
```

Once you do this, your system stops “explaining.” It starts proving.

## Govern what the agent can see, not only what it can say

RAG without governance eventually becomes a data exposure feature.
This is why I treat retrieval as a governed operation. Microsoft Purview sensitivity labels give you a practical way to classify content and build retrieval rules on top of that classification. Microsoft documents creating and configuring sensitivity labels in Purview.

The pattern is simple:

1. Label the corpus.
2. Filter retrieval by label and identity policy.
3. Log label distribution per completion.
4. Alert when a low-privilege identity retrieves high-sensitivity labels.

This is how you keep sovereignty real. Not in a slide deck. In the retrieval path.

## Operate it like a security system: posture and detection

Inline gates reduce risk. They do not eliminate it. Systems drift. People add tools. Policies get loosened. Attacks evolve.

Microsoft Defender for Cloud’s Defender CSPM plan includes AI security posture management for generative AI apps and AI agents (Preview), including discovery/inventory of AI agents deployed with Azure AI Foundry. Then you use Microsoft Sentinel to turn your telemetry into incidents, with scheduled analytics rules. Your detections should match the gates you built:

- Repeated Prompt Shields detections from the same identity or session.
- Tool-call spikes after a suspicious document signal.
- APIM denials for write endpoints from sessions in read-only mode.
- High-sensitivity label retrieval by identities that should never touch that tier.

**QSAF Alignment:**

- Behavioral Anomaly Detection (Domain 5): QSAF-BA-001 (session entropy), QSAF-BA-004 (repeated intent mutation), QSAF-BA-007 (unified risk score) – detected via Sentinel rules.
- Cross-Environment Defense (Domain 9): QSAF-CE-006 (coordinated alert response) – using Sentinel incidents and playbooks.

## Where the reference checklist fits, quietly

Behind the scenes, we use a control checklist lens to ensure we cover prompt/context attacks, tool misuse, integrity, governance, and operational monitoring. The point is not to rename Microsoft features into framework terms.
The point is to make the system enforceable and auditable using Azure-native gates.

## Closing

Zero trust for agents is not a slogan. It is a build.

- Prompt Shields gives you a front gate for both user prompt attacks and document attacks, with clear annotations like detected and filtered.
- API Management gives you a tool boundary that can say “no” regardless of what the model tries, using validate-jwt.
- Signed intent gives you evidence, using Key Vault’s sign-hash semantics.
- Purview labels give you governed retrieval.
- Sentinel and Defender give you an operating model, not wishful thinking.

If you want the conceptual spine and the architectural principles that frame this pipeline, start with my earlier Tech Community pieces, then come back here and implement the gates.

Thanks for reading — Hazem Ali