Responses API
Building Secure, Multi-User AI Workflows with the Responses API
With the recent general availability (GA) of the Responses API, developers and enterprises now have access to a production-ready service purpose-built for stateful, multi-turn, tool-using AI agents. This milestone means you can confidently integrate the Responses API into real-world applications, knowing it's fully supported, scalable, and designed for enterprise-grade use cases.

Unlike stateless APIs such as Chat Completions, the Responses API maintains conversation history, supports tool orchestration, and enables multi-modal interactions. It's ideal for building intelligent agents that need to remember context, call external tools, and interact with users over time.

The Challenge: Securing AI Responses in Multi-User Environments

As AI becomes more deeply embedded in enterprise apps, a new challenge emerges: response leakage. In multi-user environments, any user with a response ID could potentially access content they didn't create, posing serious risks to privacy, data ownership, and compliance.

By default, the Responses API allows retrieval of any response if you have its response ID. While this is convenient for prototyping, it isn't secure for production: there's no built-in mechanism to verify who is making the request or whether they're authorized to access that response.

In this lab, I set out to solve that problem using Azure API Management (APIM). The goal? To ensure that only the user who created a response can retrieve or add to it, even if someone else has the response ID. This is especially important in scenarios where AI-generated content may include sensitive or proprietary information.

The Problem: Response IDs Aren't Enough

The default behavior of the Responses API is simple: if you have a response ID, you can fetch the response. That's convenient, but it's also risky, because possession of the ID is the only check. The Responses API is designed to be stateful, combining capabilities from chat completions and assistants into a unified experience. It's powerful, but without additional safeguards it can expose sensitive content to unintended users.

This lab introduces a way to wrap the Responses API with APIM policies that enforce user-level access control. It's a lightweight but powerful approach to securing AI-generated content.

The Solution: APIM as a Gatekeeper

Here's how it works:

- A user sends a request to retrieve or update a response.
- APIM intercepts the request and extracts the user ID, either from the authentication token or, for testing purposes, from a custom header.
- APIM compares that user ID with the one associated with the response.
- If they match, the request proceeds. If not, it's blocked.

This ensures that only the original creator of a response can access or modify it.

What's in the Lab

The lab in the AI Gateway repo includes:

- A sample API that mimics AI-generated responses.
- APIM policies that enforce user-level access.
- A test harness that lets you simulate requests with different user IDs.
- Header-based user ID injection for easier testing (ideal for labs and demos).

This setup gives you a repeatable pattern for securing AI responses in production environments.

Sample APIM Policy Snippet

Here's a simplified version of the APIM inbound policy that enforces user-level access. It checks the x-user-id header against the stored owner ID of the response; if they don't match, the request is blocked with a 403 error.
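A minimal sketch of what that inbound policy could look like is below. It assumes the create operation has already cached each response's owner under a key derived from the response ID; the cache key prefix, route parameter, and header name are illustrative, and the lab's actual policy may differ.

```xml
<!-- Sketch of an inbound policy for GET /responses/{responseId}.
     Assumes the owner's user ID was cached at creation time under
     "resp-owner-{responseId}" (key and names are illustrative). -->
<inbound>
    <base />
    <!-- For the lab, the caller's identity arrives in a custom header. -->
    <set-variable name="callerId"
        value="@(context.Request.Headers.GetValueOrDefault("x-user-id", ""))" />
    <!-- Look up the user ID recorded when the response was created. -->
    <cache-lookup-value
        key="@("resp-owner-" + context.Request.MatchedParameters["responseId"])"
        variable-name="ownerId" default-value="" />
    <choose>
        <!-- Block the request unless the caller is the recorded owner. -->
        <when condition="@((string)context.Variables["callerId"] == "" || (string)context.Variables["ownerId"] != (string)context.Variables["callerId"])">
            <return-response>
                <set-status code="403" reason="Forbidden" />
                <set-header name="Content-Type" exists-action="override">
                    <value>application/json</value>
                </set-header>
                <set-body>{"error": "You are not the owner of this response."}</set-body>
            </return-response>
        </when>
    </choose>
</inbound>
```

The other half of the pattern lives on the create operation: an outbound policy there can read the new response's ID from the body and record the caller as its owner (for example with cache-store-value), so the lookup above has something to compare against.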
In a production scenario, you would want to rely on something stronger than a user ID passed in a plain header; a better source is the user ID from the caller's validated authentication token (see the sketch at the end of this post).

Why This Matters

As AI becomes more embedded in our apps, we need to think beyond just securing the model; we need to secure the responses too. This lab shows how APIM can be used to:

- Enforce ownership of AI-generated content.
- Prevent unauthorized access to sensitive responses.
- Build trust into your AI workflows.

Final Thoughts

This lab is a great starting point for anyone building AI APIs in a multi-user environment. It's simple, effective, and leverages tools you already know, like APIM. If you're interested in extending this to token validation, role-based access, or integrating with Entra ID, let's talk. I'd love to hear how you're securing your AI stack.
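If you do explore the token-validation route, here's a rough sketch of how the header-based extraction above could be swapped for a token-based one, using APIM's validate-jwt policy against Entra ID. The tenant, audience, and claim name below are placeholders, not values from the lab.

```xml
<!-- Sketch: validate an Entra ID bearer token, then use its object ID
     ("oid") claim as the caller's identity instead of the x-user-id
     header. Tenant ID and audience are placeholders. -->
<validate-jwt header-name="Authorization"
              failed-validation-httpcode="401"
              failed-validation-error-message="Missing or invalid token."
              output-token-variable-name="jwt">
    <openid-config url="https://login.microsoftonline.com/{tenant-id}/v2.0/.well-known/openid-configuration" />
    <audiences>
        <audience>api://my-ai-gateway</audience>
    </audiences>
</validate-jwt>
<!-- Replace the header-based set-variable with a claim-based one. -->
<set-variable name="callerId" value="@{
    var jwt = (Jwt)context.Variables["jwt"];
    return jwt.Claims.ContainsKey("oid") ? jwt.Claims["oid"][0] : "";
}" />
```

The rest of the policy (the cache lookup and the 403 on mismatch) stays the same; only the source of callerId changes.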