prompt injection

1 Topic

Building the Solution Teams Need to Secure AI Against Prompt Injection
As artificial intelligence continues to evolve, teams are prioritising rapid advancements and deployment of applications while often overlooking security considerations. Emerging threats such as prompt injection remain poorly understood, and this is putting systems, users, and infrastructure at serious risk. Much of the expertise required to mitigate these risks is currently fragmented and inaccessible, concentrated among a small group of cybersecurity specialists. Meanwhile, developers, under pressure to ship quickly, often lack both the tools and frameworks needed to systematically test their AI systems for vulnerabilities. This disconnect is creating a significant gap between the development and security assurance of AI applications. To address this gap, we developed a unified Prompt Injection Testing Platform and knowledge base, powered by Microsoft Foundry, designed to make LLM security testing accessible, structured, and understandable for developers. Project Overview Developers are rapidly integrating LLMs and agents into applications, but: Security testing is not standardised Prompt injection risks are increasingly understood in research, but poorly mitigated in practice by developers There is a lack of accessible, actionable tooling This creates a dangerous gap: applications are being deployed faster than they are being secured. As part of our UCL Industry Exchange Network (IXN) project in collaboration with Avanade, we built a Prompt Injection Testing Platform designed to solve this exact issue by: Providing a knowledge base of vulnerabilities and mitigations Helping teams identify vulnerabilities within their AI systems Enabling custom and automated testing pipelines Integrating tools like Garak for adversarial testing With this, we aim to make prompt injection testing accessible, standard, and understandable. Project Journey We divided our project into several phases: Phase 1: Understanding our Users’ Needs. We began by identifying the core users of our platform: AI developers and broader stakeholders across development, security, and safety disciplines integrating LLMs into their applications. By meeting with them, we uncovered a few key challenges: Developers have limited awareness of prompt injection risks There is a generalised lack of accessible tools for testing This first exploration set a core principle: We must build a developer-first solution which does not depend on extensive technical knowledge to be used. We concluded that to be as useful as possible, our solution should not require prior prompt injection knowledge. In order to solve the two challenges presented by our users, we concluded a platform would be the best approach, as it enables us to centralise fragmented knowledge while providing a structured, scalable environment for testing LLM vulnerabilities in practice. Phase 2: Understanding the Threat Landscape Building on our user research, we focused on developing a deep understanding of the prompt injection threat landscape to inform the design of our platform. This phase involved researching: Different types of prompt injection vulnerabilities Common attack scenarios and override techniques Existing mitigation strategies used in practice Tools and methodologies for prompt injection security testing The most widely used models to ensure our platform would be compatible with real-world systems. We consolidated these findings into a structured technical report, designed to be shared with developers, security testers, and semi-technical stakeholders. The goal was not only to guide our own implementation, but also to contribute to making prompt injection more standard and understandable. From our research, we realised prompt injection is not a single vulnerability, but a rapidly evolving attack surface that requires continuous, scalable testing rather than one-time validation. Phase 3: Building the Platform Guided by both our user insights and the threat landscape analysis, we moved to designing and developing a unified prompt injection testing platform and knowledge base. To do this, we defined three core principles: Developer first: no deep security knowledge would be required Unified: combines education (knowledge base) and execution (testing tools) Scalable: Expert users could extend the platform by bringing their own models, tests, and mitigations. During this stage, we built a platform which allows teams to: Connect their own LLM endpoints Run custom prompt injection tests Execute automated adversarial testing through Garak Access a centralised knowledge base of vulnerabilities and mitigation strategies. Export knowledge base information and test results as PDFs. By the end, we had developed a unified platform that enables developers to systematically test, understand, and mitigate prompt injection vulnerabilities in their AI applications. To understand how our platform works in practice, you can view our demo video. Platform home interface presenting an overview of prompt injection concepts and a structured vulnerability catalogue for exploring attack types and mitigation strategies. Key Features Model Integration and Configuration Users can use models included in the platform or connect their own LLM endpoints, allowing the platform to work across different providers: Supports multiple model providers through Microsoft Foundry Supports custom model integration via HTTP endpoints Enables model configurations such as custom system prompts and mitigation layers. Ensures flexibility as new models and mitigations emerge Testing Suite The platform allows users to create and run custom prompt injection tests tailored to their applications. This involves: Creating and executing targeted prompts Simulating real-world attack scenarios Running predefined adversarial testing suites (integrating NVIDIA Garak) Testing interface showing configuration of prompt injection tests and execution of automated scans, with results and risk evaluation displayed. Knowledge Base A core component of our platform is a structured knowledge base, which is designed to make prompt injection concepts accessible and understandable. This is divided into two key areas: Vulnerabilities: Provides information on different types of prompt injection attacks, including explanations of how each vulnerability works, with real-world examples and scenarios, as well as references to reputable external sources Mitigations: Focuses on how to defend against these vulnerabilities, and it includes clear implementation strategies and code examples demonstrating how to integrate each mitigation. To support exploration, we also included a chatbot interface, which answers questions using knowledge base data and trusted sources. This helps users quickly navigate vulnerabilities and mitigation strategies by providing contextual, reliable information and redirecting users to the appropriate page of our platform. Figure 3: Direct prompt injection analysis view, where users can explore attack techniques, observe unsafe model responses, and review corresponding mitigation approaches. Prompt Enhancer In addition to testing and learning, our platform integrates a prompt enhancer, designed to help users actively improve the security of their system prompts. It works in the following way: Takes an existing prompt as input Draws on the knowledge base insights and best practices Restructures the prompt to improve clarity and robustness Incorporates selected prompt-layer mitigations to reduce prompt injection risk Prompt Enhancer interface showing the application of prompt-layer mitigations (e.g. delimiter tokens, instruction hierarchy enforcement) to restructure and secure a system prompt against prompt injection attacks. Technical Details To support a flexible and scalable testing system, we designed our platform with a modular, layered architecture. This allows different components to operate independently while remaining integrated through clearly defined interfaces, ensuring both extensibility and maintainability. System Architecture We divided our platform into four main layers: Frontend Layer An interactive user interface that allows developers to: Explore the prompt injection knowledge base Configure and run tests View results and vulnerability analysis API Layer The API layer acts as the orchestration and communication layer between the frontend and the core system. Handles requests from the frontend to create and run tests. Provides frontend with available models, mitigations, and configurations. Ensures any newly added models and mitigations can be automatically reflected in the frontend without requiring manual updates. Domain Layer The layer which defines the core structure and logic of the system: Defines interfaces for key components such as mitigations, models, and test runners Establishes the test structure and data models Encapsulates logic to ensure consistency Integration Layer The layer which implements the abstractions defined in the domain layer and connects the platform to external services Implements model providers such as OpenAI, Anthropic, and other external HTTP-based endpoints Implements test runners, including custom prompt runners and external tools such as Garak. Implements database connections and repository classes. Results and Outcomes Through the research and development of our platform, we were able to gain several key insights into the behaviour and security of LLM-based applications: Prompt injection vulnerabilities are more prevalent than expected. Even simple prompts with carefully crafted inputs can unsafely manipulate a model’s behaviour. Lack of structured testing leads to hidden risks. Without a systematic approach, many vulnerabilities remain undetected. It is sometimes time consuming to manually craft unsafe prompts. Combining custom testing with framework-based testing improves coverage. Using both custom prompts (targeted and application-specific scenarios) and framework-driven testing (e.g. Garak) enables a more comprehensive evaluation of model safety, as both expected and unexpected vulnerabilities can be captured Structured prompts can significantly improve robustness. We observed that prompts with a clear structure and embedded mitigations are less susceptible to injection attacks. By the end of our project, we successfully developed a platform that: Bridges the gap between prompt injection knowledge and practical testing. Enables repeatable and structured testing of prompt injection vulnerabilities Provides a unified workflow for learning, testing, and improving prompt security. Supports multiple models and testing approaches, to cover the entire vulnerability safety. We demonstrated that prompt injection risks can be systematically identified, tested, and mitigated through a structured and repeatable approach. Lessons Learned Throughout the project, we identified several key insights that shaped both our technical approach and our understanding of AI security. AI is rapidly evolving, and systems must be designed accordingly. AI models and attack techniques are advancing extremely fast. As a result, static solutions are quickly becoming obsolete. We learned that it is essential to design a platform that is modular, extensible and adaptable. Through well-defined interfaces and generic services, we ensured our platform can evolve alongside attacks and mitigations. Security must be built into development, not considered at testing. Many developers are focusing on functionality first and security often takes a backseat. In the context of LLMs, vulnerabilities can fundamentally affect the security of the system and its users. As such, security should be treated as a core part of the development cycle. Models and external tools should only be connected if their safety is guaranteed. Bridging the gap between developers and security testers is necessary. We identified a major disconnect between developers building AI applications and the security testers evaluating them. These groups often operate with different priorities and levels of knowledge. We are bridging this gap by making prompt injection knowledge more accessible and creating workflows that are usable by developers while still grounded in robust security practices. Further Development While our platform provides a strong foundation for prompt injection testing and knowledge, there are several areas for future exploration: Expanding our testing framework integrations, by adding a broader coverage of attack techniques Integration with MCP servers and external systems, supporting interactions with tools, APIs and external data sources. Addressing additional indirect prompt injection vulnerabilities, including file uploads, website scraping, and multi-step workflows. Looking ahead, we also aim to integrate our platform more deeply into development workflows by introducing CI/CD integrations for continuous security testing and versioned tracking of model robustness over time. Our goal is to evolve the platform into a comprehensive security layer, capable of testing entire AI-driven systems in dynamic, real-world contexts. Conclusion As AI becomes increasingly integrated into real-world applications, ensuring their security is essential. As our research highlights, current practices have not kept pace with the rapid evolution of AI systems and attack techniques. Through our work, we demonstrated that prompt injection risks can be systematically identified, tested, and mitigated using a structured approach. By combining a unified knowledge base with a flexible testing platform powered by Microsoft Foundry, we are taking a step towards making AI systems safer and more reliable. More importantly, our project reinforces a broader idea: a developer-first approach to security, supported by collaboration across development, security, and safety disciplines, is essential for building AI at scale. Security should not remain confined to specialist teams but should be embedded directly into the development process, alongside practices such as red-teaming and continuous testing. Our project empowers teams with the knowledge and tools they need to build safer and more reliable AI systems. If you’re interested in building more secure AI systems or exploring prompt injection in practice, we invite you to join us through the Foundry Community on the 3rd of June at 2pm BST, when we will be showcasing our platform live, walking through real-world examples, and discussing how teams can integrate prompt injection testing into their development workflows. Team Teo Montero Bonet, UCL Computer Science Mario Mojarro Ruiz, UCL Computer Science David Thomas Garcia, UCL Computer Science Nathaniel Gibbon, UCL Computer Science With support from Josh McDonald, Avanade
teo-montero
May 13, 2026 Place Educator Developer Blog
51Views
0likes
0Comments