Microsoft Defender for Cloud Blog

22 MIN READ

The Microsoft Defender for AI Alerts

Microsoft

Dec 03, 2025

Microsoft Defender for AI Alerts (explanation, alignment with MITRE ATT&CK®, and mitigation plan)

I will start this blog post by thanking my Secure AI GBB Colleague Hiten Sharma for his contributions to this Tech Blog as a peer-reviewer.

Microsoft Defender for AI (part of Microsoft Defender) helps organizations threats to generative AI applications in real time and helps respond to security issues.

Microsoft Defender for AI is in General Availability state and covers Azure OpenAI supported models and Azure AI Model Inference service supported models deployed on Azure Commercial Cloud and provides Activity monitoring and prompt evidence for security teams.

This blog aims to help the Microsoft Defender for AI (the service) users understand the different alerts generated by the service, what they mean, how they align to the Mitre Att&ck Framework, and how to reduce the potential alert re-occurrences.

The 5 Generative AI Security Threats You Need to Know About

This section aims to give the reader an overview of the 5 Generative AI Security Threats every security professional needs to know about. For more details, please refer to “The 5 generative AI security threats you need to know about e-book”.

Poisoning Attacks

Poisoning attacks are adversarial attacks which target the training or fine-tuning data of generative AI models.

In a Poisoning Attack, the adversary injects biased or malicious data during the learning process with the intention of affecting the model’s behavior, accuracy, reliability, and ethical boundaries.

Evasion Attacks

Evasion Attacks are adversarial attacks where the adversary crafts inputs designed to bypass the security controls and model restrictions. This kind of attacks [Evasion Attacks] exploit the generative AI system in the model’s inference stage (In the context of generative AI, this is the stage where the model generates text, images, or other outputs in response to user inputs.).

In an Evasion Attack, the adversary does not modify the Generative AI model itself but rather adapts and manipulates prompts to avoid the model safety mechanisms.

Functional Extraction

Functional Extraction attacks are model extraction attacks where the adversary repeatedly interacts with the Generative AI system and observes the responses.

In a Functional Extraction attack, the adversary attempts to reverse-engineer or recreate the generative AI system without direct access to its infrastructure or training data.

Inversion Attack

Inversion Attacks are adversarial attacks where the adversary repeatedly interacts with the Generative AI system to reconstruct or infer sensitive information about the model and its infrastructure.

In an Inversion Attack, the adversary attempts to exploit what the Generative AI model have memorized from its training data.

Prompt Injection Attacks

Prompt Injection Attacks are evasion attacks where the adversary uses malicious prompts to override or bypass the AI system’s safety rules, policies, and intended behavior.

In a Prompt Injection Attack, the adversary embeds malicious instructions in a prompt (or a sequence of prompts) to trick the AI system into ignoring safety filters, generate harmful or restricted contents, or reveal confidential information (i.e. the Do Anything Now (DAN) exploit, which prompts LLMs to “do anything now.” More details about AI Jail Brake attempts, including DAN exploit can be found in this Microsoft Tech Blog Article).

The Microsoft Defender for AI Alerts

Microsoft Defender for AI works with Azure AI Prompt Shields (more details at Microsoft Foundry Prompt Shields documentation) and utilizes Microsoft’s Threat Intelligence to identify (in real-time) the threats impacting the monitored AI Services.

Below is a list of the different alerts Defender for AI generates, what they mean, how they align with the Mitre Att&ck Framework, and suggestion on how to reduce the potential of their re-occurrence. More details about these alerts can be found at Microsoft Defender for AI documentation.

Detected credential theft attempts on an Azure AI model deployment

Severity: Medium

Mitre Tactics: Credential Access, Lateral Movement, Exfiltration

Attack Type: Inversion Attack

Description: As per Microsoft Documentation “The credential theft alert is designed to notify the SOC when credentials are detected within GenAI model responses to a user prompt, indicating a potential breach. This alert is crucial for detecting cases of credential leak or theft, which are unique to generative AI and can have severe consequences if successful.”

How it happens: Credential Leakage in a Generative AI response typically occur because of training the model with data that contains credentials (i.e. Hardcoded secrets, API Keys, passwords, or configuration files that contain such information), this can also occur if the prompt triggers the AI System to retrieve the information from host system tools or memory.

How to avoid: The re-occurrence of this alert can be reduced by adapting the following:

Training Data Hygiene: Ensure that no credentials exist in the training data, this can be done by scanning for credentials and using secret-detection tools before training or fine-tuning the model(s) in use.
Guardrails and Filtering: Implementing output scanning (i.e. credential detectors, filters, etc…) to block responses that contain credentials. This can be addressed using various methods including custom content filters in Azure AI Foundry.
Adapt Zero Trust, including least privilege access to the run-time environment for the AI system, ensure that the AI System and its plugins has no access to secrets (more details at Microsoft’s Zero Trust site.)
Prompt Injection Defense: In addition to adapting the earlier recommendations, also use Azure AI Prompt Shields to identify and potentially block prompt injection attempts.

A Jailbreak attempt on an Azure AI model deployment was blocked by Azure AI Content Safety Prompt Shields

Severity: Medium

Mitre Tactics: Privilege Escalation, Defense Evasion

Attack Type: Prompt Injection Attack

Description: As per Microsoft Documentation “The Jailbreak alert, carried out using a direct prompt injection technique, is designed to notify the SOC there was an attempt to manipulate the system prompt to bypass the generative AI’s safeguards, potentially accessing sensitive data or privileged functions. It indicated that such attempts were blocked by Azure Responsible AI Content Safety (also known as Prompt Shields), ensuring the integrity of the AI resources and the data security.”

How it happens: This alert indicates that Prompt Shields (more details about prompt shields at Microsoft Foundry Prompt Shields documentation) have identified an attempt by an adversary to use a specially engineered input to trick the AI System into by passing its safety rules, guardrails, or content filters. In the case of this alert, Prompt Shields have detected and blocked the attempt, preventing the AI system from acting differently than its guardrails.

How to avoid: While this alert indicates that Prompt Shield has successfully blocked the Jailbreak attempt, additional measures can be taken to reduce the potential impact and re-occurrence of Jailbreak attempts:

Use Azure AI Prompt Shields: Real-Time detection is not a single use but rather a continuous use security measure. Continue using it and monitor alerts, (more details at Microsoft Foundry Prompt Shields documentation).
Use Retrieval Isolation: Retrieval Isolation separates user prompts from knowledge/retrieval sources (i.e. Knowledge Bases, Databases, Web search Agents, APIs, Documents), this isolation ensures that the model is not directly influencing what contents is retrieved, insures that malicious prompts cannot poison the knowledge/retrieval sources, and reduces the impact of malicious prompts that intend to coerce the system to retrieve sensitive or unsafe data.
Continuous testing: Using Red Teaming tools (i.e. Microsoft AI Read Team tools) and exercises, continuously test the AI system against Jail Break patterns and models and adjust security measures according to findings.
Adapt Zero Trust, including enforcing strong authentication and authorization measures where you verify explicitly, use least privilege access, and always assume breach to ensure the AI system cannot directly trigger actions, API calls, or sensitive operations without proper validation (more details at Microsoft’s Zero Trust site.

A Jailbreak attempt on an Azure AI model deployment was detected by Azure AI Content Safety Prompt Shields

Severity: Medium

Mitre Tactics: Privilege Escalation, Defense Evasion

Attack Type: Prompt Injection Attack

Description: As per Microsoft Documentation “The Jailbreak alert, carried out using a direct prompt injection technique, is designed to notify the SOC there was an attempt to manipulate the system prompt to bypass the generative AI’s safeguards, potentially accessing sensitive data or privileged functions. It indicated that such attempts were detected by Azure Responsible AI Content Safety (also known as Prompt Shields), but weren't blocked due to content filtering settings or due to low confidence.”

How it happens: This alert indicates that Prompt Shields have identified an attempt by an adversary to use a specially engineered input to trick the AI System into by passing its safety rules, guardrails, or content filters. In the case of this alert, Prompt Shields have detected the attempt but did not block it, the event is not blocked due to either the content filter settings configuration or low confidence.

How to avoid: While this alert indicates that Prompt Shield is enabled to protect the AI system and has successfully detected the Jailbreak attempt, additional measures can be taken to reduce the potential impact and re-occurrence of Jailbreak attempts:

Use Azure AI Prompt Shields: Real-Time detection is not a single use but rather a continuous use security measure. Continue using it and monitor alerts, (more details at Microsoft Foundry Prompt Shields documentation).
Use Retrieval Isolation: Retrieval Isolation separates user prompts from knowledge/retrieval sources (i.e. Knowledge Bases, Databases, Web search Agents, APIs, Documents), this isolation ensures that the model is not directly influencing what contents is retrieved, insures that malicious prompts cannot poison the knowledge/retrieval sources, and reduces the impact of malicious prompts that intend to coerce the system to retrieve sensitive or unsafe data.
Continuous testing: Using Red Teaming tools (i.e. Microsoft AI Read Team tools) and exercises, continuously test the AI system against Jail Break patterns and models and adjust security measures according to findings.
Adapt Zero Trust, including enforcing strong authentication and authorization measures where you verify explicitly, use least privilege access, and always assume breach to ensure the AI system cannot directly trigger actions, API calls, or sensitive operations without proper validation (more details at Microsoft’s Zero Trust site.

Corrupted AI application\model\data directed a phishing attempt at a user

Severity: High

Mitre Tactics: Impact (Defacement)

Attack Type: Poisoning Attack

Description: As per Microsoft Documentation “This alert indicates a corruption of an AI application developed by the organization, as it has actively shared a known malicious URL used for phishing with a user. The URL originated within the application itself, the AI model, or the data the application can access.”

How it happens: This alert indicates the AI system, its underlying model, or its knowledge sources were corrupted with malicious data and started returning the corrupted data in the form of phishing-style responses to the users. This can occur because of training data poisoning, an earlier successful attack that modified the system knowledge sources, tampered system instructions, or unauthorized access to the AI system itself.

How to avoid: This alert needs to be taken seriously and investigated accordingly, the re-occurrence of this alert can be reduced by adapting the following:

Strengthen model and data integrity controls, this includes hashing model artifacts (i.e. Model Weights, Tokenizers), signing model packages, and enforcing integrity checks during runtime.
Adapt Zero Trust, including enforcing strong authentication and authorization measures where you verify explicitly, across developer environments, CI/CD pipelines, knowledge sources, and deployment endpoints, (more details at Microsoft’s Zero Trust site.)
Implement data validation and data poisoning detection strategies on all incoming training and fine-tuning data.
Use Retrieval Isolation: Retrieval Isolation separates user prompts from knowledge/retrieval sources (i.e. Knowledge Bases, Databases, Web search Agents, APIs, Documents), this isolation ensures that the model is not directly influencing what contents is retrieved, insures that malicious prompts cannot poison the knowledge/retrieval sources, and reduces the impact of malicious prompts that intend to coerce the system to retrieve sensitive or unsafe data.
Continuous testing: Using Red Teaming tools (i.e. Microsoft AI Read Team tools) and exercises, continuously test the AI system against poisoning attempts, prompt injection attacks, and malicious tools invocation scenarios.

Phishing URL shared in an AI application

Severity: High

Mitre Tactics: Impact (Defacement), Collection

Attack Type: Prompt Injection

Description: As per Microsoft Documentation “This alert indicates a potential corruption of an AI application, or a phishing attempt by one of the end users. The alert determines that a malicious URL used for phishing was passed during a conversation through the AI application, however the origin of the URL (user or application) is unclear.”

How it happens: This alert indicates that a phishing URL was present in the interaction between the user and the AI System, this phishing URL might originate from a user prompt, as a result of malicious input in a prompt, generated by them model as a result of an earlier attack, or due to a poisoned knowledge source.

How to avoid: This alert needs to be taken seriously and investigated accordingly, the re-occurrence of this alert can be reduced by adapting the following:

Adapt URL scanning mechanism prior to returning any URL to users (i.e. check against Threat Intelligence, URL reputation sources) and content scanning mechanisms (This can be done using Azure Prompt Flows, or using Azure Functions).
Use Retrieval Isolation: Retrieval Isolation separates user prompts from knowledge/retrieval sources (i.e. Knowledge Bases, Databases, Web search Agents, APIs, Documents), this isolation ensures that the model is not directly influencing what contents is retrieved, insures that malicious prompts cannot poison the knowledge/retrieval sources, and reduces the impact of malicious prompts that intend to coerce the system to retrieve sensitive or unsafe data.
Filter and sanitize user prompts to prevent harmful or malicious URLs from being used or amplified by the AI system (This can be done using Azure Prompt Flows, or using Azure Functions).

Phishing attempt detected in an AI application

Severity: High

Mitre Tactics: Collection

Attack Type: Prompt Injection, Poisoning Attack

Description: As per Microsoft Documentation “This alert indicates a URL used for phishing attack was sent by a user to an AI application. The content typically lures visitors into entering their corporate credentials or financial information into a legitimate looking website. Sending this to an AI application might be for the purpose of corrupting it, poisoning the data sources it has access to, or gaining access to employees or other customers via the application's tools.”

How it happens: This alert indicates that a phishing URL was present in a prompt sent from the user to the AI System. When a user uses a phishing URL in a prompt, this can be an indicator of a user who is attempting to corrupt the AI system, corrupt its knowledge sources to compromise other users of the AI system, or a user who is trying to manipulate the AI system to use stored data, stored credentials or system tools in the phishing URL.

How to avoid: This alert needs to be taken seriously and investigated accordingly, the re-occurrence of this alert can be reduced by adapting the following:

Filter and sanitize user prompts to prevent harmful or malicious URLs from being used or amplified by the AI system (This can be done using Azure Prompt Flows, or using Azure Functions).
Use Retrieval Isolation: Retrieval Isolation separates user prompts from knowledge/retrieval sources (i.e. Knowledge Bases, Databases, Web search Agents, APIs, Documents), this isolation ensures that the model is not directly influencing what contents is retrieved, insures that malicious prompts cannot poison the knowledge/retrieval sources, and reduces the impact of malicious prompts that intend to coerce the system to retrieve sensitive or unsafe data.
Monitor anomalous behavior originating from the sources that have common connection characteristics.

Suspicious user agent detected

Severity: Medium

Mitre Tactics: Execution, Reconnaissance, Initial access

Attack Type: Multiple

Description: As per Microsoft Documentation “The user agent of a request accessing one of your Azure AI resources contained anomalous values indicative of an attempt to abuse or manipulate the resource. The suspicious user agent in question has been mapped by Microsoft threat intelligence as suspected of malicious intent and hence your resources were likely compromised.”

How it happens: This alert indicates that a user agent of a request that is accessing one of your Azure AI resources contains values that were mapped by Microsoft Threat Intelligence as suspected of Malicious intent. When this alert is present, it is indicative of an abuse or manipulation attempt. This does not necessarily mean that your AI System has been breached, however its an indication that an attack is being attempted and underway, or the AI system was already compromised.

How to avoid: Indicators from this alert need to reviewed, including other alerts that might help formulate a full understanding of the sequence of events taking place. Impact and re-occurrence of this alert can be reduced by adapting the following:

Review impacted AI systems to assess impact of the event on these systems.
Adapt Zero Trust, including enforcing strong authentication and authorization measures where you verify explicitly, use least privilege access, and always assume breach (more details at Microsoft’s Zero Trust site.)
Applying rate limiting and bot detection measures using services like Azure Management Gateway.
Apply comprehensive user agent filtering and restriction measures to protect your AI System from suspicious or malicious clients by enforcing user-agent filtering at the edge (i.e. using Azure Front door), the gateway (i.e. using Azure API Management), and identity layers to ensure only trusted, verified applications and devices can access your GenAI endpoints.
Enable Network Protection Measures (i.e. WAF, Reputation Filters, Geo Restrictions) to filter out traffic from IP addresses associated with Malicious actors and their infrastructure, to avoid traffic from geographies and locations known to be associated with malicious actors, and to eliminate traffic with other highly suspicious characteristics. This can be done using services like Azure Front Door, or Azure Web Application Firewall.

ASCII Smuggling prompt injection detected

Severity: Medium

Mitre Tactics: Execution, Reconnaissance, Initial access

Attack Type: Evasion Attack, Prompt Injection

Description: As per Microsoft Documentation “ASCII smuggling technique allows an attacker to send invisible instructions to an AI model. These attacks are commonly attributed to indirect prompt injections, where the malicious threat actor is passing hidden instructions to bypass the application and model guardrails. These attacks are usually applied without the user's knowledge given their lack of visibility in the text and can compromise the application tools or connected data sets.”

How it happens: This alert indicates an AI system has received a request that a attempted to circumvent system guardrails by embedding harmful instructions by using ASCII characters commonly used for prompt injection attacks. This alert can be caused by multiple reasons including: a malicious user who is attempting prompt manipulation, by an innocent user who is pasting a prompt that contains malicious hidden ASCII characters or instructions, or a knowledge source connected to the AI System that is adding the malicious ASCII characters to the user prompt.

How to avoid: Indicators from this alert should be reviewed, including other alerts that might help formulate a full understanding of the sequence of events taking place. Impact and re-occurrence of this alert can be reduced by adapting the following:

If the user involved in the incident is known, review their access grant (in Microsoft Entra), and ensure their device and accounts are not compromised starting with reviewing incidents and evidences in Microsoft Defender.
Normalize user input before sending it to the models of the AI system, this can be performed using a pre-processing ( i.e. using Azure Prompt Flows, or using Azure Functions, or using Azure API Management).
Strip (or block) suspicious ASCII patterns and hidden characters using a pre-processing layer (i.e. using Azure Prompt Flows, or using Azure Functions).
Use retrieval isolation to prevent smuggled ASCII from propagating to knowledge sources and tools, multiple retrieval isolation strategies can be adapted including separating user’s raw-input from system-safe input and utilizing the system-safe inputs as bases to build queries and populate fields (i.e. arguments) to invoke tools the AI System interacts with.
Using Red Teaming tools (i.e. Microsoft AI Read Team tools) and exercises, continuously test the AI system against ASCII smuggling attempts.

Access from a Tor IP

Severity: High

Mitre Tactics: Execution

Attack Type: Multiple

Description: As per Microsoft Documentation “An IP address from the Tor network accessed one of the AI resources. Tor is a network that allows people to access the Internet while keeping their real IP hidden. Though there are legitimate uses, it is frequently used by attackers to hide their identity when they target people's systems online.”

How it happens: This alert indicates that a user attempted to access the AI System using a TOR exit node. This can be an indicator of a malicious user attempting to hide the true origin of there connection source, whether to avoid geo fencing, or to conceal their identity while carrying on an attack against the AI system.

How to avoid: Impact and re-occurrence of this alert can be reduced by adapting the following:

Adapt Zero Trust, including enforcing strong authentication and authorization measures where you verify explicitly, use least privilege access, and always assume breach (more details at Microsoft’s Zero Trust site.)
Enable Network Protection Measures (i.e. WAF, Reputation Filters, Geo Restrictions) to prevent traffic from TOR exit nodes from reaching the AI System. This can be done using services like Azure Front Door, or Azure Web Application Firewall.

Access from a suspicious IP

Severity: High

Mitre Tactics: Execution

Attack Type: Multiple

Description: As per Microsoft Documentation “An IP address accessing one of your AI services was identified by Microsoft Threat Intelligence as having a high probability of being a threat. While observing malicious Internet traffic, this IP came up as involved in attacking other online targets.”

How it happens: This alert indicates that a user attempted to access the AI System from an IP address that was identified by Microsoft Threat Intelligence as suspicious. This can be an indicator of a malicious user or a malicious tool carrying on an attack against the AI system.

How to avoid: Impact and re-occurrence of this alert can be reduced by adapting the following:

Adapt Zero Trust, including enforcing strong authentication and authorization measures where you verify explicitly, use least privilege access, and always assume breach (more details at Microsoft’s Zero Trust site.)
Enable Network Protection Measures (i.e. WAF, Reputation Filters, Geo Restrictions) to prevent traffic from suspicious IP addresses from reaching the AI System. This can be done using services like Azure Front Door, or Azure Web Application Firewall.

Suspected wallet attack - recurring requests

Severity: Medium

Mitre Tactics: Impact

Attack Type: Wallet Attack

Description: As per Microsoft Documentation “Wallet attacks are a family of attacks common for AI resources that consist of threat actors excessively engage with an AI resource directly or through an application in hopes of causing the organization large financial damages. This detection tracks high volumes of identical requests targeting the same AI resource which may be caused due to an ongoing attack.”

How it happens: Wallet attacks are a category of attacks that attempt to exploit the usage-based billing, quota limits, or token-consumption of the AI System to inflect financial or operational harm on the AI system. This alert is an indicator of the AI System receiving repeated, or high-frequency, or patterned requests that are consistent with wallet attack attempts.

How to avoid: Impact and re-occurrence of this alert can be reduced by adapting the following:

Adapt Zero Trust, including enforcing strong authentication and authorization measures where you verify explicitly, use least privilege access, and always assume breach (more details at Microsoft’s Zero Trust site.)
Enable Network Protection Measures (i.e. WAF, Reputation Filters, Geo Restrictions) to prevent traffic from known malicious actors known IP addresses and infrastructure from reaching the AI System. This can be done using services like Azure Front Door, or Azure Web Application Firewall.
Apply rate-limiting and throttling to connection attempts to the AI System using Azure API Management.
Enable Quotas, strict usage caps, and cost guardrails using Azure API Management, using Azure Foundry Limits and Quotas, and Azure cost management.
Implement client-side security measures (i.e. tokens, signed requests) to prevent bots from imitating legitimate users. There are multiple approaches to adapt (collectively) to achieve this, for example by using Entra ID Tokens for authentication instead of using a simple API key from the front end.

Suspected wallet attack - volume anomaly

Severity: Medium

Mitre Tactics: Impact

Attack Type: Wallet Attack

Description: As per Microsoft Documentation “Wallet attacks are a family of attacks common for AI resources that consist of threat actors excessively engage with an AI resource directly or through an application in hopes of causing the organization large financial damages. This detection tracks high volumes of requests and responses by the resource that are inconsistent with its historical usage patterns.”

How it happens: Wallet attacks are a category of attacks that attempt to exploit the usage-based billing, quota limits, or token-consumption of the AI System to inflect financial or operational harm on the AI system. This alert is an indicator of the AI system experiencing an abnormal volume of interactions exceeding normal usage patterns, which can be caused by automated scripts, bots, or coordinated efforts that are attempting to impose financial and / or operational harm on the AI System.

How to avoid: Impact and re-occurrence of this alert can be reduced by adapting the following:

Adapt Zero Trust, including enforcing strong authentication and authorization measures where you verify explicitly, use least privilege access, and always assume breach (more details at Microsoft’s Zero Trust site.)
Enable Network Protection Measures (i.e. WAF, Reputation Filters, Geo Restrictions) to prevent traffic from known malicious actors known IP addresses and infrastructure from reaching the AI System. This can be done using services like Azure Front Door, or Azure Web Application Firewall.
Apply rate-limiting and throttling to connection attempts to the AI System using Azure API Management.
Enable Quotas, strict usage caps, and cost guardrails using Azure API Management, using Azure Foundry Limits and Quotas and Azure cost management.
Implement client-side security measures (i.e. tokens, signed requests) to prevent bots from imitating legitimate users. There are multiple approaches to adapt (collectively) to achieve this, for example by using Entra ID Tokens for authentication instead of using a simple API key from the front end.

Access anomaly in AI resource

Severity: Medium

Mitre Tactics: Execution, Reconnaissance, Initial access

Attack Type: Multiple

Description: As per Microsoft Documentation “This alert track anomalies in access patterns to an AI resource. Changes in request parameters by users or applications such as user agents, IP ranges, authentication methods. can indicate a compromised resource that is now being accessed by malicious actors. This alert may trigger when requests are valid if they represent significant changes in the pattern of previous access to a certain resource.”

How it happens: This alert indicates that a shift in connection and interaction patterns was detected compared to the established baseline of connections and interactions with the AI Systems. This alert can be an indicator of probing events or can be an indicator of a compromised AI System that is now being abused by the malicious actor.

How to avoid: Impact and re-occurrence of this alert can be reduced by adapting the following:

Adapt Zero Trust, including enforcing strong authentication and authorization measures where you verify explicitly, use least privilege access, and always assume breach (more details at Microsoft’s Zero Trust site.)
If exposure is suspected, rotate API Keys and Secrets (more details on how to rotate API Keys in Azure Foundry Documentation).
Enable Network Protection Measures (i.e. WAF, Reputation Filters, Geo Restrictions, Conditional Access Controls) to prevent similar traffic from reaching the AI System. Restrictions can be implemented using services like Azure Front Door, or Azure Web Application Firewall.
Apply rate-limiting and anomaly detection measures to block unusual request bursts or abnormal access patterns. Rate limiting can be implemented using Azure API Management, Anomaly detection can be performed using AI Real-Time monitoring tools like Microsoft Defender for AI and Security Operations platforms like Microsoft Sentinel where rules can later be created to trigger automations and playbooks that can update the Azure WAF and APIM to block or rate limit traffic from a certain origin.

Suspicious invocation of a high-risk 'Initial Access' operation by a service principal detected (AI resources)

Severity: Medium

Mitre Tactics: Initial access

Attack Type: Identity-based Initial Access Attack

Description: As per Microsoft Documentation “This alert detects a suspicious invocation of a high-risk operation in your subscription, which might indicate an attempt to access restricted resources. The identified AI-resource related operations are designed to allow administrators to efficiently access their environments. While this activity might be legitimate, a threat actor might utilize such operations to gain initial access to restricted AI resources in your environment. This can indicate that the service principal is compromised and is being used with malicious intent.”

How it happens: This alert indicates that an AI System was involved in a highly privileged operation against the run-time environment of the AI System using legitimate credentials. While this might be an intended behavior (regardless of the validity of this design from a security standpoint), this can also be an indicator of an attack against the AI system where the malicious actor has successfully circumvented the AI System guardrails and influenced the AI System to operate beyond its intended behavior. When performed by a malicious actor, this event is expected to be a part of a multi-stage attack against the AI System.

How to avoid: Impact and re-occurrence of this alert can be reduced by adapting the following:

Upon detection, immediately rotate impacted accounts secrets and certificates.
To ensure the AI system cannot directly trigger actions, API calls, or sensitive operations without proper validation, Adapt Zero Trust, including enforcing strong authentication and authorization measures where you verify explicitly, use least privilege access, and always assume breach (more details at Microsoft’s Zero Trust site.).
As a part of adapting Zero Trust strategy, enforce managed identities usage instead of relying on long-lived credentials, such as Entra managed IDs for Azure.
Use conditional access measures (i.e. Entra conditional access) to limit where and how service principals can authenticate into the system.
Enforce a training data hygiene practice to ensure that no credentials exist in the training data, this can be done by scanning for credentials and using secret-detection tools before training or fine-tuning the model(s) in use.
Use retrieval isolation to prevent similar events from propagating to knowledge sources and tools, multiple retrieval isolation strategies can be adapted including separating user’s raw-input from system-safe input and utilizing the system-safe inputs as bases to build queries and populate fields (i.e. arguments) to invoke tools the AI System interacts with.

Anomalous tool invocation

Severity: Low

Mitre Tactics: Execution

Attack Type: Prompt Injection, Evasion Attack

Description: As per Microsoft Documentation “This alert analyzes anomalous activity from an AI application connected to an Azure OpenAI model deployment. The application attempted to invoke a tool in a manner that deviates from expected behavior. This behavior may indicate potential misuse or an attempted attack through one of the tools available to the application.”

How it happens: This alert indicates that the AI System has invoked a tool or a downstream capability in behavior pattern that deviates from its expected behavior. This event can be an indicator that a malicious user have managed to provide a prompt (or series of prompts) that have circumvented the AI System defenses and guardrails and as a result caused the AI System to call tools it should not call or caused it to use tools it has access to in an abnormal way.

How to avoid: Impact and re-occurrence of this alert can be reduced by adapting the following:

Adapt Zero Trust, including enforcing strong authentication and authorization measures where you verify explicitly, use least privilege access, and always assume breach (more details at Microsoft’s Zero Trust site.)
In addition to Prompt Shields, use input sanitization in the AI System to block malicious prompts and sanitize ASCII smuggling attempts using a pre-processing layer (i.e. using Azure Prompt Flows, or using Azure Functions).
Use retrieval isolation to prevent similar events from propagating to knowledge sources and tools, multiple retrieval isolation strategies can be adapted including separating user’s raw-input from system-safe input and utilizing the system-safe inputs as bases to build queries and populate fields (i.e. arguments) to invoke tools the AI System interacts with.
Implement functional guardrails to separate model reasoning from tool-execution, multiple strategies can be adapted to implement function guardrails including retrieval isolation (discussed earlier) and separating the decision making layer to call a tool from the LLM itself. In this case, the LLM will receive the user prompt (request and context) and will then reason that it need to invoke a specific tool, then the request is sent to an orchestration layer that will validate the request and run policy and safety checks, and then initiates the tool execution.

Blog Post

The Microsoft Defender for AI Alerts

Microsoft Defender for AI Alerts (explanation, alignment with MITRE ATT&CK®, and mitigation plan)

The 5 Generative AI Security Threats You Need to Know About

Poisoning Attacks

Evasion Attacks

Functional Extraction

Inversion Attack

Prompt Injection Attacks

The Microsoft Defender for AI Alerts

Detected credential theft attempts on an Azure AI model deployment

A Jailbreak attempt on an Azure AI model deployment was blocked by Azure AI Content Safety Prompt Shields

A Jailbreak attempt on an Azure AI model deployment was detected by Azure AI Content Safety Prompt Shields

Corrupted AI application\model\data directed a phishing attempt at a user

Phishing URL shared in an AI application

Phishing attempt detected in an AI application

Suspicious user agent detected

ASCII Smuggling prompt injection detected

Access from a Tor IP

Access from a suspicious IP

Suspected wallet attack - recurring requests

Suspected wallet attack - volume anomaly

Access anomaly in AI resource

Suspicious invocation of a high-risk 'Initial Access' operation by a service principal detected (AI resources)

Anomalous tool invocation

Suggested Additional Reading:

1 Comment