Latest Discussions
Detecting AI agents and non-human identities in Microsoft Sentinel: the classic-agent blind spot
Build 2026 made the direction official. The industry is moving from the app era into the agent era, and Microsoft spent a real share of the keynote on securing agents across their lifecycle, from discovering what is exploitable to governing what is running in production. On the identity side the centerpiece is Microsoft Entra Agent ID, now generally available, which gives AI agents first-class identities and extends Conditional Access, Identity Protection, and full audit logging to them. That is good news for agents you build the new way. It is not the whole picture, and the gap is where most SOCs will get hurt first.

Modern agents are covered. Classic agents are not.

Entra Agent ID draws a hard line between two kinds of agent.

Modern agents are created through the Agent ID platform, each backed by an agent identity blueprint. They carry a proper Agent ID, a full audit trail, and the complete set of governance capabilities, including Identity Protection for Agents, which establishes a baseline for an agent's normal activity and flags anomalies automatically.

Classic agents are everything that came before, or that gets built outside the platform: AI agents implemented as ordinary service principals or app registrations, for example Copilot Studio agents created before Agent ID was enabled, or any home-grown automation calling Graph with client credentials. In the Entra agent registry they appear with "Has Agent ID: No," and that flag matters, because the Agent ID protections apply to identities that actually hold an Agent ID. Classic agents sit outside Identity Protection for Agents and Conditional Access for Agents.

Here is the uncomfortable part. The non-human identities you already run, the service principals behind your pipelines, your integrations, your scripts, your pre-platform Copilot Studio bots, are almost all classic agents.
They tend to outnumber your human accounts, they have no MFA in any meaningful sense, and a credential added to one does not show up in the Azure portal. The new platform protections do not reach them. Until you migrate them, the only place you get detection coverage on that population is your SIEM. So this is the job Sentinel does that Agent ID does not: detect risky behavior on the classic, service-principal-backed agents that the platform cannot yet protect.

The telemetry you have, and the one switch people forget

Three tables carry most of the signal.

AADServicePrincipalSignInLogs records service principal authentications, the client-credentials sign-ins your agents and automation use. No user, no MFA, just an app proving it holds a secret or certificate.

AADManagedIdentitySignInLogs does the same for managed identities.

AuditLogs records directory changes, including the one that matters most for persistence: a new credential added to an application or service principal.

One practical warning before any of this works. Service principal and managed identity sign-in logs are not streamed by default. You have to enable those categories explicitly in the Entra diagnostic settings feeding your workspace. Plenty of teams write the detection, never check, and never notice the table is empty. Verify that first.

Detection 1: a new credential on a service principal or app

Adding a secret or certificate to an existing service principal is one of the cleanest persistence techniques in a Microsoft cloud. The attacker compromises a privileged user or app, drops a fresh credential on a service principal that already holds useful Graph permissions, and now has access that survives password resets and session revocation. It maps to MITRE T1098.001, Account Manipulation: Additional Cloud Credentials. For a classic agent it is especially nasty, because there is no Identity Protection baseline watching it.
    // Detection 1: new secret or certificate added to an application or service principal
    // MITRE T1098.001 - Account Manipulation: Additional Cloud Credentials
    AuditLogs
    | where OperationName has_any ("Add service principal", "Certificates and secrets management")
    | where Result =~ "success"
    | extend Initiator = coalesce(
        tostring(InitiatedBy.user.userPrincipalName),
        tostring(InitiatedBy.app.displayName))
    | extend InitiatorIp = tostring(InitiatedBy.user.ipAddress)
    | mv-apply Target = TargetResources on (
        // Credential changes can target either an Application or a ServicePrincipal object
        where tostring(Target.type) in~ ("Application", "ServicePrincipal")
        | extend TargetName = tostring(Target.displayName),
                 TargetId = tostring(Target.id),
                 KeyChanges = Target.modifiedProperties
        )
    | mv-apply Prop = KeyChanges on (
        where tostring(Prop.displayName) =~ "KeyDescription"
        | extend NewKeys = parse_json(tostring(Prop.newValue)),
                 OldKeys = parse_json(tostring(Prop.oldValue))
        )
    // Keep only events where the new key list contains an entry the old list did not
    | extend AddedKeys = set_difference(NewKeys, OldKeys)
    | where array_length(AddedKeys) > 0
    | project TimeGenerated, Initiator, InitiatorIp, TargetName, TargetId, AddedKeys
    | order by TimeGenerated desc

The operation filter catches the three shapes this event takes in the log: "Add service principal," "Add service principal credentials," and "Update application - Certificates and secrets management." The modifiedProperties parsing isolates the KeyDescription change, and set_difference confirms a key was actually added rather than removed, so rotating out an old credential does not, on its own, fire the rule.

False positives come from legitimate rotation and from automation that provisions app credentials (CI/CD, infrastructure as code). The initiator is the discriminator. A credential added by your deployment pipeline's service account at the usual time is routine. The same change initiated by an interactive admin out of hours, or by an account that never normally touches app credentials, is what you want to surface. Allow-list the expected initiators, not the targets.
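The allow-listing advice above can be sketched as a filter layered on the detection's output. This is a minimal illustration rather than part of the original rule: the `ExpectedInitiators` list, the account names, and the sample rows are all hypothetical, and the `datatable` stands in for the projected results of Detection 1 so the snippet runs on its own.

```kql
// Hypothetical allow-list of initiators that routinely manage app credentials.
let ExpectedInitiators = dynamic(["svc-deploy@contoso.com", "Contoso Terraform Pipeline"]);
// Hypothetical sample rows standing in for the projected output of Detection 1.
let Detection1Results = datatable(TimeGenerated: datetime, Initiator: string, TargetName: string)
[
    datetime(2026-06-01 09:05:00), "svc-deploy@contoso.com", "billing-api",
    datetime(2026-06-02 23:40:00), "admin.jane@contoso.com", "hr-sync-agent"
];
Detection1Results
// Filter on who made the change, not on which application was changed.
| where Initiator !in~ (ExpectedInitiators)
| project TimeGenerated, Initiator, TargetName
```

With these sample rows only the change initiated by admin.jane@contoso.com remains. In a real workspace the same `where` clause would be appended to the detection itself, and a watchlist is probably a more maintainable home for the list than an inline array.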
Detection 2: a classic agent signing in from a first-seen IP

A service principal that has only ever authenticated from your Azure regions and suddenly signs in from somewhere new is a strong signal that its credential has been lifted and is being used elsewhere. Service principals have stable, boring network behavior, which makes a first-seen IP a far cleaner indicator for them than it is for roaming human users. This is the behavioral baseline Identity Protection gives you for free on modern agents, rebuilt in KQL for the classic ones it ignores. It maps to MITRE T1078.004, Valid Accounts: Cloud Accounts.

    // Detection 2: classic-agent service principal signing in from a previously unseen IP
    // MITRE T1078.004 - Valid Accounts: Cloud Accounts
    let baseline = 14d;
    let detection = 1d;
    let KnownIPs = AADServicePrincipalSignInLogs
        | where TimeGenerated between (ago(baseline + detection) .. ago(detection))
        | where tostring(ResultType) == "0"
        | summarize KnownIPSet = make_set(IPAddress) by AppId;
    AADServicePrincipalSignInLogs
    | where TimeGenerated > ago(detection)
    | where tostring(ResultType) == "0"
    | lookup kind=leftouter KnownIPs on AppId
    // A null KnownIPSet means the AppId has no baseline yet; keep those rows explicitly
    | where isnull(KnownIPSet) or not(set_has_element(KnownIPSet, IPAddress))
    | summarize FirstSeen = min(TimeGenerated), Resources = make_set(ResourceDisplayName, 10)
        by ServicePrincipalName, AppId, IPAddress
    | order by FirstSeen desc

The query builds a per-application baseline of source IPs over the previous two weeks, then flags any successful sign-in in the last day from an address outside that set.

Two tuning notes. Brand-new service principals have no baseline, so they surface on first use. That is usually worth seeing once, but you can exclude AppIds younger than the baseline window if it gets noisy. And if your agents egress through shifting cloud IP ranges, widen the comparison from an exact IP to the autonomous system number or a known-range allow-list, otherwise you will chase your own infrastructure.

This complements Agent ID; it does not replace it
The endgame is not to run these rules forever. It is to shrink the population they apply to. Inventory your tenant for agents marked "Has Agent ID: No," prioritize the ones holding sensitive Graph permissions, and migrate them onto the Agent ID platform, where Identity Protection and Conditional Access take over the baselining you are doing here by hand. Microsoft has signaled a migration path from classic to modern agents. Treat these two detections as the coverage you need in the meantime, and as a permanent safety net for anything that never makes the move.

If you do one thing this week: enable the service principal sign-in log category, deploy detection 1, and pull a list of every service principal that had a credential added in the last 90 days. That list alone tends to be more interesting than people expect.

Cheers,
Marcel

Marcel_Graewer | Jun 09, 2026 | Brass Contributor | 131 Views | 0 likes | 0 Comments

MSSP migration to Unified portal: how are you sequencing your customer portfolio?
Following the automation and SOAR discussion, I wanted to open a conversation specifically focused on the MSSP and multi-tenant side of the migration, because this is where the coordination challenges are an order of magnitude higher than the technical ones. A few things I am working through before writing this up as Part 5 of the migration series.

On Workspace Manager: Microsoft's own documentation now points you away from Workspace Manager at the point of onboarding to the Defender portal, directing you to Microsoft Defender multitenant management instead. For MSSPs who built their operating model around Workspace Manager, this is a significant structural change. For those implementing now, the recommendation is to go straight to the multitenant portal. I am interested in what the transition has looked like in practice for teams who were mid-flight on Workspace Manager when this became clear.

On access delegation: one of the more honest framings I want to include in the article is around the GDAP plus Unified RBAC gap. A Microsoft employee confirmed in the RSAC 2026 thread that Unified RBAC support for GDAP in the Defender portal is on the roadmap with no firm date. MSSPs choosing between Entra B2B and the governance relationships model today are making an architectural call that is difficult to reverse. I want to present this accurately, and real experience from practitioners will sharpen that framing.

On the connector deployment constraint: you cannot deploy connectors from a managed workspace configured with Azure Lighthouse alone; you also need GDAP. This makes a layered delegation architecture, Lighthouse plus GDAP plus B2B or governance relationships, necessary rather than optional. I am curious whether MSSPs are already running this layered model or whether most are still trying to make Lighthouse work as a single mechanism.

On migration sequencing: the question I want to ask specifically is how teams are structuring their customer portfolio migration.
Are you running waves based on customer complexity, contract renewal timing, customer risk appetite, or some other factor? And when something goes wrong in one tenant's migration, how are you containing the impact on the rest of the programme?

I will share the full article once it is written. Happy to discuss anything above in more detail in the thread.

AnthonyPorter | Jun 07, 2026 | Brass Contributor | 43 Views | 0 likes | 0 Comments

SentinelHealth: Scheduled Rule Retry Logging Does Not Match Docs
## Objective

I am working on a health checks architecture for Microsoft Sentinel analytic rules. The goal is to build a set of monitoring queries/approaches that cover rule execution failures, configuration issues (entity mapping, partial success), rule audit tracking, and auto-disabled rule detection.

## My Current Approach

So far I have built monitoring for the following areas using the SentinelHealth and SentinelAudit tables:

- Scheduled rule window failures (retry exhaustion)
- NRT rule execution delays (cumulative delay over 25 minutes)
- Partial success and configuration issues (entity mapping drops, alert size limits, semantic errors) with transient error codes filtered out
- Auto-disabled rules detection
- Rule disable/delete audit tracking via SentinelAudit + AzureActivity

## The Issue: Scheduled Rule Retry Logging

The documentation at https://learn.microsoft.com/en-us/azure/sentinel/monitor-analytics-rule-integrity#scheduled-rules states that when a scheduled rule fails, it is retried 5 more times on the same window (6 total attempts). It also provides this query to detect completely skipped windows:

```kql
_SentinelHealth()
| where SentinelResourceType == @"Analytics Rule"
| where SentinelResourceKind == "Scheduled"
| where Status != "Success"
| extend startTime = tostring(ExtendedProperties["QueryStartTimeUTC"])
| summarize failuresByStartTime = count() by startTime, SentinelResourceId
| where failuresByStartTime == 6
| summarize count() by SentinelResourceId
```

This query assumes that each retry attempt is logged as a separate event in SentinelHealth, all sharing the same QueryStartTimeUTC. You would then count 6 failure records per startTime to identify a fully skipped window.

However, in practice I am seeing different behavior. I ran a diagnostic query with a 90-day lookback (480 non-success events total, 73 unique rules). Every single event had a count of 1 per unique (SentinelResourceName, startTime) combination.
No grouping of retries was observed at all. I then found an actual failed-window event that confirms this. Here is the record:

- Rule: Port scan detected (ASIM Network Session schema)
- Status: Failure
- Description: "Rule's scheduled run at 06/01/2026 10:43:55 failed after numerous attempts. It will be re-executed over the next scheduled time."
- Issue Code: SemanticErrorInQuery
- Only 1 SentinelHealth record exists for this failed window

The Description field says "failed after numerous attempts", which indicates the retries happened internally, but only one consolidated Failure event was written to SentinelHealth after all retries were exhausted. The individual retry attempts do not appear as separate records. This means the failuresByStartTime == 6 query from the documentation would never match this pattern, because there is only 1 record per failed window, not 6.

## Why This Matters

Yes, completely skipped windows are rare. In my 90-day dataset most failures were permanent types (SemanticErrorInQuery, QueryGeneralError) that would not benefit from retries anyway. But they still happen, and if a tenant experiences a transient issue that causes a higher rate of failed windows, the documented query would silently return nothing.

For my health checks I have rewritten the detection to simply look for Status == "Failure" with Description containing "failed after numerous attempts", which matches the actual consolidated event Sentinel writes.

## Questions

1. Is the documented failuresByStartTime == 6 query still accurate? Or has the retry logging behavior changed to write a single consolidated event per failed window?
2. Are there specific failure types or conditions where individual retries are logged as separate events? Perhaps transient failures behave differently from permanent ones in this regard?
3. For anyone else building health monitoring on SentinelHealth: am I missing any important use cases beyond what I described above?
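For reference, the rewritten detection described under "Why This Matters" might look roughly like the sketch below. It is an approximation under stated assumptions, not the exact query in use: the `HealthSample` table is a hypothetical stand-in for `_SentinelHealth()` so the snippet runs anywhere, the timestamps and the second sample row are invented, and only columns named in this post are used.

```kql
// Hypothetical sample rows standing in for _SentinelHealth();
// in a workspace, replace HealthSample with _SentinelHealth().
let HealthSample = datatable(TimeGenerated: datetime, SentinelResourceName: string,
    SentinelResourceKind: string, Status: string, Description: string, ExtendedProperties: dynamic)
[
    datetime(2026-06-01 10:50:00), "Port scan detected (ASIM Network Session schema)", "Scheduled", "Failure",
        "Rule's scheduled run at 06/01/2026 10:43:55 failed after numerous attempts. It will be re-executed over the next scheduled time.",
        dynamic({"QueryStartTimeUTC": "2026-06-01T10:38:55Z"}),
    datetime(2026-06-01 11:00:00), "Example healthy rule", "Scheduled", "Success", "", dynamic({})
];
HealthSample
| where SentinelResourceKind == "Scheduled"
| where Status == "Failure"
// Match the single consolidated event written once retries are exhausted
| where Description contains "failed after numerous attempts"
| extend startTime = tostring(ExtendedProperties["QueryStartTimeUTC"])
// One record per failed window, so count distinct windows instead of expecting 6 rows
| summarize FailedWindows = dcount(startTime), LastFailure = max(TimeGenerated) by SentinelResourceName
```

Matching on the Description text is brittle if the wording ever changes, so it may be worth pairing this with a broader count of non-success events per rule as a fallback.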
Any clarification would be appreciated.

SomeZnimav | Jun 03, 2026 | Copper Contributor | 30 Views | 0 likes | 0 Comments

Sentinel SOAR migration to Unified portal: what broke? anyone evaluated the AI playbook generator?
I want to open a conversation specifically focused on the automation and SOAR side of the migration, because this is the area where problems most commonly surface after onboarding rather than during it.

A quick orientation: the Unified portal introduces a specific constraint that catches teams by surprise. Alert-triggered automation for alerts created by Microsoft Defender XDR is not available in the Defender portal. The main use case for alert-triggered automation in this context is responding to alerts from analytics rules where incident creation is disabled. If you had alert-triggered playbooks firing on Defender XDR signals, those need to be re-evaluated against the incident trigger model. This is documented by Microsoft, but it is easy to miss in the volume of migration guidance.

The automation failure mode I have seen most consistently: automation rules built around incident title conditions. The Defender XDR correlation engine assigns its own incident names, so any condition keyed to "if incident title contains X" stops matching without throwing an error. The rule is still active, the automation is still enabled, and everything looks fine until someone notices a class of enrichment or response has gone quiet. Microsoft's recommendation is to use Analytic rule name as the condition instead.

There is also a firm near-term deadline separate from the March 2027 portal retirement: queries and automation need to be updated by July 1, 2026 for standardised account entity naming. The Name field will consistently hold only the UPN prefix from that date. Any automation comparing AccountName against a full UPN will break.

A few specific questions for practitioners:

When you onboarded or reviewed your automation post-onboarding, what broke silently versus what produced a visible error? Silent failures are the dangerous ones and sharing specific patterns would be genuinely useful for the community.
Has anyone evaluated the new AI playbook generator in the Defender portal? It requires Security Copilot with SCUs available and generates Python-based automation coauthored with Cline in an embedded VS Code environment. I am interested in real-world comparisons against existing Logic Apps workflows for the same use case.

For those who have migrated alert-triggered playbooks to automation rule invocation: did you find edge cases in the migration, particularly around playbooks used by multiple analytics rules simultaneously?

I am writing this up as Part 4 of the migration series and will share the article link once it is live for anyone who wants the full detail.

AnthonyPorter | May 14, 2026 | Brass Contributor | 171 Views | 0 likes | 2 Comments

XdrLogRaider Defender XDR portal telemetry
XdrLogRaider is a Microsoft Sentinel custom data connector that ingests Microsoft Defender XDR portal-only telemetry (configuration, compliance, drift, exposure, governance) that public Microsoft APIs (Graph Security, Microsoft 365 Defender, MDE) don't expose.

https://github.com/akefallonitis/xdrlograider

Happy Hunting 🥳 🎉

alkefallonitis | May 08, 2026 | Copper Contributor | 88 Views | 0 likes | 2 Comments

Sentinel Foundry - MCP Server (Preview) (Github Community Release)
I’ve been cooking up something for a problem a lot of people in SOCs have been struggling with, especially on the engineering side of Microsoft Sentinel. Thanks to the Microsoft Security team for making Sentinel even more capable with Sentinel Data Lake and Modern SecOps. Today’s the day I can finally share it.

Note: This is not an official Microsoft product, but it is designed to complement Sentinel and make building on it more intelligent.

🚀 Sentinel Foundry is now in public preview with 43 tools. (Sentinel Foundry - MCP Server)

It’s an MCP server built to act like the brain of a strong Sentinel engineer, helping make building, improving, and operating Sentinel far more practical, faster, and honestly more enjoyable.

For a lot of teams, the challenge is not understanding what Sentinel can do. The hard part is the engineering work around it:

-> Deciding what data should actually be ingested
-> Building a clean, scalable Sentinel foundation
-> Writing useful detections instead of noisy ones
-> Balancing security value with cost
-> Turning ideas into deployable engineering outputs

That is exactly why I built Sentinel Foundry: to help communities grow stronger. It helps with the real engineering tasks behind Sentinel, from architecture thinking to detection design, deployment planning, ingestion strategy, automation ideas, and many of the workflows outlined in the GitHub project.

How does it work?

Here’s one of the flagship prompts I ran with it: “Give me a complete security posture report for our workspace. Score each pillar and tell me what to prioritise.”

And within seconds, it produced a structured engineering blueprint that would normally take a lot longer to pull together manually.
You can see example prompts showing what it can do here: https://github.com/prabhukiranveesam/Sentinel-Foundry#what-can-it-do

I want building Sentinel to feel less like repetitive engineering overhead and more like real security engineering that is fast, creative, and enjoyable. If you work with Sentinel as a SOC L2 analyst, engineer, detection engineer, consultant, or architect, I’d genuinely love for you to try it and tell me what you think.

🔗 Public Preview: https://github.com/prabhukiranveesam/Sentinel-Foundry

This is just the start of an AI era, and I’m excited to keep shaping it with more powerful features over the coming days. It is very easy to set up and will be available to all of you at no cost during this month as part of the public preview. Your feedback is extremely valuable in shaping it into a powerful solution.

421 Views | 0 likes | 1 Comment

Sentinel RBAC in the Unified portal: who has activated Unified RBAC, and how did it go?
Following the RSAC 2026 announcements last month, I have been working through the full permission picture for the Unified portal and wanted to open a discussion here given how much has shifted in a short period.

A quick framing of where things stand. The baseline is still that Azure RBAC carries across for Sentinel SIEM access when you onboard, no changes required. But there are now two significant additions in public preview: Unified RBAC for Sentinel SIEM itself (extending the Defender Unified RBAC model to cover Sentinel directly), and a new Defender-native GDAP model for non-CSP organisations managing delegated access across tenants.

The GDAP piece in particular is worth discussing carefully, because I want to be precise about what has and has not changed. The existing limitation from Microsoft's onboarding documentation, that GDAP with Azure Lighthouse is not supported for Sentinel data in the Defender portal, has not changed. What is new is a separate, Defender-portal-native GDAP mechanism announced at RSAC. These are not the same capability. If you were using Entra B2B as the interim path based on earlier guidance, that guidance was correct and that path remains the generally available option today.

A few things I would genuinely like to hear from practitioners:

For those who have activated Unified RBAC for a Sentinel workspace in the Defender portal: what did the migration from Azure RBAC roles look like in practice? Did the import function bring roles across cleanly, or did you find gaps, particularly around custom roles?

For environments using Playbook Operator, Automation Contributor, or Workbook Contributor role assignments: how are you handling the fact those three roles are not yet in Unified RBAC and still require Azure portal management? Is the dual-management posture creating operational friction?
For MSSPs evaluating the new Defender-native GDAP model against their existing Entra B2B setup: what factors are driving the decision either way at your scale?

I am writing this up as Part 3 of the migration series, and the community experience here is directly useful for making sure the practitioner angle is grounded.

Solved | AnthonyPorter | Apr 20, 2026 | Brass Contributor | 256 Views | 0 likes | 3 Comments

Stuck looking up a watchlist value
Hiya,

I get stuck working with watchlists sometimes. In this example, I'm wanting to focus on account activity from a list of UPNs. If I split the elements up, I get the individual results, but can't seem to pull it all together.

=====================================================
In its entirety, the query returns zero results:

let ServiceAccounts=(_GetWatchlist('ServiceAccounts_Monitoring'))| project SearchKey;
let OpName = dynamic(['Reset password (self-service)','Reset User Password','Change user password','User reset password','User started password reset','Enable Account','Change password (self-service)','Update PasswordProfile','Self-service password reset flow activity progress']);
AuditLogs
| where OperationName has_any (OpName)
| extend upn = TargetResources.[0].userPrincipalName
| where upn in (ServiceAccounts) //<=This is where I think I'm wrong
| project upn

=====================================================
This line on its own, returns the user on the list:

let ServiceAccounts=(_GetWatchlist('ServiceAccounts_Monitoring'))| project SearchKey;

=====================================================
This section on its own, returns all the activity:

let OpName = dynamic(['Reset password (self-service)','Reset User Password','Change user password','User reset password','User started password reset','Enable Account','Change password (self-service)','Update PasswordProfile','Self-service password reset flow activity progress']);
AuditLogs
| where OperationName has_any (OpName)
| extend upn = TargetResources.[0].userPrincipalName
| where upn contains "username" //This is the name on the watchlist - so I know the activity exists

====================================================
I'm doing something wrong when I'm trying to use the watchlist cache (I think)

Any help\guidance or wisdom would be greatly appreciated!
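One likely cause, offered as a sketch rather than a confirmed diagnosis: a value pulled out of TargetResources is still of type dynamic, while `in` compares strings and is case-sensitive, so casting with `tostring()` and switching to `in~` is a common fix. In the snippet below the two `datatable` definitions and every sample value are hypothetical stand-ins for the watchlist and for AuditLogs, so it can be run anywhere.

```kql
// Hypothetical stand-in for: _GetWatchlist('ServiceAccounts_Monitoring') | project SearchKey
let ServiceAccounts = datatable(SearchKey: string) ["svc-backup@contoso.com"];
// Hypothetical stand-in for AuditLogs, reduced to the two columns the query touches
let SampleAuditLogs = datatable(OperationName: string, TargetResources: dynamic)
[
    "Reset User Password", dynamic([{"userPrincipalName": "SVC-Backup@contoso.com"}]),
    "Change user password", dynamic([{"userPrincipalName": "someone.else@contoso.com"}])
];
SampleAuditLogs
| where OperationName has_any ("Reset User Password", "Change user password")
// Cast the dynamic value to a string before comparing it with the watchlist column
| extend upn = tostring(TargetResources[0].userPrincipalName)
// in~ is the case-insensitive form, which absorbs casing differences in UPNs
| where upn in~ (ServiceAccounts)
| project OperationName, upn
```

With the sample rows, only the "Reset User Password" event for SVC-Backup@contoso.com is returned. If the real query still returns nothing after the cast, the next thing to compare would be the exact SearchKey values against the UPNs in the log.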
Many thanks

Solved | MrD | Apr 01, 2026 | Copper Contributor | 85 Views | 0 likes | 2 Comments

Security Copilot Integration with Microsoft Sentinel - Why Automation matters now
Security Operations Centers face a relentless challenge - the volume of security alerts far exceeds the capacity of human analysts. On average, a mid-sized SOC receives thousands of alerts per day, and analysts spend up to 80% of their time on initial triage. That means determining whether an alert is a true positive, understanding its scope, and deciding on next steps. With Microsoft Security Copilot now deeply integrated into Microsoft Sentinel, there is finally a practical path to automating the most time-consuming parts of this workflow.

So I decided to walk you through how to combine Security Copilot with Sentinel to build an automated incident triage pipeline - complete with KQL queries, automation rule patterns, and practical scenarios drawn from common enterprise deployments.

Traditional triage workflows rely on analysts manually reviewing each incident - reading alert details, correlating entities across data sources, checking threat intelligence, and making a severity assessment. This is slow, inconsistent, and does not scale. Security Copilot changes this equation by providing:

- Natural language incident summarization - turning complex, multi-alert incidents into analyst-readable narratives
- Automated entity enrichment - pulling threat intelligence, user risk scores, and device compliance state without manual lookups
- Guided response recommendations - suggesting containment and remediation steps based on the incident type and organizational context

The key insight is that Copilot does not replace analysts - it handles the repetitive first-pass triage so analysts can focus on decision-making and complex investigations.
Architecture - How the Pieces Fit Together

The automated triage pipeline consists of four layers:

1. Detection Layer - Sentinel analytics rules generate incidents from log data
2. Enrichment Layer - Automation rules trigger Logic Apps that call Security Copilot
3. Triage Layer - Copilot analyzes the incident, enriches entities, and produces a triage summary
4. Routing Layer - Based on Copilot's assessment, incidents are routed, re-prioritized, or auto-closed

(Forgive my AI-painted illustration here, but I find it a nice way to display dependencies.)

    +-----------------------------------------------------------+
    |                    Microsoft Sentinel                     |
    |                                                           |
    |  Analytics Rules --> Incidents --> Automation Rules       |
    |                                          |                |
    |                                          v                |
    |                                Logic App / Playbook       |
    |                                          |                |
    |                                          v                |
    |                                Security Copilot API       |
    |                                 +-----------------+       |
    |                                 | Summarize       |       |
    |                                 | Enrich Entities |       |
    |                                 | Assess Risk     |       |
    |                                 | Recommend Action|       |
    |                                 +--------+--------+       |
    |                                          |                |
    |                                          v                |
    |                           +-----------------------------+ |
    |                           | Update Incident             | |
    |                           | - Add triage summary tag    | |
    |                           | - Adjust severity           | |
    |                           | - Assign to analyst/team    | |
    |                           | - Auto-close false positive | |
    |                           +-----------------------------+ |
    +-----------------------------------------------------------+

Step 1 - Identify High-Volume Triage Candidates

Not every incident type benefits equally from automated triage. Start with alert types that are high in volume but often turn out to be false positives or low severity. Use this KQL query to identify your top candidates:

    SecurityIncident
    | where TimeGenerated > ago(30d)
    // SecurityIncident logs a row per update, so keep only the latest record per incident
    | summarize arg_max(TimeGenerated, *) by IncidentNumber
    | summarize
        TotalIncidents = count(),
        AutoClosed = countif(Classification == "FalsePositive" or Classification == "BenignPositive"),
        // Minutes from incident creation to its first modification
        AvgTimeToTriageMinutes = avg(datetime_diff('minute', FirstModifiedTime, CreatedTime))
        by Title
    | extend FalsePositiveRate = round(AutoClosed * 100.0 / TotalIncidents, 1)
    | where TotalIncidents > 10
    | order by TotalIncidents desc
    | take 20

This query surfaces the incident types where automation will deliver the highest ROI.
Based on publicly available data and community reports, the following categories consistently appear at the top:

- Impossible travel alerts (high volume, around 60% false positive rate)
- Suspicious sign-in activity from unfamiliar locations
- Mass file download and share events
- Mailbox forwarding rule creation

Step 2 - Build the Copilot-Powered Triage Playbook

Create a Logic App playbook that triggers on incident creation and leverages the Security Copilot connector. The core flow looks like this:

Trigger: Microsoft Sentinel Incident - When an incident is created

Action 1 - Get incident entities:

    let incidentEntities = SecurityIncident
        | where IncidentNumber == <IncidentNumber>
        | mv-expand AlertIds
        // Join keys must be scalar, so cast the expanded dynamic value to a string
        | extend AlertId = tostring(AlertIds)
        | join kind=inner (SecurityAlert | extend AlertId = SystemAlertId) on AlertId
        // Entities is stored as a JSON string, so convert it before expanding
        | mv-expand EntityData = todynamic(Entities)
        | project
            EntityType = tostring(EntityData.Type),
            EntityValue = coalesce(
                tostring(EntityData.HostName),
                tostring(EntityData.Address),
                tostring(EntityData.Name),
                tostring(EntityData.DnsDomain));
    incidentEntities

Note: The <IncidentNumber> placeholder above is a Logic App dynamic content variable. When building your playbook, select the incident number from the trigger output rather than hardcoding a value.

Action 2 - Copilot prompt session: Send a structured prompt to Security Copilot that requests:

    Analyze this Microsoft Sentinel incident and provide a triage assessment:

    Incident Title: {IncidentTitle}
    Severity: {Severity}
    Description: {Description}
    Entities involved: {EntityList}
    Alert count: {AlertCount}

    Please provide:
    1. A concise summary of what happened (2-3 sentences)
    2. Entity risk assessment for each IP, user, and host
    3. Whether this appears to be a true positive, benign positive, or false positive
    4. Recommended next steps
    5. Suggested severity adjustment (if any)

Action 3 - Parse and route: Use the Copilot response to update the incident.
The Logic App parses the structured output and:

- Adds the triage summary as an incident comment
- Tags the incident with copilot-triaged
- Adjusts severity if Copilot recommends it
- Routes to the appropriate analyst tier based on the assessment

Step 3 - Enrich with Contextual KQL Lookups

Security Copilot's assessment improves dramatically when you feed it contextual data. Before sending the prompt, enrich the incident with organization-specific signals:

    // Check if the user has a history of similar alerts (repeat offender vs. first time)
    let userAlertHistory = SecurityAlert
        | where TimeGenerated > ago(90d)
        // Entities is stored as a JSON string, so convert it before expanding
        | mv-expand EntityData = todynamic(Entities)
        | where tostring(EntityData.Type) =~ "account"
        // Name holds only the UPN prefix, so rebuild the full UPN before comparing
        | where strcat(tostring(EntityData.Name), "@", tostring(EntityData.UPNSuffix)) =~ "<UserPrincipalName>"
        | summarize
            PriorAlertCount = count(),
            DistinctAlertTypes = dcount(AlertName),
            LastAlertTime = max(TimeGenerated)
        | extend IsRepeatOffender = PriorAlertCount > 5;
    userAlertHistory

    // Check user risk level from Entra ID Protection
    AADUserRiskEvents
    | where TimeGenerated > ago(7d)
    | where UserPrincipalName =~ "<UserPrincipalName>"
    | summarize arg_max(TimeGenerated, RiskLevel), RecentRiskEvents = count()
    | project RiskLevel, RecentRiskEvents

Including this context in the Copilot prompt transforms generic assessments into organization-aware triage decisions. A "suspicious sign-in" for a user who travels internationally every week is very different from the same alert for a user who has never left their home country.

Step 4 - Implement Feedback Loops

Automated triage is only as good as its accuracy over time.
Build a feedback mechanism by tracking Copilot's assessments against analysts' final classifications:

SecurityIncident
  | where TimeGenerated > ago(30d)
  | summarize arg_max(TimeGenerated, *) by IncidentNumber // latest state of each incident
  | where Labels has "copilot-triaged" // incident tags are stored in the Labels column
  | where isnotempty(Classification) and Classification != "Undetermined"
  | mv-expand Comments
  | extend CopilotAssessment = extract("Assessment: (True Positive|False Positive|Benign Positive)", 1, tostring(Comments))
  | where isnotempty(CopilotAssessment)
  | summarize
      Total = dcount(IncidentNumber),
      Correct = dcountif(IncidentNumber,
          (CopilotAssessment == "False Positive" and Classification == "FalsePositive")
          or (CopilotAssessment == "True Positive" and Classification == "TruePositive")
          or (CopilotAssessment == "Benign Positive" and Classification == "BenignPositive"))
      by bin(TimeGenerated, 7d)
  | extend AccuracyPercent = round(Correct * 100.0 / Total, 1)
  | order by TimeGenerated asc

For this query to work reliably, the automation playbook must write the assessment in a consistent format within the incident comments. Use a structured prefix such as Assessment: True Positive so the regex extraction remains stable. According to Microsoft's published benchmarks and community feedback, Copilot-assisted triage typically achieves 85-92% agreement with senior analyst classifications after prompt tuning, which significantly reduces the manual triage burden.

A Note on Licensing and Compute Units

Security Copilot is licensed through Security Compute Units (SCUs), which are provisioned in Azure. Each prompt session consumes SCUs based on the complexity of the request. For automated triage at scale, plan your SCU capacity carefully - high-volume playbooks can accumulate significant usage. Start with a conservative allocation, monitor consumption through the Security Copilot usage dashboard, and scale up as you validate ROI. Microsoft provides detailed guidance on SCU sizing in the official Security Copilot documentation.
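Returning to the feedback loop: the accuracy query depends entirely on the comment prefix staying stable, so it can help to put the writing and reading of that prefix in one place. The following is a minimal Python sketch, not part of any Microsoft SDK. The function names (format_comment, parse_assessment, agreement_percent) are hypothetical; only the "Assessment: <label>" prefix and the Classification values are taken from the query above.

```python
import re
from typing import Iterable, Optional, Tuple

# Same pattern the KQL extract() call relies on.
ASSESSMENT_RE = re.compile(
    r"Assessment: (True Positive|False Positive|Benign Positive)"
)

# Copilot label -> value of the SecurityIncident Classification column.
LABEL_TO_CLASSIFICATION = {
    "True Positive": "TruePositive",
    "False Positive": "FalsePositive",
    "Benign Positive": "BenignPositive",
}


def format_comment(label: str, summary: str) -> str:
    """Build the incident comment with the structured prefix on line one."""
    if label not in LABEL_TO_CLASSIFICATION:
        raise ValueError(f"unknown assessment label: {label!r}")
    return f"Assessment: {label}\n{summary}"


def parse_assessment(comment: str) -> Optional[str]:
    """Return the label found in a comment, or None if there is none."""
    match = ASSESSMENT_RE.search(comment)
    return match.group(1) if match else None


def agreement_percent(pairs: Iterable[Tuple[str, str]]) -> Optional[float]:
    """Percent of (comment, analyst classification) pairs that agree.

    Pairs without a parsable assessment or without a classification
    are skipped, mirroring the filters in the KQL query.
    """
    total = correct = 0
    for comment, classification in pairs:
        label = parse_assessment(comment)
        if label is None or not classification:
            continue
        total += 1
        if LABEL_TO_CLASSIFICATION[label] == classification:
            correct += 1
    return round(correct * 100.0 / total, 1) if total else None
```

Rejecting unknown labels at write time means a prompt change that makes Copilot answer "Likely Benign" fails loudly in the playbook instead of silently dropping incidents from the accuracy report.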
Example Scenario - Impossible Travel at Scale

Consider a typical enterprise that generates over 200 impossible travel alerts per week. The SOC team spends roughly 15 hours weekly just triaging these. Here is how automated triage addresses this:
- Detection - Sentinel's built-in impossible travel analytics rule flags the incidents
- Enrichment - The playbook pulls each user's typical travel patterns from sign-in logs over the past 90 days, VPN usage, and whether the "impossible" location matches any known corporate office or VPN egress point
- Copilot Analysis - Security Copilot receives the enriched context and classifies each incident
- Expected Result - Based on common deployment patterns, around 70-75% of impossible travel incidents are auto-closed as benign (VPN, known travel patterns), roughly 20% are downgraded to informational with a triage note, and only about 5% are escalated to analysts as genuine suspicious activity

This type of automation can reclaim over 10 hours per week - time that analysts can redirect to proactive threat hunting.

Getting Started - Practical Recommendations

For teams ready to implement automated triage with Security Copilot and Sentinel, here is a recommended approach:
- Start small. Pick one high-volume, high-false-positive incident type. Do not try to automate everything at once.
- Run in shadow mode first. Have the playbook add triage comments but do not auto-close or re-route. Let analysts compare Copilot's assessment with their own for two to four weeks.
- Tune your prompts. Generic prompts produce generic results. Include organization-specific context - naming conventions, known infrastructure, typical user behavior patterns.
- Monitor accuracy continuously. Use the feedback loop KQL above. If accuracy drops below 80%, pause automation and investigate.
- Maintain human oversight. Even at 90%+ accuracy, keep a human review step for high-severity incidents. Automation handles volume - analysts handle judgment.
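The recommendations above can be read as a single gating decision that the playbook makes for every triaged incident. Here is a minimal Python sketch under stated assumptions: the TriageResult type, the decide_action function, and the action names ("comment_only", "human_review", "auto_close", "escalate") are all hypothetical, and the 80% threshold is the one suggested in the list. In a real playbook the same logic would live in Logic App conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriageResult:
    severity: str    # incident severity, e.g. "High", "Medium", "Low"
    assessment: str  # "True Positive", "False Positive" or "Benign Positive"


def decide_action(
    result: TriageResult,
    shadow_mode: bool,
    rolling_accuracy: float,
    min_accuracy: float = 80.0,
) -> str:
    """Pick what the playbook may do with one Copilot assessment."""
    # Shadow mode: add the triage comment, never close or re-route.
    if shadow_mode:
        return "comment_only"
    # Accuracy below the threshold: pause automation, keep commenting.
    if rolling_accuracy < min_accuracy:
        return "comment_only"
    # High-severity incidents always keep a human review step.
    if result.severity == "High":
        return "human_review"
    # Benign and false positives are the candidates for auto-closing.
    if result.assessment in ("False Positive", "Benign Positive"):
        return "auto_close"
    return "escalate"
```

The checks are ordered from most to least restrictive, so turning shadow mode on or letting accuracy slip overrides every other rule without touching the rest of the flow.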
The combination of Security Copilot and Microsoft Sentinel represents a genuine step forward for SOC efficiency. By automating the initial triage pass - summarizing incidents, enriching entities, and providing classification recommendations - analysts are freed to focus on what humans do best: making nuanced security decisions under uncertainty. Feel free to like and/or connect :)

Webinar Cancellation
Hi everyone! The webinar originally scheduled for April 14th on "Using distributed content to manage your multi-tenant SecOps" has unfortunately been cancelled for now. We apologize for the inconvenience and hope to reschedule it in the future. Please find other available webinars at: http://aka.ms/securitycommunity All the best, The Microsoft Security Community Team