microsoft sentinel
838 TopicsAsk Microsoft Anything: The Microsoft Sentinel SIEM Migration Experience
***Update: The event has concluded, but if you missed it- you can watch the recording ON-DEMAND on this page at any time. Happy learning! Join us for a live demo and AMA on the Microsoft Sentinel SIEM migration experience. We’ll show how the experience helps teams move from legacy SIEMs like Splunk and QRadar into Microsoft Sentinel with a more guided, lower-friction path. We’ll cover what it does today, how it works, and the questions customers ask most, then open it up for live Q&A. What is an AMA? An 'Ask Microsoft Anything' (AMA) session is an opportunity for you to engage directly with Microsoft employees! This AMA will consist of a short presentation followed by taking questions on-camera from the comment section down below! Ask your questions/give your feedback and we will have our awesome Microsoft Subject Matter Experts engaging and responding directly in the video feed. We know this timeslot might not work for everyone, so feel free to ask your questions at any time leading up to the event and the experts will do their best to answer during the live hour. This page will stay up so come back and use it as a resource anytime. We hope you enjoy!2KViews7likes13CommentsA guide to innovating threat hunting with Microsoft Sentinel custom graph
Microsoft Sentinel platform offers a growing list of tools and features, with graph being a cornerstone capability. Sentinel graph is a relationship-first method for organizing and querying data within Microsoft Sentinel data lake. Activities amongst entities (users, devices, emails, IPs, applications, etc.) become a navigable structure that avoids a complex table structure. Rather than stitching together data and evidence via complex joins, users can follow multi-hop connections in order to understand insights such as blast radius, unseen pivots in malicious behavior, and investigative details that may not be as obvious within regular logs, all while visualizing these paths to assist in communicating evidence and findings. This blog will walk through how to create custom graphs using GitHub Copilot chat experiences in Sentinel VS Code. And how to leverage out-of-the-box graph samples to build custom graphs addressing security outcomes. Custom graphs are available in public preview. Prerequisites and Tooling Sentinel data lake enabled in the tenant, this is where the data for the graph will be stored. Users will need read/write permissions on Sentinel data lake data. And either security operator or security admin permissions to save a custom graph in the tenant. Visual Studio Code (VS Code) will need to be installed, as it is essential for building and saving graphs. The Jupyter notebook extension, Microsoft Sentinel extension, and GitHub Copilot extension will need to be installed from within VS Code. These are key pieces for configuring and managing graphs. (Optional) Microsoft Sentinel MCP server if using MCP tools like the data exploration tool. Building a new custom graph The starting point is within Visual Studio Code (VS Code), where the custom graph will be built via GitHub Copilot and the Sentinel graph authoring tool. Make sure to have a GitHub account logged in within VS Code, then start a chat with Copilot via View > Chat. This will open a chat window on the right side of the screen. Determining security telemetry for investigation If unsure about which tables are available within the environment or the columns to focus on for hunting/investigations, turn to the Sentinel MCP server. With the Sentinel MCP server, users can explore the threat landscape within their environment as well as see which data sources currently exist within the Sentinel data lake. This process can be done using natural language with Copilot to obtain the information needed to perform the task at hand. “List the most important tables within my Microsoft Sentinel data lake environment that would build a blast radius for a compromised user account. List the best columns to use for this scenario. Format the response as a table” The tables and columns that can be used are now known. The next step is to use these tables to construct a custom graph with help from GitHub Copilot. For this example, a blast radius graph will be built to assist in reviewing the impact of compromised accounts within the environment: “List the top 5 compromised or targeted accounts within my environment. List which types of attacks are involved with those accounts. Summarize the information into a simple to read table” Given this response, there are a few options for going forward: Return to the Microsoft Defender portal and attempt threat hunting/review this with other analysts Ask Copilot to provide threat hunting queries or perform incident investigations for the top users who are most targeted Build custom graphs to visualize threat data around the most targeted accounts For this example, we will use option 3. Building graph mappings with GitHub Copilot To begin building a custom graph from scratch, a new prompt is submitted, this time tagging the Sentinel extension’s graph authoring tool. An example of the type of prompt to use is below: “@Sentinel /graph-authoring I want to investigate the blast radius of a compromised user and what systems/ app/ devices that they accessed based on users authentication activity. Please use at least SignInLogs, NonInteractivelogs, DeviceLogon, Onprem AD logs, IdentityInfo, and AADRiskyUsers. The graph should help investigate the following security outcomes: What is the user's current risk level and risk score from Identity Protection? Which applications and resources did a user authenticate to? Are there sign-ins from risky IP addresses, Tor exit nodes, or anonymizers? Are there non-interactive sign-ins from unexpected locations or devices? Which machines did a user log on to locally/remotely (RDP)? Which user accounts have been active on a compromised device? A few guidance for data ingestion: Ensure to filter out any data that has NULL or empty values for key Nodes and Edges Filter all data for last 14 days Do not map json arrays as Keys in Nodes or Edges” Note: To ensure that the graph that is written matches the desired scenario, it helps to provide outcomes or guidance to the graph authoring tool. If a Juypter notebook is not already open within the VS Code, Copilot will build a new notebook based on the prompt given. Once Copilot is done, select a kernel to run the notebook. This can be done from the top right of the Notebook: Click on Select Kernel. Click on Microsoft Sentinel. Choose a pool option for the compute cluster. Once a pool is picked, click on the run button next to one of the code cells to boot up the compute pool (this can take up to 5 minutes) Once connected, users can either go through and click the run button next to the code cell to run the code or click the Run All button at the top of the Notebook. For each cell in the Notebook: Cell 2 This section of the notebook is for mporting the sentinel_graph library and configures Spark settings. This is essentially setting up the notebook environment for executing the rest of the code. from sentinel_graph import notebook notebook.requires(sentinel_graph="0.3.8") spark.conf.set("spark.sql.parquet.datetimeRebaseModeInRead", "CORRECTED") Cell 3 This section is performing more Sentinel specific configurations by defining which Sentinel workspace to use, which timerange to use, which tables to use, etc. This is defining which data sources should be considered when building the graph. from pyspark.sql import functions as F from sentinel_lake.providers import MicrosoftSentinelProvider lake_provider = MicrosoftSentinelProvider(spark=spark) LOG_ANALYTICS_WORKSPACE = "Woodgrove-LogAnalyiticsWorkspace" # Auto-detected from the Microsoft Sentinel extension TARGET_USER = "ram723@int.zava-private.com" # Time filter — 7 days for broader blast radius context time_filter = F.col("TimeGenerated") >= F.expr("current_timestamp() - INTERVAL 7 DAYS") # --- IdentityInfo: user profile, roles, group memberships, risk --- df_identity_info = ( lake_provider.read_table("IdentityInfo", LOG_ANALYTICS_WORKSPACE) .filter(time_filter) .filter(F.lower(F.col("AccountUPN")) == TARGET_USER.lower()) ) # --- SigninLogs: interactive sign-ins to resources --- df_signins = ( lake_provider.read_table("SigninLogs", LOG_ANALYTICS_WORKSPACE) .filter(time_filter) .filter( (F.lower(F.col("UserPrincipalName")) == TARGET_USER.lower()) & (F.col("ResultType") == "0") # successful sign-ins ) ) Cell 4 This section is defining and building the nodes that will be used in the graph. The definitions include what events look like, which entities are involved, and how they are considered for each node type. # 1. User node (the target user) user_nodes = ( df_identity_info .select( F.col("AccountUPN"), F.col("AccountDisplayName"), F.col("RiskLevel"), F.col("RiskState"), F.col("AssignedRoles"), F.col("GroupMembership"), F.col("BlastRadius"), F.col("Department"), F.col("JobTitle"), F.col("IsMFARegistered"), F.col("IsAccountEnabled") ) .distinct() .withColumn("AccountUPN", F.lower(F.col("AccountUPN"))) ) Cell 5 This section is building out the schema for the graph. The schema for a graph is taking the columns and details from the tables in cell 3 while also tying them to the nodes and edges built in cell 4. # Build nodes first builder = ( GraphSpecBuilder.start() # === NODES === .add_node("User") .from_dataframe(user_nodes) .with_columns("AccountUPN", "AccountDisplayName", "RiskLevel", "RiskState", "AssignedRoles", "GroupMembership", "BlastRadius", "Department", "JobTitle", "IsMFARegistered", "IsAccountEnabled", key="AccountUPN", display="AccountUPN") # Then add edges and finalise into a GraphSpec spec = ( builder # === EDGES === .add_edge("AccessedInteractive") .from_dataframe(edge_user_resource_interactive) .source(id_column="UserUPN", node_type="User") .target(id_column="ResourceName", node_type="Resource") .with_columns("AppDisplayName", "TimeGenerated", "IPAddress", "ConditionalAccessStatus", "AccessType", "EdgeKey", key="EdgeKey", display="AccessType") Cell 6 This cell will take the schema from cell 5 and will load it into the graph visual builder. This will give a sample of what the graphs made with this Notebook will look like. These samples are fully interactive and will give an example of how it will look within the Defender portal. For example: Please note that the Authoring Agent may provide a different looking schema if following along with this example. The schema above is just meant to provide an example of what one will look like within a Notebook. Cell 7 This cell is taking each of the following steps performed and is going to compile and build the graph based on the data from the Sentinel data lake. This may take a few minutes to perform. With the custom graph built, the next step is to create a Graph Job to save the custom graph in the tenant for persistent use. If necessary, users can go back into the notebook to refine, expand, and improve the custom graph. Publishing graph Publishing a graph is the process of saving the graph in a tenant, allowing for the graph to be scheduled for recurring refreshes or as needed. This process saves the graph to the tenant and enables other SOC members to access this graph from within the Defender portal. To publish a custom graph, this must go through a Graph Job. This option is available within the Notebook experience as a button near the top: Clicking on the Create Scheduled Job button will open a new tab within VS Code with the jobs settings and the option to publish: There are two types of job schedules: On Demand: Saves the custom graph to the tenant and will persist the custom graph for 30 days. After 30 days, the graph will be auto deleted. Scheduled: Saves the custom graph to the tenant and will rebuild with new security telemetry based on a user defined schedule. Once everything is prepped, the custom graph can be published to the tenant by hitting the Submit button. Users can view and monitor the creation progress by finding the graph within the Sentinel extension navigation as it shows the graphs available for the environment: Finding and selecting the custom graph will open up a new tab that shows details around the graph. This includes details around the name, creation status (creating, ready, etc), author, and publishing date. Near the top, there are tabs for Job Details and Graph Query. These options allow the user to review the current Graph Job, make changes to the Graph Job, or query the graph within the notebook. Querying the graph in Defender Once the custom graph has been published and the creation status is Ready, users can query the new graph in the Defender Portal: Expand the Microsoft Sentinel navigation. Select Graphs. Either find the card with the graph title or search for it within the menu. Once found, click Query Graph to open it. The graph will open in the schema view. The schema here is a visual representation of which nodes, edges, and relations are part of the graph. This is what was built in the notebook. To query it, a user can write GQL queries or use ones that are provided. For this example, a query provided in the Getting Started tab will be used. This is a generic query that will show everything in a graph: // Visualize any graph MATCH (x)-[y]->(z) RETURN * LIMIT 100 More focused queries will yield more focused results. For example: MATCH (n_user:User)-[e_ip:SignedInFrom]->(n_ip:IPAddress) MATCH (n_user)-[e_signin:InteractiveSignIn]->(n_app:Application) WHERE n_user.UserPrincipalName = 'ENTERUSERNAMHERE' AND n_ip.IPAddress = 'IPADDRESSHERE' RETURN n_user, e_ip, n_ip, e_signin, n_app MATCH (n_user)-[x]->() MATCH (n_user)-[e_signin:InteractiveSignIn]->(n_app:Application) WHERE n_user.UserPrincipalName = 'ENTERUSERNAMEHERE' RETURN * From here, a user can continue the hunt, remediate the concerns, escalate this for further attention and remediation, or refine the graph as needed. Refining Graphs Throughout the process, the custom graph may need to be updated for various reasons, including: The scope of the hunt/investigation has expanded due to new information or the hypothesis being updated based on findings The original hypothesis of the hunt was incorrect or needs to be changed Important nodes are missing from the graph and need to be added To achieve this, return to VS Code and use the GitHub Copilot chat experience to add new telemetry, nodes, edges, or properties in the existing graph. The below example illustrates adding Azure resources as new assets by prompting the Sentinel graph authoring tool and instructing it on what needs to be added. Running the cells of the Notebook will yield an updated graph that includes the new changes: Graph samples in the Sentinel VS Code extension To help with learning, building, and using Sentinel graph, there are 5 graph samples included in the Sentinel extension within VS Code. These can be found by clicking on the Sentinel extension and looking under Notebook Samples > Graphs. Each graph included contains a Jupyter notebook containing the graph schema and mappings, as well as graph queries which can be run against the graph. These graphs ingest certain security telemetry and expect them to already exist within the Sentinel lake instance that is being used. If needed, the graph mapping can be updated to include/ exclude security telemetry as needed. These graph samples are also located within the Sentinel GitHub repository. Let’s look at one of the sample graphs – Phishing Email Killchain to understand how it can help during a security investigation. Using a graph: phishing email kill chain scenario Phishing is the number one initial access vector, yet investigating a phishing campaign requires correlating data across multiple Sentinel tables: EmailEvents, EmailUrlInfo, UrlClickEvents, EmailAttachmentInfo, DeviceFileEvents, and DeviceProcessEvents. Each table uses a different join key (NetworkMessageId, AccountUpn, SHA256, DeviceName), and analysts must stitch results together manually across several Defender portals. The core question every SOC analyst needs to answer is: “Who received the email, clicked the URL, downloaded the attachment, and executed it on their device?” In KQL, answering this requires 5+ sequential queries and 30–60 minutes of manual correlation. The Phishing Email Kill Chain graph fuses all of these tables into a single connected structure with 10 node types and 12 edge types, making it possible to answer that question in seconds with a single GQL traversal. SOC teams can create this graph in their tenant and start investigating phishing campaigns using graph-powered insights. Investigation with the Phishing Email Killchain graph Multi-hop traversal. The full kill chain from email to endpoint execution is a 4-hop path: Email → Attachment → Process → Device. In KQL, each hop is a separate join with a different key column. In the graph, it’s one MATCH clause. Structural detection. Campaign topology is visible as the graph’s shape — senders fanning out to emails, emails fanning out to users, shared URLs converging into hubs. These patterns are structural properties requiring no aggregation queries. Click-exposure overlay. The graph overlays email delivery and URL click paths in a single view. An analyst instantly sees which users received a phishing email AND clicked the embedded URL — no separate UrlClickEvents join needed. Example queries Below are three queries from the published phishing_email_killchain graph that demonstrate these capabilities. Each query is a single GQL statement that replaces multiple KQL joins. Query 1: Full Kill Chain — Email to Endpoint This query traces the complete attack path: phishing email → malicious attachment → process execution → endpoint device. In KQL, this requires joining 4 tables with different keys and temporal proximity filtering. MATCH (e:Email)-[ha:HasAttachment]->(att:Attachment) -[tp:TriggeredProcess]->(p:Process)-[od:OnDevice]->(d:Device) RETURN e, ha, att, tp, p, od, d LIMIT 10 Figure 1: Two complete kill chains — Invoice_Q3.xlsm → EXCEL.EXE → DESKTOP-FIN01 and DocuSign_Contract.pdf.exe → cmd.exe → DESKTOP-SALES02. Each path is one traversal replacing 4+ KQL joins. Query 2: Campaign Topology — Sender to Email to User to URL This query visualizes the full campaign structure: which senders sent which emails, who received them, and what URLs were embedded. The graph’s fan-out shape immediately reveals the blast radius and shared infrastructure. MATCH (s:Sender)-[se:Sent]->(e:Email)-[re:ReceivedEmail]->(u:User), (e)-[cu:ContainsUrl]->(url:Url) RETURN s, se, e, re, u, cu, url LIMIT 10 Figure 2: Campaign topology — 2 senders, 2 emails fanning out to 9 users and 2 URLs. The shared URL node (c0ntoso-share...) receiving edges from both emails reveals coordinated campaign infrastructure. Query 3: URL Click Exposure — Who Clicked the Phishing Links This query shows which emails contained URLs and which users clicked them. The Email → URL → User click chain is a single traversal that replaces joining EmailUrlInfo with UrlClickEvents. MATCH (e:Email)-[cu:ContainsUrl]->(url:Url)<-[cl:ClickedUrl]-(u:User) RETURN e, cu, url, cl, u LIMIT 10 Figure 3: Click exposure — 3 users clicked phishing URLs from 3 different emails. Each cluster shows Email → URL → User, instantly identifying click-through victims. These are just 3 examples of what is possible when using GQL on a graph. Users can author their own GQL queries to run on this graph to show other possibilities. Additional graph samples As mentioned, the Phishing Email Killchain graph is one of five graph samples that are available today for use within the VS Code Sentinel Extension. The remaining graphs are: Behavioral Attack Chain Ingests data from the SentinelBehaviorInfo, SentinelBehaviorEntities, AlertInfo, AlertEvidence, ThreatIntelIndicators, and BehaviorAnalytics tables to model the relationships between different detections, MITRE tactics/techniques, entities, and threat intel to high different traversals that are difficult to do with just KQL alone. Databricks Outbound Exfiltration Ingests data from the DatabricksNotebook, DatabricksSecrets, DatabricksDBFS, DatabricksClusters, DatabricksJobs, DatabricksSQLPermissions, IdentityInfo, AADUserRiskEvents, and BehaviorAnalytics tables to map Databricks notebook and cluster activities to the identities used in order to enable detections of unusual outbound data movement, privilege escalation, and data exfiltration patterns. DNS C2 Beaconing Ingests data from the DeviceNetworkEvents, DeviceInfo, and ThreatIntelIndicators to model DNS resolution patterns to detect C2 beaconing and other malicious patterns. OAuth Privilege Escalation Ingests data from the EntraServicePrincipals, AADRiskyServicePrincipals, and AADServicePrincipalSignInLogs tables to trace OAuth consent chains, credential abuse, and privilege escalation paths to identify hub users, over-permissions identities, and backdoor patterns that may exist. Closing This blog showcased an example of how a custom graph can be made with data within Microsoft Sentinel data lake and the help of GitHub Copilot, investigating a phishing email kill chain situation, and how to leverage the several graph templates that are provided in Sentinel. Get started today by using one of the template graphs, building your own graph, or by checking out the public documentation for Sentinel graph. Note: Custom graph API usage for creating graph and querying graph will be billed according to the Sentinel graph meter. Public Documentation: https://learn.microsoft.com/azure/sentinel/datalake/sentinel-graph-overview GQL Reference: Graph Query Language (GQL) reference for Microsoft Sentinel graph (Preview) | Microsoft Learn Planning graph Costs: Plan costs and understand pricing and billing - Microsoft Sentinel | Microsoft Learn551Views1like1CommentMigrate Sentinel to Defender - Why It Is a Security Architecture Decision, Not Just a Portal Change
Microsoft will retire the Sentinel experience in Azure on March 31, 2027. Most of the conversation around this transition focuses on cost optimization and portal consolidation. That framing undersells what is actually happening. The unified Defender portal is not a new interface for the same capabilities. It is the platform foundation for a fundamentally different SOC operating model — one built on a 2-tier data architecture, graph-based investigation, and AI agents that can hunt, enrich, and respond at machine speed. Partners who understand this will help customers build security programs that match how attackers actually operate. This document covers four things: What the unified experience delivers — the security capabilities that do not exist in standalone Sentinel and why they matter against today’s threats. What the transition really involves - is not data migration, but it is a data architecture project that changes how telemetry flows, where it lives, and who queries it. Where the partner opportunity lives — a structured progression from professional services (transactional, transition execution, and advisory) to ongoing managed security services. Why does the unified experience win competitively — factual capability advantages that give partners a defensible position against third-party SIEM alternatives. The Bigger Picture: Preparing for the Agentic SOC Before getting into transition mechanics, partners need to understand where the industry is headed — because the platform decisions made during this transition will determine whether a customer’s SOC is ready for what comes next. The security industry is moving from human-driven, alert-centric workflows to an operating model built on three pillars: Intellectual Property — the detection logic, hunting hypotheses, response playbooks, and domain expertise that differentiate one security team from another. Human Orchestration — the judgment, context, and decision-making that humans bring to complex incidents. Humans set strategy, validate findings, and make containment decisions. They do not manually triage every alert. AI Agents - built agents that execute repeatable work: enriching incidents, hunting across months of telemetry, validating security posture, drafting response actions, and flagging anomalies for human review. The SOC of 2027 will not be scaled by hiring more analysts. It will be scaled by deploying agents that encode institutional knowledge into automated workflows — orchestrated by humans who focus on the decisions that require judgment. This transformation requires a platform that provides three things: Deep telemetry — agents need months of queryable data to analyze behavioral patterns, build baselines, and detect slow-moving threats. The Sentinel data lake provides this at a cost point that makes long-retention feasible. Relationship context — agents need to understand how entities connect. Which accounts share credentials? What is the blast radius of a compromised service principle? What is the attack path from a phished user to domain admin? Sentinel Graph provides this. Extensibility — partners and customers need to build and deploy their own agents without waiting for Microsoft to ship them. The MCP framework and Copilot agent architecture provide this. None of these exist in Azure experience for Sentinel. All three ship with the Defender experience. The urgency goes beyond the March 2027 deadline. Organizations are deploying AI agents, copilots, and autonomous workflows across their businesses — and every one of those creates a new attack surface. Prompt injection, data poisoning, agent hijacking, cross-plugin exploitation — these are not theoretical risks. They are in the wild today. Defending against AI-powered attacks requires a security platform that is itself AI Agent-ready. The new experience in Defender unlocks this experience. What Unified SIEM and XDR Actually Delivers The original framing — “single pane of glass for SIEM and XDR” — is accurate but insufficient. Here is what the unified platform delivers that standalone Sentinel does not. Cross-Domain Incident Correlation The Defender correlation engine does not just group alerts by time proximity. It builds multi-stage incident graphs that link identity compromise to lateral movement to data exfiltration across SIEM and XDR telemetry — automatically. Consider a token theft chain: an infostealer harvests browser session cookies (endpoint telemetry), the attacker replays the token from a foreign IP (Entra ID sign-in logs), creates a mailbox forwarding rule (Exchange audit logs), and begins exfiltrating data (DLP alerts). In standalone Sentinel, these are four separate alerts in four different tables. In the unified platform, they are one correlated incident with a visual attack timeline. 2-Tier Data Architecture The Sentinel data lake introduces a second storage tier that changes the economics and capabilities of security telemetry: Analytics Tier Data Lake Purpose Real-time detection rules, SOAR, alerting Hunting, forensics, behavioral analysis, AI agent queries Latency Sub-5-minute query and alerting Minutes to hours acceptable Cost ~$4.30/GB PAYG ingestion (~$2.96 at 100 GB/day commitment) ~$0.05/GB ingestion + $0.10/GB data processing (at least 20x cheaper) Retention 90 days default (expensive to extend) Up to 12 years at low cost Best for High-signal, low-volume sources High-volume, investigation-critical sources The architecture decision is not “which tier is cheaper.” It is “which tier gives me the right detection capability for each data source.” Analytics tier candidates: Entra ID sign-in logs, Azure activity, audit logs, EDR alerts, PAM events, Defender for Identity alerts, email threat detections. These need sub-5-minute alerting. Data lake candidates: Raw firewall session logs, full DNS query streams, proxy request logs, Sysmon process events, NSG flow logs. These drive hunting and forensic analysis over weeks or months. Dual-ingest sources: Some sources need both tiers. Entra ID sign-in logs are the canonical example — analytics tier for real-time password spray detection, Data Lake for graph-based blast radius analysis across months of authentication history. Implementation is straightforward: a single Data Collection Rule (DCR) transformation handles the split. One collection point, two routing destinations. The right framing: “Right data in the right tier = better detections AND lower cost.” Cost savings are a side effect of good security architecture, not the goal. Sentinel Graph Sentinel graph enables SOC teams and AI agents to answer questions that flat log queries cannot: What is the blast radius of this compromised account? Which service principals share credentials with the breached identity? What is the attack path from this phished user to domain admin? Which entities are connected to this suspicious IP across all telemetry sources? Graph-based investigation turns isolated alerts into context-rich intelligence. It is the difference between knowing “this account was compromised” and understanding “this account has access to 47 service principals, 3 of which have written access to production Key Vault.” Security Copilot Integration Security Copilot embedded in the defender portal helps analysts summarize incidents, generate hunting queries, explain attacker behavior, and draft response actions. For complex multi-stage incidents, it reduces the time from “I see an alert” to “I understand the full scope” from hours to minutes. With free SCUs available with Microsoft 365 E5, teams can apply AI to the highest-effort investigation work without adding incremental cost. MCP and the Agent Framework The Model Context Protocol (MCP) and Copilot agent architecture let partners and customers build purpose-built security agents. A concrete example: an MCP-enabled agent can automatically enrich a phishing incident by querying email metadata, checking the sender against threat intelligence, pulling the user’s recent sign-in patterns, correlating with Sentinel Graph for lateral risk, and drafting a containment recommendation — in under 60 seconds. This is where partner intellectual property becomes competitive advantage. The agent framework is the mechanism for encoding proprietary detection logic, response playbooks, and domain expertise into automated workflows that run at machine speed. Security Store Security Store allows partners to evolve from one‑time transition projects into repeatable, scalable offerings—supporting professional services, managed services, and agent‑based IP that align with the customer’s unified SecOps operating model As part of the transition, the Microsoft Security Store becomes the extension layer for the Defender —allowing partners to deliver differentiated agents, SaaS, and security services natively within Defender and Sentinel, instead of building and integrating in isolation The 4 Investigation Surfaces: A Customer Maturity Ladder The Sentinel Data Lake exposes four distinct investigation surfaces, each representing a step toward the Agentic SOC — and a partner service opportunity: Surface Capability Maturity Level Partner Opportunity KQL Query Ad-hoc hunting, forensic investigation Basic — “we can query” Hunting query libraries; KQL training Graph Analytics Blast radius, attack paths, entity relationships Intermediate — “we understand relationships” Graph investigation training; attack path workshops Notebooks (PySpark) Statistical analysis, behavioral baselines, ML models Advanced — “we predict behaviors” Custom notebook development; anomaly scoring Agent/MCP Access Autonomous hunting, triage, response at machine speed Agentic SOC — “we automate” Custom agent development; MCP integration The customer who starts with “help us hunt better” ends up at “build us agents that hunt autonomously.” That is the progression from professional services to managed services. What the Transition Actually Involves It is not a data migration — customers’ underlying log data and analytics remain in their existing Log Analytics workspaces. That is important for partners to communicate clearly. But partners should not set the expectation that nothing changes except the URL. Microsoft’s official transition guide documents significant operational changes — including automation rules and playbooks, analytics rule, RBAC restructuring to the new unified model (URBAC), API schema changes that break ServiceNow and Jira integrations, analytics rule transitions where the Fusion engine is replaced by the Defender XDR correlation engine, and data policy shifts for regulated industries. Most customers cannot navigate this complexity without professional help. Important: Transitioning to the Defender portal has no extra cost - estimate the billing with the new Sentinel Cost Estimator Optimizing the unified platform means making deliberate changes: Adding dual-ingest for critical sources that need both real-time detection and long-horizon hunting. Moving high-volume telemetry to the Data Lake — enabling hunting at scale that was previously cost-prohibitive. Retiring redundant data copies where Defender XDR already provides the investigation capability. Updating RBAC, automation, and integrations for the unified portal’s consolidated schema and permission structure. Training analysts on new investigation workflows, Sentinel Graph navigation, and Copilot-assisted triage. Threat Coverage: The Detection Gap Most Organizations Do Not Know They Have This transition is an opportunity to quantify detection maturity — and most organizations will not like what they find. Based on real-world breach analysis — infostealers, business email compromise, human-operated ransomware, cloud identity abuse, vulnerability exploitation, nation-state espionage, and other prevalent threat categories — organizations running standalone Sentinel with default configurations typically have significant detection gaps. Those gaps cluster in three areas: Cross-domain correlation gaps — attacks that span identity, endpoint, email, and cloud workloads. These require the Defender correlation engine because no single log source tells the complete story. Long-retention hunting gaps — threats like command-and-control beaconing and slow data exfiltration that unfold over weeks or months. Analytics-tier retention at 90 days is too expensive to extend and too short for historical pattern analysis. Graph-based analysis gaps — lateral movement, blast radius assessment, and attack path analysis that require understanding entity relationships rather than flat log queries. The unified platform with proper log source coverage across Microsoft-native sources can materially close these gaps — but only if the transition includes a detection coverage assessment, not just a portal cutover. Partners should use MITRE ATT&CK as the common framework for measuring detection maturity. Map existing detections to ATT&CK tactics and techniques before and after transition — a measurable, defensible improvement that justifies advisory fees and ongoing managed services. Partner Opportunity: Professional Services to Managed Services This transition creates a structured progression for all partner types — from professional services that build trust and surface findings, to managed security services that deliver ongoing value. The key insight most partners miss: do not jump from “transition assessment” to “managed services pitch.” Customers are not ready for that conversation until they have experienced the value of professional services. The bridge engagement — whether transactional, transition execution, or advisory — builds trust, demonstrates the expertise, and surfaces the findings that make the managed services conversation a logical next step. Professional Services (transactional + transition execution + advisory) → Managed Security Services (MSSP) The USX transition is the ideal professional services entry point because it combines a mandatory deadline (March 2027) with genuine technical complexity (analytics rule, automation behavioral changes, RBAC restructuring, API schema shifts) that most customers cannot navigate alone. Every engagement produces findings — detection gaps, automation fragility, staffing shortfalls — that are the most credible possible evidence for managed services. Professional Services Transactional Partners Offer Customer Value Key Deliverables Transition Readiness Assessment Risk-mitigated transition with clear scope Sentinel deployment inventory; Defender portal compatibility check; transition roadmap with timeline; MITRE ATT&CK detection coverage baseline Transition Execution and Enablement Accelerated time-to-value, minimal disruption Workspace onboarding; RBAC and automation updates; Dual-portal testing and validation; SOC team training on unified workflows Security Posture and Detection Optimization Better detections and lower cost Data ingestion and tiering strategy; Dual-ingest implementation for critical sources; Detection coverage gap analysis; Automation and Copilot/MCP recommendations Advisory Partners Offer Customer Value Key Deliverables Executive and Strategy Advisory Leadership alignment on why this transition matters Unified SecOps vision and business case; Zero Trust and SOC modernization alignment; Stakeholder alignment across security, IT, and leadership Architecture and Design Advisory Future-ready architecture optimized for the Agentic SOC Target-state 2-tier data architecture; Dual-ingest routing decisions mapped to MITRE tactics; RBAC, retention, and access model design Detection Coverage and Gap Analysis Measurable detection maturity improvement Current-state MITRE ATT&CK coverage mapping; Gap analysis against 24 threat patterns; Detection improvement roadmap with priority recommendations SOC Operating Model Advisory Smooth analyst adoption with clear ownership Redesigned SOC workflows for unified portal; Incident triage and investigation playbooks; RACI for detection engineering, hunting, and platform ops Agentic SOC Readiness Preparation for AI-driven security operations MCP and agent architecture assessment; Custom agent development roadmap; IP + Human Orchestration + Agent operating model design Cost, Licensing and Value Advisory Transparent cost impact with strong business case Current vs. future cost analysis; Data tiering optimization recommendations; TCO and ROI modeling for leadership The conversion to managed services is evidence-based. Every professional services engagement produces findings — detection gaps, automation fragility, staffing shortfalls. Those findings are the most credible possible case for ongoing managed services. Managed Security Services The unified platform changes the managed security conversation. Partners are no longer selling “we watch your alerts 24/7.” They are selling an operating model where proprietary AI agents handle the repeatable work — enrichment, hunting, posture validation, response drafting — and human experts focus on the decisions that require judgment. This is where the competitive moat forms. The formula: IP + Human Orchestration + AI Agents = differentiated managed security. The unified platform enables this through: Multi-tenancy — the built-in multitenant portal eliminates the need for third-party management layers. Sentinel Data Lake — agents can query months of customer telemetry for behavioral analysis without cost constraints. Sentinel Graph — agents can traverse entity relationships to assess blast radius and map attack paths. MCP extensibility — partners can build agents that integrate with proprietary tools and customer-specific systems. Partners who build proprietary agents encoding their detection logic into the MCP framework will differentiate from partners who rely on out-of-box capabilities. The Securing AI Opportunity Organizations are deploying AI agents, copilots, and autonomous workflows across their businesses at an accelerating pace. Every AI deployment creates a new attack surface — prompt injection, data poisoning, agent hijacking, cross-plugin exploitation, unauthorized data access through agentic workflows. These are not theoretical risks. They are in the wild today. Partners who can help customers secure their AI deployments while also using AI to strengthen their SOC will command premium positioning. This requires a security platform that is itself AI Agent-ready — one that can deploy defensive agents at the same pace organizations deploy business AI. The unified Defender portal is that platform. Partners who position USX as “preparing your SOC for AI-driven security operations” will differentiate from partners who position it as “moving to a new portal.” Cost and Operational Benefits Better security architecture also costs less. This is not a contradiction — it is the natural result of putting the right data in the right tier. Benefit How It Works Eliminate low-value ingestion Identify and remove log sources that are never used for detections, investigations, or hunting. Immediately lowers analytics-tier costs without impacting security outcomes. Right-size analytics rules Disable unused rules, consolidate overlapping detections, and remove automation that does not reduce SOC effort. Pay only for processing that delivers measurable security value. Avoid SIEM/XDR duplication Many threats can be investigated directly in Defender XDR without duplicating telemetry into Sentinel. Stop re-ingesting data that Defender already provides. Tier data by detection need Store high-volume, hunt-oriented telemetry in the Data Lake at at least 20x lower cost. Promote only high-signal sources to the analytics tier. Full data fidelity preserved in both tiers. Reduce operational overhead Unified SIEM+XDR workflows in a single portal reduce tool switching, accelerate investigations, simplify analyst onboarding, and enable SOC teams to scale without proportional headcount increases. Improve detection quality The Defender correlation engine produces higher-fidelity incidents with fewer false positives. SOC teams spend less time triaging noise and more time on real threats. Competitive Positioning Partners need defensible talking points when customers evaluate third-party SIEM alternatives. The following advantages are factual, sourced from Microsoft’s transition documentation and platform capabilities — not marketing claims. No extra cost for transitioning — even for non-E5 customers. Third-party SIEM migrations involve licensing, data migration, detection rewrite, and integration rebuild costs. Native cross-domain correlation across Sentinel + Defender products into multi-stage incident graphs. Third-party SIEMs receive Microsoft logs as flat events — they lack the internal signal context, entity resolution, and product-specific intelligence that powers cross-domain correlation. Custom detections across SIEM + XDR — query both Sentinel and Defender XDR tables without ingesting Defender data into Sentinel. Eliminates redundant ingestion cost. Alert tuning extends to Sentinel — previously Defender-only capability, now applicable to Sentinel analytics rules. Net-new noise reduction. Unified entity pages — consolidated user, device, and IP address pages with data from both Sentinel and Defender XDR, plus global search across SIEM and XDR. Third-party SIEMs provide entity views from ingested data only. Built-in multi-tenancy for MSSPs — multitenant portal manages incidents, alerts, and hunting across tenants without third-party management layers. Try out the new GDAP capabilities in Defender portal. Industry validation: Microsoft’s SIEM+XDR platform has been recognized as a Leader by both Forrester (Security Analytics Platforms, 2025) and Gartner (SIEM Magic Quadrant, 2025). Summary: What Partners Should Take Away Topic Key Message Framing USX is a security architecture transformation, not a portal transition. Lead with detection capability, not cost savings. Platform foundation Sentinel Data Lake + Sentinel Graph + MCP/Agent Framework = the platform for the Agentic SOC. 4 investigation surfaces KQL → Graph → Notebooks → Agent/MCP. A maturity ladder from “we can query” to “we automate at machine speed.” Architecture 2-tier data model (analytics + Data Lake) with dual-ingest for critical sources. Cost savings are a side effect of good architecture. Transition complexity Analytics rules and automation rules. API schema changes. RBAC restructuring. Most customers need professional help. Partner engagement model Professional Services (transactional + transition execution + advisory) → Managed Services (MSSP). Competitive positioning No extra cost. Native correlation. Cross-domain detections. Built-in multi-tenancy. Capabilities third-party SIEMs cannot replicate. Partner differentiation IP + Human Orchestration + AI Agents. Partners who build proprietary agents on MCP have competitive advantage. Timeline March 31, 2027. Start now — phased transition with one telemetry domain first, then scale.2.1KViews4likes4CommentsIntroducing New Additions to Microsoft Sentinel Normalization and ASIM
TL;DR: New ASIM parsers for Azure Firewall, Key Vault, AWS CloudTrail (EC2, S3, IAM), and 10+ third-party products. Two new schemas — Asset Entities and AI Agent Events. Plus changelogs on GitHub and a heads-up on an upcoming breaking change in ProcessEvent parsers. What's New Security teams deal with logs from dozens of sources, each with its own schema. This painpoint makes it harder to write detections that work everywhere. The Advanced Security Information Model (ASIM) solves this by normalizing logs into a common schema, so a single analytic rule can cover a wide variety of sources without worrying about the source schema. Over the past few months, we have shipped a wave of new parsers, schemas, and improvements to ASIM. Here's everything you need to know. ASIM Parsers Azure Firewall Azure Firewall logs were previously only supported from the AzureDiagnostics table. Now, we support the dedicated resource-specific tables: Table ASIM Schema AZFWDnsQuery DNS AZFWNetworkRule NetworkSession AZFWApplicationRule WebSession Azure Key Vault Logs that are going to both AzureDiagnostics and resource-specific table AZKVAuditLogs are now normalized in the Audit Event schema. Azure Synapse SQL and Azure SQL Database Logs that are going to both AzureDiagnostics and resource-specific table SQLSecurityAuditEvents are now normalized to the Audit Event schema. Azure Traffic Analytics We have added support for the NTANetAnalytics table from Azure Traffic Analytics under the Network Session schema. AWS CloudTrail AWS CloudTrail previously only mapped to the Authentication schema. Now, you can correlate EC2, S3, and IAM activity through ASIM alongside your Azure telemetry: AuditEvent — Normalized EC2 events FileEvent — Normalized S3 events UserManagement — Normalized IAM and Cognito events Additional Parser Support We have also integrated the following third-party sources into ASIM: Authentication — Normalize sign-in and identity events for cross-source threat detection. CheckPoint Smart Defense Cisco IOS Cisco ISE Fortinet FortiGate Okta (OktaSystemLogs) Palo Alto — PAN-OS Palo Alto — Global Protect VMware vCenter Web Session — Normalize proxy and web gateway traffic. Cisco Umbrella Proxy Logs New ASIM Schemas We have created two new schemas to expand support new use cases. Asset Entities — Provides a normalized view of asset inventory data, enabling you to correlate files and assets across detections and investigations. AI Agent Events — Normalizes telemetry from AI-driven workflows and autonomous agents. Other Changes GitHub Changes Changelogs for every ASIM parser have been created to better help you understand updates and bug fixes we have implemented. As an example, here is the change log for the Authentication ASIM unifying parser. View Changelog Breaking Changes While aligning our ProcessEvent parsers to the official documentation, we found a naming inconsistency in the _Im_ProcessCreate function: Documentation specifies the parameter as targetusername_has Deployed parsers used targetusername What we changed: Both parameter names are now accepted. What you need to do: Update your analytic rules and queries to use targetusername_has. The legacy targetusername parameter will be deprecated in Summer 2026. What's Next We are continuing to expand ASIM with new parsers and schema capabilities to make detection authoring and log correlation even more powerful. BlueVoyant is also investing heavily in the ASIM ecosystem, building parsers that enhance detection coverage for their customers. See how they are using ASIM to operationalize detections. Want to get involved? Browse the ASIM parsers on GitHub, file issues, or contribute your own. We'd love to hear your feedback.520Views1like0CommentsMicrosoft 365 Developer E5 license lacking endpoints and device on defender portal
Dear Support Team, I am a microsoft certified trainer (MCT). I currently have a Microsoft 365 Developer E5 license assigned to my tenant. However, I have noticed that my Microsoft Defender portal (security.microsoft.com) is missing several critical features. For example, I cannot see the Endpoints or Devices menus, which is preventing me from implementing and testing Microsoft Defender for Endpoint. Additionally, my Azure tenant and Microsoft 365 tenant are separate. This has created challenges when configuring security services such as Microsoft Sentinel (SIEM), as certain prerequisites and integrations require configuration through the Microsoft Defender portal. Due to the missing Defender features, I am unable to complete the necessary setup. I would appreciate your assistance in understanding: Why the Endpoints and Devices sections are unavailable in my Defender portal despite having a Microsoft 365 Developer E5 license. Whether additional licensing, onboarding steps, or tenant configurations are required to enable Microsoft Defender for Endpoint features. How best to integrate or align my separate Azure and Microsoft 365 tenants to support services such as Microsoft Sentinel and Defender XDR. These issues are significantly impacting my ability to evaluate and implement Microsoft's security solutions. I would appreciate any guidance or recommendations to resolve them. Thank you for your assistance. Kind regards, [Your Name]34Views0likes0CommentsCampaign-Centric Hunting with Microsoft Defender XDR and Microsoft Sentinel
Phishing investigations usually start with one suspicious email. A user reports a message. An alert is generated. An analyst opens the email details, checks the sender, reviews the URL, and tries to understand whether the message is malicious. That is a normal starting point. However, in a real SOC investigation, one email is rarely the full story. Attackers usually operate in campaigns. They reuse sender infrastructure, similar subjects, URLs, payloads, templates, and delivery techniques. A single email may be only one part of a wider phishing or malware campaign targeting multiple users. This is why campaign-centric hunting is important. I wrote this article from the perspective of a SOC analyst who often needs to move quickly from a single suspicious email to the full campaign impact. The goal is simple: use Microsoft Defender XDR and Microsoft Sentinel together to understand who was targeted, what was delivered, who clicked, and what should be prioritized first. Why Campaign-Centric Hunting When investigating a phishing or malware email, analysts usually need to answer practical questions: How many users received messages from the same campaign? Were the messages blocked, junked, delivered, or remediated? Did any user click the URL? Did anyone click through a Safe Links warning? Were any priority or high-risk users affected? Was the email removed after delivery? Are there related Defender XDR or Sentinel incidents? If we only investigate one message, we may miss the bigger picture. Campaign-centric hunting helps the SOC move from this question: Is this email malicious? To this question: What is the full impact of this campaign? That shift is important because the response priority should be based on campaign impact, not only on a single alert. What Campaign Views Provides Campaign Views in Microsoft Defender for Office 365 help analysts investigate coordinated email attacks such as phishing and malware campaigns. From Campaign Views, analysts can review campaign-level information such as: Campaign name Campaign type Campaign subtype Targeted users Inboxed messages Clicked users Visited links Sender domains Sender IPs Payload URLs Delivery actions Campaign timeline Campaign flow This is useful during triage because it quickly shows whether an email is part of a wider attack. For example, one reported phishing message may look small at first. But if Campaign Views shows that the same campaign targeted 50 users, delivered messages to 15 inboxes, and had 2 users click the URL, the investigation becomes much more urgent. Where CampaignInfo Fits The CampaignInfo table gives analysts a KQL-based way to query campaign-related data. Some useful fields are: Field Purpose CampaignId Unique identifier for the campaign CampaignName Name of the campaign CampaignType Campaign category, such as Phish or Malware CampaignSubtype Additional context, such as brand being phished or malware family NetworkMessageId Unique identifier for the email message RecipientEmailAddress Recipient affected by the campaign Timestamp Time when the event was recorded For correlation, the most important field is usually: NetworkMessageId This field can help connect campaign data with other Defender XDR email tables, including: EmailEvents UrlClickEvents EmailPostDeliveryEvents EmailAttachmentInfo EmailUrlInfo This makes CampaignInfo a useful pivot table for campaign-level hunting. Important note: CampaignInfo is currently documented as Preview. Before using these queries in production analytics rules, validate the table availability, schema, and results in your own tenant. Practical Scenario An analyst receives a phishing alert in Microsoft Defender XDR. The alert is related to a user who received a suspicious email with a credential-harvesting URL. The analyst opens Campaign Views and sees that the message belongs to a wider phishing campaign. At that point, the investigation should not stop with the original user. The analyst should now ask: Who else received this campaign? How many messages were delivered? Which users clicked? Did any users click through the Safe Links warning? Were the messages removed after delivery? Are there related incidents in Microsoft Sentinel? The investigation flow could look like this: Start from Campaign Views in Microsoft Defender XDR. Identify the campaign details. Use CampaignInfo to list affected users and messages. Join with EmailEvents to validate delivery status. Join with UrlClickEvents to identify user interaction. Join with EmailPostDeliveryEvents to confirm remediation. Review related Microsoft XDR incidents in Microsoft Sentinel. Prioritize response based on campaign impact. Query 1: List Recent Campaigns The first query gives a simple overview of recent campaigns. CampaignInfo | where Timestamp > ago(14d) | summarize FirstSeen = min(Timestamp), LastSeen = max(Timestamp), AffectedUsers = dcount(RecipientEmailAddress), Messages = dcount(NetworkMessageId) by CampaignId, CampaignName, CampaignType, CampaignSubtype | order by LastSeen desc This helps analysts quickly identify campaigns that affected the organization during the selected period. Useful questions to ask from this output: Which campaigns are most recent? Which campaigns affected the most users? Are the campaigns phishing, malware, or spam? Is there a specific brand or malware family in the subtype? Are similar campaigns appearing repeatedly? Query 2: Understand Delivery Impact After identifying campaigns, the next step is to understand delivery impact. A campaign that was fully blocked is different from a campaign that reached user inboxes. let Campaigns = CampaignInfo | where Timestamp > ago(14d) | project CampaignId, CampaignName, CampaignType, CampaignSubtype, NetworkMessageId, RecipientEmailAddress; Campaigns | join kind=leftouter ( EmailEvents | where Timestamp > ago(14d) | project NetworkMessageId, RecipientEmailAddress, Subject, SenderFromAddress, SenderFromDomain, SenderIPv4, DeliveryAction, DeliveryLocation, ThreatTypes, DetectionMethods, Timestamp ) on NetworkMessageId, RecipientEmailAddress | summarize Messages = dcount(NetworkMessageId), AffectedUsers = dcount(RecipientEmailAddress), Subjects = make_set(Subject, 5), SenderDomains = make_set(SenderFromDomain, 10), SenderIPs = make_set(SenderIPv4, 10) by CampaignId, CampaignName, CampaignType, CampaignSubtype, DeliveryAction, DeliveryLocation | order by AffectedUsers desc, Messages desc This query helps separate campaigns that were blocked from campaigns that actually reached users. From a SOC perspective, delivered messages deserve closer attention, especially if they reached the inbox. Query 3: Identify Users Who Clicked Campaign URLs Delivery is important, but clicks usually increase the priority of the incident. This query joins campaign data with UrlClickEvents. let Campaigns = CampaignInfo | where Timestamp > ago(14d) | project CampaignId, CampaignName, CampaignType, CampaignSubtype, NetworkMessageId, RecipientEmailAddress; Campaigns | join kind=inner ( UrlClickEvents | where Timestamp > ago(14d) | project NetworkMessageId, AccountUpn, Url, ActionType, IsClickedThrough, ThreatTypes, DetectionMethods, IPAddress, Workload, ClickTime = Timestamp ) on NetworkMessageId | summarize FirstClick = min(ClickTime), LastClick = max(ClickTime), ClickEvents = count(), ClickedUsers = dcount(AccountUpn), ClickThroughUsers = dcountif(AccountUpn, IsClickedThrough == true), ClickedUrls = make_set(Url, 10), SourceIPs = make_set(IPAddress, 10) by CampaignId, CampaignName, CampaignType, CampaignSubtype | order by ClickThroughUsers desc, ClickedUsers desc, LastClick desc This query helps identify campaigns where users interacted with the payload. If a user clicked a phishing URL, the next step should usually include identity-focused investigation, such as reviewing sign-in activity, MFA status, session activity, and possible risky sign-ins. Query 4: Focus on Click-Through Events Safe Links may block access to a malicious site. In some cases, however, a user may continue through a warning page. Those cases should be reviewed carefully. let Campaigns = CampaignInfo | where Timestamp > ago(30d) | project CampaignId, CampaignName, CampaignType, CampaignSubtype, NetworkMessageId, RecipientEmailAddress; Campaigns | join kind=inner ( UrlClickEvents | where Timestamp > ago(30d) | where IsClickedThrough == true | project NetworkMessageId, AccountUpn, Url, ActionType, ThreatTypes, IPAddress, ClickTime = Timestamp ) on NetworkMessageId | project ClickTime, CampaignId, CampaignName, CampaignType, CampaignSubtype, AccountUpn, RecipientEmailAddress, Url, ActionType, ThreatTypes, IPAddress | order by ClickTime desc This is one of the most useful views during incident response. A click-through event does not automatically mean compromise, but it is a strong reason to investigate the user account further. Query 5: Confirm Post-Delivery Remediation A malicious message may be delivered first and removed later by ZAP, AIR, or manual remediation. This query joins CampaignInfo with EmailPostDeliveryEvents. let Campaigns = CampaignInfo | where Timestamp > ago(30d) | project CampaignId, CampaignName, CampaignType, CampaignSubtype, NetworkMessageId, RecipientEmailAddress; Campaigns | join kind=leftouter ( EmailPostDeliveryEvents | where Timestamp > ago(30d) | project NetworkMessageId, RecipientEmailAddress, RemediationTime = Timestamp, Action, ActionType, ActionTrigger, ActionResult, DeliveryLocation, SourceLocation ) on NetworkMessageId, RecipientEmailAddress | summarize RemediatedMessages = dcountif(NetworkMessageId, isnotempty(ActionType)), RemediationTypes = make_set(ActionType, 10), RemediationResults = make_set(ActionResult, 10), LastRemediation = max(RemediationTime) by CampaignId, CampaignName, CampaignType, CampaignSubtype | order by LastRemediation desc This helps answer a very important question: Were the delivered malicious messages actually removed? This is useful for both SOC triage and reporting because it shows not only detection, but also response. Query 6: Campaign Blast Radius Summary The following query combines campaign, delivery, click, and remediation data into one campaign-level view. let TimeRange = 30d; let Campaigns = CampaignInfo | where Timestamp > ago(TimeRange) | project CampaignId, CampaignName, CampaignType, CampaignSubtype, NetworkMessageId, RecipientEmailAddress; let Delivery = EmailEvents | where Timestamp > ago(TimeRange) | summarize DeliveryActions = make_set(DeliveryAction, 10), DeliveryLocations = make_set(DeliveryLocation, 10), DeliveredMessages = dcountif(NetworkMessageId, DeliveryAction =~ "Delivered"), JunkedMessages = dcountif(NetworkMessageId, DeliveryAction =~ "Junked"), BlockedMessages = dcountif(NetworkMessageId, DeliveryAction =~ "Blocked"), Subjects = make_set(Subject, 5), SenderDomains = make_set(SenderFromDomain, 10) by NetworkMessageId, RecipientEmailAddress; let Clicks = UrlClickEvents | where Timestamp > ago(TimeRange) | summarize ClickEvents = count(), ClickThroughEvents = countif(IsClickedThrough == true), FirstClick = min(Timestamp), LastClick = max(Timestamp), ClickedUrls = make_set(Url, 10) by NetworkMessageId; let Remediation = EmailPostDeliveryEvents | where Timestamp > ago(TimeRange) | summarize RemediationActions = make_set(ActionType, 10), LastRemediation = max(Timestamp) by NetworkMessageId, RecipientEmailAddress; Campaigns | join kind=leftouter Delivery on NetworkMessageId, RecipientEmailAddress | join kind=leftouter Clicks on NetworkMessageId | join kind=leftouter Remediation on NetworkMessageId, RecipientEmailAddress | summarize AffectedUsers = dcount(RecipientEmailAddress), Messages = dcount(NetworkMessageId), DeliveredMessages = sum(DeliveredMessages), JunkedMessages = sum(JunkedMessages), BlockedMessages = sum(BlockedMessages), TotalClickEvents = sum(ClickEvents), ClickThroughEvents = sum(ClickThroughEvents), Subjects = make_set(Subjects, 10), SenderDomains = make_set(SenderDomains, 10), ClickedUrls = make_set(ClickedUrls, 10), RemediationActions = make_set(RemediationActions, 10), LastClick = max(LastClick), LastRemediation = max(LastRemediation) by CampaignId, CampaignName, CampaignType, CampaignSubtype | extend SuggestedPriority = case( ClickThroughEvents > 0, "High", TotalClickEvents > 0, "Medium", DeliveredMessages > 0, "Medium", "Low" ) | order by SuggestedPriority asc, AffectedUsers desc, Messages desc This type of query can be useful during hunting sessions, incident review, and campaign reporting. The goal is not only to collect more data. The goal is to help the analyst decide what needs attention first. Correlating Campaign Activity with Microsoft Sentinel When Microsoft Defender XDR is connected to Microsoft Sentinel, incidents and alerts can be synchronized into the Sentinel incident queue. This allows the SOC to correlate campaign-related email activity with other security signals, such as: Suspicious sign-ins Identity alerts Endpoint alerts Cloud app activity OAuth consent activity Data exfiltration attempts Related Microsoft XDR incidents For example, if a user clicked a phishing URL, the SOC can then review whether the same user had suspicious sign-in activity shortly after the click. The following query is a simple starting point for reviewing Microsoft XDR incidents in Microsoft Sentinel. SecurityIncident | where TimeGenerated > ago(30d) | where ProviderName == "Microsoft XDR" | where Title has_any ("phish", "phishing", "email", "malware", "campaign") | summarize Incidents = count(), HighSeverity = countif(Severity == "High"), MediumSeverity = countif(Severity == "Medium"), Closed = countif(Status == "Closed"), Active = countif(Status == "Active") by bin(TimeGenerated, 1d) | order by TimeGenerated desc This query does not replace campaign hunting. It simply helps analysts understand how email-related activity is represented in the Sentinel incident queue. Suggested SOC Workflow A practical campaign-centric workflow could look like this: Step 1: Start from Campaign Views Review campaigns with delivered messages, clicked users, visited links, or high user impact. Step 2: Pivot to KQL Use CampaignInfo to list campaign-related messages and affected recipients. Step 3: Validate Delivery Join with EmailEvents to confirm whether messages were blocked, junked, delivered, or replaced. Step 4: Review User Interaction Join with UrlClickEvents to identify users who clicked URLs or clicked through Safe Links warnings. Step 5: Confirm Remediation Join with EmailPostDeliveryEvents to confirm whether delivered messages were removed after delivery. Step 6: Correlate in Sentinel Review related Microsoft XDR incidents and correlate with identity, endpoint, and cloud activity. Step 7: Decide Response Depending on the impact, the SOC may decide to: Escalate the incident Notify affected users Review user sign-ins Revoke user sessions Reset passwords Block sender domains or URLs Submit false negatives Create a watchlist for related indicators Tune analytics rules or response processes Suggested Priority Logic Not every campaign needs the same level of response. A simple triage model could be: Condition Suggested priority Campaign blocked before delivery Low Campaign delivered to junk Low to Medium Campaign delivered to inbox Medium Campaign delivered to multiple inboxes Medium to High User clicked URL High User clicked through warning High Priority account clicked High Click followed by suspicious sign-in Critical This model should be adapted to each organization’s risk profile and response process. Limitations and Things to Validate Before using this approach in production, validate the following: Defender for Office 365 Plan 2 availability Campaign Views permissions CampaignInfo table availability Defender XDR connector configuration Advanced hunting event streaming Field names in your environment Retention period Data latency Join behavior using NetworkMessageId Whether click events can be joined to email metadata in all cases One important limitation is that some URL click events may not join cleanly with email metadata. For example, clicks from Drafts or Sent Items may not have the same message metadata available for correlation. Also, because CampaignInfo is currently documented as Preview, I would avoid depending on it alone for critical production automation without testing and validation.103Views0likes0CommentsMicrosoft Sentinel data lake FAQ
Microsoft Sentinel data lake (generally available) is a purpose‑built, cloud‑native security data lake. It centralizes all security data in an open format, serving as the foundation for agentic defense, enhanced security insights, and graph-based enrichment. It offers cost‑effective ingestion, long‑term retention, and advanced analytics. In this blog we offer answers to many of the questions we’ve heard from our customers and partners. General questions What is the Microsoft Sentinel data lake? Microsoft has expanded its industry-leading SIEM solution, Microsoft Sentinel, to include a unified, security data lake, designed to help optimize costs, simplify data management, and accelerate the adoption of AI in security operations. This modern data lake serves as the foundation for the Microsoft Sentinel platform. It has a cloud-native architecture and is purpose-built for security—bringing together all security data for greater visibility, deeper security analysis, contextual awareness and agentic defense. It provides affordable, long-term retention, allowing organizations to maintain robust security while effectively managing budgetary requirements. What are the benefits of Sentinel data lake? Microsoft Sentinel data lake is purpose built for security offering flexible analytics, cost management, and deeper security insights. Sentinel data lake: Centralizes security data delta parquet and open format for easy access. This unified data foundation accelerates threat detection, investigation, and response across hybrid and multi-cloud environments. Enables data federation by allowing customers to access data in external sources like Microsoft Fabric, ADLS and Databricks from the data lake. Federated data appears alongside native Sentinel data, enabling correlated hunting, investigation, and custom graph analysis across a broader digital estate. Offers a disaggregated storage and compute pricing model, allowing customers to store massive volumes of security data at a fraction of the cost compared to traditional SIEM solutions. Allows multiple analytics engines like Kusto, Spark, and ML to run on a single data copy, simplifying management, reducing costs, and supporting deeper security analysis. Integrates with GitHub Copilot and VS Code empowering SOC teams to automate enrichment, anomaly detection, and forensic analysis. Supports AI agents via the MCP server, allowing tools like GitHub Copilot to query and automate security tasks. The MCP Server layer brings intelligence to the data, offering Semantic Search, Query Tools, and Custom Analysis capabilities that make it easier to extract insights and automate workflows. Provides streamlined onboarding, intuitive table management, and scalable multi-tenant support, making it ideal for MSSPs and large enterprises. The Sentinel data lake is designed for security workloads, ensuring that processes from ingestion to analytics meet evolving cybersecurity requirements. Is Microsoft Sentinel SIEM going away? No. Microsoft is expanding Sentinel into an AI powered end-to-end security platform that includes SIEM and new platform capabilities - Security data lake, graph-powered analytics and MCP Server. SIEM remains a core component and will be actively developed and supported. Getting started What are the prerequisites for Sentinel data lake? To get started: Connect your Sentinel workspace to Microsoft Defender prior to onboarding to Sentinel data lake. Once in the Defender experience see data lake onboarding documentation for next steps. Note: Sentinel is moving to the Microsoft Defender portal and the Sentinel Azure portal will be retired by March 31, 2027. I am a Sentinel-only customer, and not a Defender customer. Can I use the Sentinel data lake? Yes. You must connect Sentinel to the Defender experience before onboarding to the Sentinel data lake. Microsoft Sentinel is generally available in the Microsoft Defender portal, with or without Microsoft Defender XDR or an E5 license. If you have created a log analytics workspace, enabled it for Sentinel and have the right Microsoft Entra roles (e.g. Global Administrator + Subscription Owner, Security Administrator + Sentinel Contributor), you can enable Sentinel in the Defender portal. For more details on how to connect Sentinel to Defender review these sources: Microsoft Sentinel in the Microsoft Defender portal In what regions is Sentinel data lake available? For supported regions see: Geographical availability and data residency in Microsoft Sentinel | Azure Docs. Is there an expected release date for Microsoft Sentinel data lake in GCC, GCC-H, and DoD? While the exact date is not yet finalized, we plan to expand Sentinel data lake to the US Government environments. . How will URBAC and Entra RBAC work together to manage the data lake given there is no centralized model? Entra RBAC will provide broad access to the data lake (URBAC maps the right permissions to specific Entra role holders: GA/SA/SO/GR/SR). URBAC will become a centralized pane for configuring non-global delegated access to the data lake. For today, you will use this for the “default data lake” workspace. In the future, this will be enabled for non-default Sentinel workspaces as well – meaning all workspaces in the data lake can be managed here for data lake RBAC requirements. Azure RBAC on the Log Analytics (LA) workspace in the data lake is respected through URBAC as well today. If you already hold a built-in role like log analytics reader, you will be able to run interactive queries over the tables in that workspace. Or, if you hold log analytics contributor, you can read and manage table data. For more details see: Roles and permissions in the Microsoft Sentinel platform | Microsoft Learn Data ingestion and storage How do I ingest data into the Sentinel data lake? To ingest data into the Sentinel data lake, you can use existing Sentinel data connectors or custom connectors to bring data from Microsoft and third-party sources. Data can be ingested into the analytics tier or the data lake tier. Data ingested into the analytics tier is automatically mirrored to the lake (at no additional cost). Alternatively, data that is not needed in the analytics tier can be ingested directly into the data lake. Data retention is configured directly in table management, for both analytics retention and data lake storage. Note: Certain tables do not support data lake-only ingestion via either API or data connector UI. See here for more information: Custom log tables. What is Microsoft’s guidance on when to use analytics tier vs. the data lake tier? Sentinel data lake offers flexible, built-in data tiering (analytics and data lake tiers) to effectively meet diverse business use cases and achieve cost optimization goals. Analytics tier: Is ideal for high-performance, real-time, end-to-end detections, enrichments, investigation and interactive dashboards. Typically, high-fidelity data from EDRs, email gateways, identity, SaaS and cloud logs, threat intelligence (TI) should be ingested into the analytics tier. Data in the analytics tier is best monitored proactively with scheduled alerts and scheduled analytics to enable security detections Data in this tier is retained at no cost for up to 90 days by default, extendable to 2 years. A copy of the data in this tier is automatically available in the data lake tier at no extra cost, ensuring a unified copy of security data for both tiers. Data lake tier: Is designed for cost-effective, long-term storage. High-volume logs like NetFlow logs, TLS/SSL certificate logs, firewall logs and proxy logs are best suited for data lake tier. Customers can use these logs for historical analysis, compliance and auditing, incident response (IR), forensics over historical data, build tenant baselines, TI matching and then promote resulting insights into the analytics tier. Customers can run full Kusto queries, Spark Notebooks and scheduled jobs over a single copy of their data in the data lake. Customers can also search, enrich and promote data from the data lake tier to the analytics tier for full analytics. For more details see documentation. What does it mean that a copy of all new analytics tier data will be available in the data lake? When Sentinel data lake is enabled, a copy of all new data ingested into the analytics tier is automatically duplicated into the data lake tier. This means customers don’t need to manually configure or manage this process, every new log or telemetry added to the analytics tier becomes instantly available in the data lake. This allows security teams to run advanced analytics, historical investigations, and machine learning models on a single, unified copy of data in the lake, while still using the analytics tier for real-time SOC workflows. It’s a seamless way to support both operational and long-term use cases—without duplicating effort or cost. What is the guidance for customers using data federation capability in Sentinel data lake? Starting April 1, 2026, federate data from Microsoft Fabric, ADLS, and Azure Databricks into Sentinel data lake. Use data federation when data is exploratory, infrequently accessed, or must remain at source due to governance, compliance, sovereignty, or contractual requirements. Ingest data directly into Sentinel to unlock full SIEM capabilities, always-on detections, advanced automation, and AI‑driven defense at scale. This approach lets security teams start where their data already lives — preserving governance, then progressively ingest data into Sentinel for full security value. Is there any cost for retention in the analytics tier? Analytics ingestion includes 90 days of interactive retention, at no additional cost. Simply set analytics retention to 90 days or less. Analytics retention beyond 90 days will incur a retention cost. Data can be retained longer within the data lake by using the “total retention” setting. This allows you to extend retention within the data lake for up to 12 years. While data is retained within the analytics tier, there is no charge for the mirrored data within the lake. Retaining data in the lake beyond the analytics retention period incurs additional storage costs. See documentation for more details: Manage data tiers and retention in Microsoft Sentinel | Microsoft Learn What is the guidance for Microsoft Sentinel Basic and Auxiliary Logs customers? If you previously enabled Basic or Auxiliary Logs plan in Sentinel: You can view Basic Logs in the Defender portal but manage it from the Log Analytics workspace. To manage it in the Defender portal, you must change the plan from Basic to Analytics. Once the table is transitioned to the analytics tier, if desired, it can then be transitioned to the data lake. Existing Auxiliary Log tables will be available in the data lake tier for use once the Sentinel data lake is enabled. Billing for these tables will automatically switch to the Sentinel data lake meters. Microsoft Sentinel customers are recommended to start planning their data management strategy with the data lake. While Basic and Auxiliary Logs are still available, they are not being enhanced further. Sentinel data lake offers more capabilities at a lower price point. Please plan on onboarding your security data to the Sentinel data lake. Azure Monitor customers can continue to use Basic and Auxiliary Logs for observability scenarios. What happens to customers that already have Archive logs enabled? If a customer has already configured tables for Archive retention, existing retention settings will not change and will be automatically inherited by the Sentinel data lake. All data, including existing data in archive retention will be billed using the data lake storage meter, benefiting from 6x data compression. However, the data itself will not move. Existing data in archive will continue to be accessible through Sentinel search and restore experiences: o Data will not be backfilled into the data lake. o Data will be billed using the data lake storage meter. New data ingested after enabling the data lake: o Will be automatically mirrored to the data lake and accessible through data lake explorer. o Data will be billed using the data lake storage meter. Example: If a customer has 12 months of total retention enabled on a table, 2 months after enabling ingestion into the Sentinel data lake, the customer will still have access to 10 months of archived data (through Sentinel search and restore experiences), but access to only 2 months of data in the data lake (since the data lake was enabled). Key considerations for customers that currently have Archive logs enabled: The existing archive will remain, with new data ingested into the data lake going forward; previously stored archive data will not be backfilled into the lake. Archive logs will continue to be accessible via the Search and Restore tab under Sentinel. If analytics and data lake mode are enabled on table, which is the default setting for analytics tables when Sentinel data lake is enabled, all new data will be ingested into the Sentinel data lake. There will only be one storage meter (which is data lake storage) going forward. Archive will continue to be accessible via Search and Restore. If Sentinel data lake-only mode is enabled on table, new data will be ingested only into the data lake; any data that’s not already in the Sentinel data lake won’t be migrated/backfilled. Only data that was previously ingested under the archive plan will be accessible via Search and Restore. What is the guidance for customers using Azure Data Explorer (ADX) alongside Microsoft Sentinel? Some customers might have set up ADX cluster for their DIY lake setup. Customers can choose to continue using that setup and gradually migrate to Sentinel data lake for new data that they want to manage. The lake explorer will support federation with ADX to enable the customers to migrate gradually and simplify their deployment. What happens to the Defender XDR data after enabling Sentinel data lake? By default, Defender XDR tables are available for querying in advanced hunting, with 30 days of analytics tier retention included with the XDR license. To retain data beyond this period, an explicit change to the retention setting is required, either by extending the analytics tier retention or the total retention period. You can extend the retention period of supported Defender XDR tables beyond 30 days and ingest the data into the analytics tier. For more information see Manage XDR data in Microsoft Sentinel. You can also ingest XDR data directly into the data lake tier. See here for more information. A list of XDR advanced hunting tables supported by Sentinel are documented here: Connect Microsoft Defender XDR data to Microsoft Sentinel | Microsoft Learn. KQL queries and jobs Is KQL and Notebook supported over the Sentinel data lake? Yes, via the data lake KQL query experience along with a fully managed Notebook experience which enables spark-based big data analytics over a single copy of all your security data. Customers can run queries across any time range of data in their Sentinel data lake. In the future, this will be extended to enable SQL query over lake as well. Note: Triggering a KQL job directly via an API or Logic App is not yet supported but is on the roadmap. Why are there two different places to run KQL queries in Sentinel experience? Advanced hunting queries both XDR and analytics tables, with compute cost included. Data lake explorer only queries data in the lake and incurs a separate compute cost. Consolidating advanced hunting and KQL explorer user interfaces is on the roadmap. This will provide security analysts a unified query experience across both analytics and data lake tiers. Where is the output from KQL jobs stored? KQL jobs are written into existing or new custom tables in the analytics tier. Is it possible to run KQL queries on multiple data lake tables? Yes, you can run KQL interactive queries and jobs using operators like join or union. Can KQL queries (either interactive or via KQL jobs) join data across multiple workspaces? Security teams can run multi-workspace KQL queries for broader threat correlation Pricing and billing How does a customer pay for Sentinel data lake? Billing is automatically enabled at the time of onboarding based on Azure Subscription and Resource Group selections. Customers are then charged based on the volume of data ingested, retained, and analyzed (e.g. KQL Queries and Jobs). See Sentinel pricing page for more details. 2. What are the pricing components for Sentinel data lake? Sentinel data lake offers a flexible pricing model designed to optimize security coverage and costs. At a high level, pricing is based on the volume of data ingested/processed, the volume of data retained, and the volume of data processed. For specific meter definitions, see documentation. 3. How does the business model for Sentinel SIEM change with the introduction of the data lake? There is no change to existing Sentinel analytics tier ingestion business model. Sentinel data lake has separate meters for ingestion, storage and analytics. 4. What happens to the existing Sentinel SIEM and related Azure Monitor billing meters when a customer onboards to Sentinel data lake? When a customer onboards to the Sentinel data lake, nothing changes with analytic ingestion or retention. Customers using data archive and Auxiliary Logs will automatically transition to the new data lake meters. How does data lake storage affect cost efficiency for high volume data retention? Sentinel data lake offers cost-effective, long-term storage with uniform data compression of 6:1 across all data sources, applicable only to data lake storage. Example: For 600GB of data stored, you are only billed for 100GB compressed data. This approach allows organizations to retain greater volumes of security data over extended periods cost-effectively, thereby reducing security risks without compromising their overall security posture. here How “Data Processing” billed? To support the ingestion and standardization of diverse data sources, the Data Processing feature applies a $0.10 per GB (US East) charge for all data ingested into the data lake. This feature enables a broad array of transformations like redaction, splitting, filtering and normalization. The data processing charge is applied per GB of uncompressed data Note: For regional pricing, please refer to the “Data processing” meter within the Microsoft Sentinel Pricing official documentation. Does “Data processing” meter apply to analytics tier data mirrored in the data lake? No. Data processing charge will not be applied to mirrored data. Data mirrored from the analytic tier is not subject to either data ingestion or processing charges. How is retention billed for tables that use data lake-only ingestion & retention? Sentinel data lake decouples ingestion, storage, and analytics meters. Customers have the flexibility to pay based on how data is retained and used. For tables that use data lake‑only ingestion, there is no included free retention—unlike the analytics tier, which includes 90 days of analytics retention. Retention charges begin immediately once data is stored in the data lake. Data lake storage billing is based on compressed data size rather than raw ingested volume, which significantly reduces storage costs and delivers lower overall retention spend for customers. Does data federation incur charges? Data federation does not generate any ingestion or storage fees in Sentinel data lake. Customers are billed only when they run analytics or queries on federated data, with charges based on Sentinel data lake compute and analytics meters. This means customers pay solely for actual data usage, not mere connectivity. How do I understand Sentinel data lake costs? Sentinel data lake costs driven by three primary factors: how much data is ingested, how long that data is retained, and how the data is used. Customers can flexibly choose to ingest data into the analytics tier or data lake tier, and these architectural choices directly impact cost. For example, data can be ingested into the analytics tier—where commitment tiers help optimize costs for high data volumes—or ingested data directly into the Sentinel data lake for lower‑cost ingestion, storage, and on‑demand analysis. Customers are encouraged to work with their Microsoft account team to obtain an accurate cost estimate tailored to their environment. See Sentinel pricing page to understand Sentinel pricing. How do I manage Sentinel data lake costs? Built-in cost management experiences help customers with cost predictability, billing transparency, and operational efficiency. Reports provide customers with insights into usage trends over time, enabling them to identify cost drivers and optimize data retention and processing strategies. Set usage-based alerts on specific meters to monitor and control costs. For example, receive alerts when query or notebook usage passes set limits, helping avoid unexpected expenses and manage budgets. See our Sentinel cost management documentation to learn more. If I’m an Auxiliary Logs customer, how will onboarding to the Sentinel data lake affect my billing? Once a workspace is onboarded to Sentinel data lake, all Auxiliary Logs meters will be replaced by new data lake meters. Do we charge for data lake ingestion and storage for graph experiences? Microsoft Sentinel graph-based experiences are included as part of the existing Defender and Purview licenses. However, Sentinel graph requires Sentinel data lake and specific data sources to build the underlying graph. Enabling these data sources will incur ingestion and data lake storage costs. Note: For Sentinel SIEM customers, most required data sources are free for analytics ingestion. Non-entitled sources such as Microsoft Entra ID logs will incur ingestion and data lake storage costs. How is Entra asset data and ARG data billed? Data lake ingestion charges of $0.05 per GB (US EAST) will apply to Entra asset data and ARG data. Note: This was previously not billed during public preview and is billed since data lake GA. To learn more, see: https://learn.microsoft.com/azure/sentinel/datalake/enable-data-connectors When a customer activates Sentinel data lake, what happens to tables with archive logs enabled? To simplify billing, once the data lake is enabled, all archive data will be billed using the data lake storage meter. This provides consistent long-term retention billing and includes automatic 6x data compression. For most customers, this change results in lower long‑term retention costs. However, customers who previously had discounted archive retention pricing will not automatically receive the same discounts on the new data lake storage meters. In these cases, customers should engage their Microsoft account team to review pricing implications before enabling the Sentinel data lake. Thank you Thank you to our customers and partners for your continued trust and collaboration. Your feedback drives our innovation, and we’re excited to keep evolving Microsoft Sentinel to meet your security needs. If you have any questions, please don’t hesitate to reach out—we’re here to support you every step of the way. Learn more: Get started with Sentinel data lake today: https://aka.ms/Get_started/Sentinel_datalake Microsoft Sentinel AI-ready platform: https://aka.ms/Microsoft_Sentinel Sentinel data lake videos: https://aka.ms/Sentineldatalake_videos Latest innovations and updates on Sentinel: https://aka.ms/msftsentinelblog Sentinel pricing page: https://aka.ms/MicrosoftSentinel_Pricing6.3KViews1like9CommentsPending Approval/Provisioning for Microsoft Defender XDR Lab/Trial Environment
Hello Microsoft Community Team, On June 26, 2026, our organization applied for a Microsoft 365 Developer Environment / Free Trial to support evaluation of the Microsoft Defender XDR Lab environment. To date, the environment has not been provisioned, and we have not received any status updates or confirmation. Impact: Current Status: We are currently utilizing our production environment to test project capabilities, which poses risks and limitations. Future Intent: Our organization plans to transition to a full, paid Business/Enterprise purchase immediately upon proving the platform’s benefits. Urgency: This delay is stalling our evaluation phase. We urgently need this environment onboarded and activated so we can proceed with deployment tests and subsequent procurement. Request: Please review the status of our registration and expedite the onboarding/provisioning of this developer environment. Thank you for your prompt assistance.22Views0likes0CommentsThe Worm in the Supply Chain: How Defender for Endpoint and Sentinel for SAP BTP Caught Shai-Hulud
On 29 April 2026, malicious versions of multiple SAP ecosystem npm packages were briefly published, creating a supply-chain exposure for SAP Cloud Application Programming (CAP) development environments and CI/CD pipelines. For a brief window that morning, affected developers have executed a credential-stealing payload on a workstation or, in higher-impact cases, within a CI/CD pipeline. SAP developers don't usually think of themselves as a juicy npm target. CAP, BTP, Fiori - that's enterprise turf, not crypto-stealer type territory – Until it is. Join me for the ride. See our latest click-video for an even more dynamic experience of SAP compromises. Affected packages and scope Four official npm packages from the SAP development ecosystem were published in malicious versions that day. Security researchers are calling the campaign "Mini Shai-Hulud" - the little cousin of the worm family that has been chewing its way through open-source registries for months. So, the "mini" part is a generous description in my opinion. Shai-Hulud has wriggled directly into the SAP supply chain, and that detail alone deserves a pause... SAP CAP is now interesting enough to have become a target. Four packages, all wearing legitimate SAP branding, all quietly swapped for evil twins: @cap-js/sqlite v2.2.2 @cap-js/postgres v2.2.2 @cap-js/db-service v2.10.1 mbt v1.2.48 These packages are not peripheral dependencies. The @cap-js/* modules are part of the SAP CAP Model used across custom development on SAP BTP, while mbt is the Cloud MTA Build Tool commonly embedded in CI/CD workflows that package and deploy Multi-Target Applications to BTP and on-premises environments. At roughly 930,000 weekly downloads, the combined exposure created meaningful downstream attack surface. The good news: SAP spotted the compromise fast, yanked the bad versions, and shipped clean replacements. The official guidance lives in SAP Security Note 3747787 - which carries the list of indicators of compromise, file hashes, and mitigation steps. Enough theory and evidence talk! Now, SHOW ME the detection! When the worm stirs beneath the sand, weak defenses vanish first. Observed telemetry in Microsoft Security products See below excerpt of Microsoft Defender for Endpoint from a compromised developer machine. The worm was neutralized immediately. Check the detection time (same day of release): Windows Defender AV detected malware ToString: DefenderDetection: File: /Users/User***/Projects/dara-api-manager-ui/node_modules/mbt/File***.js, Sha256: *** [Trojan:JS/SPchnStlr.BB], BlockingStatus: Prevented, BlockingStatusPriority: 900 DetectionTime: 2026-04-29 11:52:11Z DetectorName: Microsoft.Cyber.ObservationDetectors.DefenderConcreteDetector Observations (2): DefenderObservation Description: Defender detected and quarantined 'Trojan:JS/SPchnStlr.BB' in file 'File***.js' ThreatCategory = Trojan, ThreatFamily = SPchnStlr, How the Threat Actors Operationalized the Stolen Data The compromise allowed harvesting GitHub tokens, AWS/Azure/GCP secrets, npm credentials, Kubernetes config, SSH keys, .npmrc and .git-credentials files, and CI/CD environment variables. The hackers created a public GitHub repository on the victim’s own account, tagged with the description “A Mini Shai-Hulud has Appeared“ to exfiltrate their reaping. Within hours, more than a thousand such repositories were visible in public GitHub search. For additional views on the topic check out the blogs of our Sentinel for SAP partners: Onapsis, Pathlock, and SecurityBridge. Containment and Impact Reduction If you were not as lucky as the developer using Defender for Endpoint and VS Code, you need end to end monitoring of your landscape in and around SAP. Once the worm is loose with cloud tokens it may appear in various unexpected places. Microsoft Sentinel Solution for SAP covers your ERP crown jewels, your SAP BTP landscape and allows informed correlation with the rest of your IT estate. Microsoft’s correlation engine: ensures traceability automatic attack disruption and just-in-time hardening of potential attack paths. Developers using the cloud-based IDE SAP Business Application Studio are out of reach by Defender for Endpoint but profit from threat monitoring through Sentinel for SAP BTP integrating SAP BTP’s malware scanner the same way. See this in action in this click-video and in below screenshot. SOC analysts get actionable insights and tailored guidance from Security Copilot once SAP BTP signals are added to the Microsoft incident graph - no matter where the threat involving SAP originates from. Getting Started with Sentinel Solution for SAP Rollout of Sentinel for SAP BTP can happen immediately. Learn more from our deployment guide. Check out the security content reference for more info out-of-the-box detections. Sentinel for SAP which covers your ERP solutions and more, requires configuration of SAP Integration Suite as intermediary step. Learn more from our deployment guide. Check out the security content reference for more info out-of-the-box detections Final Words This incident illustrates how far the SAP BTP attack surface now extends and why patching alone is insufficient once malicious code reaches developer tooling and build infrastructure. Effective defense also requires telemetry, correlation, and response coverage across SAP and non-SAP environments. See you out there folks! #Kudos to Mahesh Mandva and Cameron Gardiner on riding shai-holud with me. Feel free to reach out to talk more about SAP Cyber Security. Cheers, Martin Useful Links SAP Note 3747787 with mitigation guide Mini Shai-Hulud: Multi-Ecosystem Developer Supply Chain Attack – Lab Space Click-Demo for SAP Cyber Security with Microsoft Sentinel for SAP Security Content | Microsoft Learn Sentinel for SAP BTP Security Content | Microsoft Learn407Views0likes0CommentsAmazon Security Lake Integration with Microsoft Sentinel: Parquet at the Gates
This blog has been jointly published by ChitreshPandit and Arijit_Paul. In this blog post, we explore how centralized AWS Security Lake data can be transformed and streamed into Microsoft Sentinel using an AWS Lambda-based pipeline for near real-time ingestion. A story of cost, control, and custom engineering at cloud scale Every organization strives to design its security architecture to help address architectural complexity and support cost‑aware, scalable designs, depending on workload characteristics and implementation choices. Across multiple customer environments, we increasingly see a consistent pattern emerge - security telemetry from various AWS services is not ingested in isolation, but is deliberately centralized into Amazon Security Lake. This approach reflects a maturity in design, where organizations move beyond service-level integrations and instead adopt a unified data strategy. Amazon Security Lake enables this by aggregating security data from services such as Route53, WAF, Kubernetes, and others into a centralized, governed repository. The data is normalized and stored in an open, analytics-friendly format, often leveraging Apache Parquet, a columnar storage format optimized for large-scale processing and cost-efficient storage. This can allow organizations to retain high volumes of security data while maintaining performance and may help optimize storage efficiency in certain scenarios, depending on data volume, retention policies, and analytics patterns. However, this architectural choice introduces a new consideration. Microsoft Sentinel, when integrated into such environments, typically expects ingestion through connector-driven pipelines and streaming event models. In contrast, Security Lake represents a batch-oriented, schema-driven data platform. Rather than treating this as a constraint, it becomes an opportunity to rethink how data should flow between these systems. In this blog, we explore how a streaming bridge architecture can be implemented to align Amazon Security Lake with Microsoft Sentinel’s ingestion model. The approach leverages a combination of AWS Lambda and event-driven patterns to process data as it lands in Amazon S3, transforms it into a Sentinel-compatible format, and streams it through Azure Event Hub into Microsoft Sentinel using Data Collection Rules (DCRs) and Data Collection Endpoints (DCEs). This approach can support lower‑latency ingestion patterns when configured appropriately and compared to batch‑only processing models while preserving the lake-first architecture, allowing organizations to support analytics, visualization, and threat hunting activities using the ingested data. For demonstration, this implementation focuses on ingesting the following data sources from Amazon Security Lake: Amazon EKS audit and runtime events Route 53 DNS query logs AWS WAF access logs AWS Lambda execution activity Amazon S3 access events Note: The solution and code provided in this blog are not an officially supported Microsoft solution and do not guarantee performance, reliability, availability, or support. No service-level agreements (SLAs) are included. Readers are responsible for validating suitability for their environment. Before we begin, let us briefly discuss Amazon Security Lake and the Parquet format. Amazon Security Lake and the Parquet Constraint Amazon Security Lake provides a centralized, S3-backed repository for security telemetry, addressing fragmentation across services, accounts, and regions. Instead of service-level ingestion pipelines, logs from sources such as EKS, Route 53, WAF, Lambda, and S3 are aggregated into a single, governed data layer, enabling consistent visibility, separation of duties, and cost-efficient storage at scale. This data is stored in Apache Parquet, a columnar format optimized for analytics—delivering high compression, schema evolution, and efficient, selective reads across engines like Athena and Spark. Microsoft Sentinel operates on a streaming ingestion model expecting JSON payloads, source-specific pipelines, and continuous event flows. In lake-first architectures, reintroducing service-level ingestion is neither practical nor efficient. The requirement, therefore, is to bridge the two models -preserving Parquet for storage while enabling event-driven ingestion at the point of consumption. In the following sections, we will create an Event Hub, Configure the Lambda function and once the streaming is configured in Event Hub we will configure the Data Collection Rules, Data Collection endpoints, Data Collection Rule associations to configure the ingestion pipeline from Event Hub to Sentinel. Pre-requisites: Log Analytics workspace where you have at least contributor rights. Your Log Analytics workspace needs to be linked to a dedicated cluster or to have a commitment tier. Event Hubs namespace that permits public network access. If public network access is disabled, ensure that "Allow trusted Microsoft services to bypass this firewall" is set to "Yes." Event Hubs with events flowing in. In this implementation, events are sent to Event Hubs by the AWS Lambda function configured in the steps below - no manual event sending is required. Appropriate IAM roles in AWS accounts to configure SQS queue, Lambda function, IAM policies, etc. The Event Hub To begin, we need to create an Event Hub. Azure Event Hubs is a fully managed, high-throughput event ingestion and streaming platform, designed to support high event volumes within documented service limits, subject to configuration and tier selection. SKUs (Standard vs Premium) Standard tier is a throughput-unit (TU) based model, where capacity is explicitly controlled and shared across the namespace. Premium tier provides isolated compute and memory via processing units (PU), which can provide more consistent performance characteristics and higher throughput capacity compared to shared models, depending on workload. Additionally, Azure Event Hubs offers a Dedicated tier, which is a fully isolated, single-tenant cluster for enterprise-scale workloads with higher throughputs (at significantly higher cost). Throughput Characteristics (Standard Tier) A single Throughput Unit (TU) provides: Ingress: up to 1 MB/s Egress: up to 2 MB/s A Standard namespace can scale to a maximum of 40 TUs, giving: Max ingress: 40 MB/s Max egress: 80 MB/s All Event Hubs, partitions, and consumers within the namespace share this TU capacity, making it a central ingestion buffer for streaming pipelines rather than a per-source scaling model. Which SKU to choose: For ingress/egress up to 40/80 MB/s, a Standard SKU may be suitable. Higher volumes may warrant consideration of Premium, depending on workload requirements. Azure Event Hubs Concepts: Event Hub Namespace: A logical container that provides the endpoint, networking boundary, and shared throughput capacity (TUs/PUs) for all Event Hubs within it. Event Hub: An individual event stream (topic) within the namespace where events are ingested, stored, and read in a partitioned, ordered manner for parallel processing. Reference: Event Hubs features and terminology - Azure Event Hubs Creating Event Hub namespace, Event Hub entities, and considerations: Create the Azure Event Hub Namespace, in this example, we create a Standard Event Hub with minimum 1 TU with Auto Inflate on enabling scaling upto the maximum 40 TUs. Note the maximum Throughput Units should be based on the size of the logs expected per second from Amazon Security Lake. Since the Event Hub will be used to ingest data in Azure Monitor (Sentinel Workspace) please check the Supported Regions. Ensure Event Hub region alignment with Microsoft Sentinel. Configure TU based on expected ingestion rate Once the Event Hub is created, enable the Local Authentication, since the AWS Lambda code we are using (discussed in the next section) uses Shared access signature to connect to the Azure Event Hub. See Shared Access Signatures. Create individual Event Hubs within the namespace created in Step 1. One Event Hub is required per log type - for example, if Amazon Security Lake includes EKS, Route 53, and WAF logs, then three Event Hub entities should be created. Why this matters: Easier DCR mapping Avoids schema conflicts Create a consumer group in each Event Hub we created in the previous step. Consumer Group is an independent view of an event stream that allows multiple applications to read the same events separately, each maintaining its own position (offset) in the stream. Event Hub Features Avoid using the $Default consumer group. With the Event Hub namespace, entities, and consumer groups in place, the receiving end of the pipeline is ready. The next step is to configure the AWS Lambda function that will translate Security Lake's Parquet files into the JSON events that Event Hub expects. The AWS Lambda Function (SQS-driven Parquet → JSON) In this architecture, AWS Lambda is the translation layer, not an ingestion source. Instead of being invoked directly by S3, Security Lake emits S3 object‑creation notifications into Amazon SQS, and SQS becomes the Lambda trigger. This decoupling is intentional: SQS absorbs bursts of newly written Parquet objects which can help increase resilience of the ingestion pipeline under bursty or variable workloads. Once triggered, the Lambda function processes each queued S3 notification end‑to‑end: it derives the log type from the S3 object key, downloads the Parquet file, converts each row into a discrete JSON event, and forwards the resulting events to Azure Event Hub in batches for downstream ingestion via DCR/DCE. A practical consideration is event size management. Some sources - especially EKS audit telemetry - can carry large, nested fields that are not always useful for Sentinel analytics. For those log types, the Lambda function drops non‑essential fields during transformation to keep each event within Azure ingestion constraints; oversized fields can exceed the 64 KB Azure Monitor field size limit and disrupt ingestion (Fields more than 64 KB will be truncated in Log Analytics Workspace). Solution Architecture The diagram above illustrates the end-to-end data flow: logs originate in AWS, are converted by AWS Lambda, streamed through Azure Event Hub, and stored in Microsoft Sentinel. Amazon Security Lake writes Parquet files to a centralized S3 bucket as logs arrive from source services (CloudTrail, EKS, VPC Flow, WAF, Route 53, and others). Amazon SQS receives S3 event notifications from Security Lake and queues them as Lambda triggers. AWS Lambda picks up each SQS message, identifies the log source from the S3 object key, downloads the Parquet file, converts each row to JSON, and forwards the events to Azure Event Hub. Azure Event Hub receives the JSON events and makes them available for ingestion into Microsoft Sentinel via a Data Collection Rule (DCR). Microsoft Sentinel ingests the data into a custom log table, where it is available for detection rules, hunting queries, and dashboards. The full Lambda function code is available in the GithubRepository-LambdaFunction. Refer to the readme for more details. How the Lambda Function works At a high level, the Lambda function performs four things in sequence for every file it processes: Identify the log source: Security Lake organizes files under a structured S3 key path that includes the log type (for example, CLOUD_TRAIL_MGMT, EKS_AUDIT, VPC_FLOW). The function reads this key path to determine which log source the file belongs to, and routes it to the corresponding Azure Event Hub entity. Files that cannot be matched to a known log type are skipped and logged as warnings. Download and decompress the Parquet file: The function streams the Parquet file from S3 directly to local Lambda storage rather than loading it entirely into memory. This keeps memory consumption bounded regardless of file size. Where Security Lake uses gzip compression, the function decompresses automatically before processing. Convert Parquet rows to JSON: Each row in the Parquet file is read in batches using PyArrow and converted to a JSON object. Parquet columns can carry data types - such as NumPy scalars, nested arrays, and high-precision timestamps — that are not natively serializable to JSON. The function handles these type conversions before serialization, ensuring clean output that Sentinel's ingestion pipeline can accept without errors. Forward events to Azure Event Hub: Converted JSON events are sent to the respective Azure Event Hub entity in batches, with each row becoming a discrete event. The function respects Event Hub's payload size ceiling, handles throttling responses gracefully using retry logic with exponential backoff, and marks each processed file in S3 metadata to prevent duplicate ingestion on retry. Tuning the Lambda Messages at cloud scale are unforgiving. Memory, timeout, batch size, and retry behavior - each decision determines whether the messenger would keep up or fall behind. Several configuration changes are required beyond the default Lambda settings to make this function production-ready. Runtime and Dependencies - Lambda Layer The function depends on three libraries not available in Lambda's default Python runtime: pyarrow (for Parquet reading), pandas (for type handling), and azure-eventhub (for Event Hub connectivity). These are packaged as an AWS Lambda Layer and attached to the function, keeping the deployment package clean and the layer reusable across function versions. Step-by-step instructions for packaging the dependencies, creating the S3 bucket, publishing the Layer, and deploying the function are available in the GithubRepository-LambdaLayer. Secrets Management - AWS Secrets Manager The Azure Event Hub connection strings - one per log type - are sensitive credentials that must not be stored in environment variables in plaintext. The function retrieves them at cold start from AWS Secrets Manager, using a single secret ARN passed as a Lambda environment variable (SECRET_ARN). The secret is stored as a JSON object with each log type as a key. The full secret structure and configuration steps are available in the GithubRepository-SecretsManager. IAM Permissions The Lambda execution IAM role requires scoped permissions across S3, Secrets Manager, SQS, and CloudWatch Logs. Full IAM policy JSON files following the principle of least privilege are available in the GithubRepository-IAMPolicies. Deployment instructions for the IAM Policies are available in the IAM Policies README. Memory Configuration A starting configuration of 1,792 MB is recommended - this is the threshold at which Lambda may allocate a full vCPU. For environments with high log volumes or large Parquet files, increasing to 2,048 MB provides headroom for concurrent batch processing. Tune further based on observed execution durations in CloudWatch Metrics. Timeout Configuration The default Lambda timeout of 3 seconds is insufficient for Parquet processing at scale. The function must download a file from S3, process it in batches, and flush all events to Event Hub - a sequence that can take tens of seconds for larger Security Lake files. A timeout of 5 minutes (300 seconds) is recommended as a starting point, with adjustment based on observed execution durations in CloudWatch Metrics. SQS Trigger Configuration The SQS queue connected to Security Lake S3 event notifications is configured as the trigger for the Lambda function. This enables automatic invocation of the Lambda function whenever Security Lake writes a new Parquet file to S3. Validate Event Hub Ingestion At this point, events should be streaming into each Event Hub. To validate, open a specific Event Hub from the Azure Portal and navigate to the Overview page. You should see active metrics across Requests, Messages, and Throughput, confirming that the Lambda function is successfully forwarding Security Lake events. In the upcoming sections, we will follow the documentation to ingest events from Event Hub to Azure Monitor. Before we begin, please collect the required information as stated here to have resource ID's and other details ready for configuration of DCR and Data Collection Endpoint. Also create a user assigned managed identity (UAMI), since the DCRs in this setup use a UAMI that should be granted the required permissions on the Event Hub Namespace to Receive Events. To grant the Azure Event Hubs Data Receiver role to the user-assigned managed identity, follow the instructions here. Creating Tables in Log Analytics Workspace As mentioned in the Overview, this blog covers the following data sources: Amazon EKS audit and runtime events Route 53 DNS query logs AWS WAF access logs AWS Lambda execution activity Amazon S3 access events The PowerShell scripts to create these tables are provided here: GithubRepo CreateLAWTables. For assistance with executing these scripts, refer to: Readme.md. Note: In the Microsoft Documentation to Ingest logs from EventHub into Azure Monitor only 3 fields are created in the tables (TimeGenerated, RawData, Properties). In this case, the entire Json event from the Event Hub is sent to the RawData field. However the scripts we are running are creating additional fields since we will be parsing the RawData field and extracting/parsing the information from the complete event into individual fields. This makes it easier to search and analyze logs later and schema improves detection rules and KQL efficiency. Create a Data Collection Endpoint To collect data with a data collection rule, you need a data collection endpoint: Create a data collection endpoint. Note: Create the data collection endpoint in the same region as your Log Analytics workspace. From the data collection endpoint's Overview screen, select JSON View. Copy the Resource ID for the data collection rule. You use this information in the next step while creating Data Collection Rules. Create Data Collection Rules Since we have 5 sources in scope for this example, we need to create 5 DCRs using the collected information as stated here, the user assigned managed identity resource ID, and the Data Collection Endpoint resource ID we created in the previous step. DCR Deployment via ARM Templates The Data Collection Rules ARM templates can be found in the GithubRepo-DataCollectionRules. The instructions to create via ARM templates can be found in the readme.md (Microsoft article reference with manual steps here). DCR Mapping (AWS Sources → DCR Templates) ARM Templates in GitHub Repo to AWS Source mapping Data Source DCR Template Name Amazon EKS Logs DCR-awseks.json AWS S3 Access Logs DCR-awsS3access.json AWS WAF Logs DCR-awswaf.json AWS Lambda Execution DCR-lambdaexecution.json Amazon Route 53 Logs DCR-Route53.json Final Step: Associating the Event Hub with the Data Collection Rule At this stage, the core building blocks of the ingestion pipeline are already in place. We have successfully configured streaming to Event Hubs, created dedicated Event Hubs for each Amazon Security Lake source, provisioned destination tables in the Microsoft Sentinel workspace, and defined Data Collection Rules (DCRs) - leveraging the Azure Monitor pipeline and a user-assigned managed identity to securely read incoming events. The final step is to stitch this entire architecture together by establishing associations between the Event Hubs and their corresponding Data Collection Rules. This association links Event Hub to Sentinel, enabling Azure Monitor to actively pull data from the Event Hubs and route it into the defined destination tables. Without this linkage, the pipeline remains incomplete - data may continue to flow into Event Hubs, but it will not be picked up or ingested into Sentinel. Each Event Hub must be explicitly mapped to its respective Data Collection Rule, ensuring: The correct stream is processed by the intended transformation logic Events are routed to the appropriate custom tables The ingestion pipeline operates in a deterministic and scalable manner helping ensure data is routed to the intended destination in a consistent and predictable manner. Once these associations are configured, the end‑to‑end pipeline is intended to operate as designed, subject to configuration accuracy and ongoing operational monitoring— enabling automated ingestion workflows of Amazon Security Lake data into Microsoft Sentinel with minimal manual intervention once configured. Steps to be followed: Associate the data collection rule with the Event Hub To complete the setup, we now associate each Event Hub with its corresponding Data Collection Rule (DCR). This creates the link that allows Azure Monitor to read data from the Event Hub and send it to Microsoft Sentinel. Important: You must create one association per Data Collection Rule. Example: If an Event Hub is receiving AWS Route 53 logs, it must be associated with the DCR created for AWS Route 53. Copy the template from the above link and deploy it to create each Data Collection Rule association. We have to create 1 association per Data Collection Rule. What You Need Event Hub Resource ID Follow these steps to get the Event Hub Resource ID: Open the Event Hub Namespace in the Azure Portal Go to Entities → Event Hubs Select the Event Hub that is receiving the logs (e.g., Route 53 logs) In the Overview page, click on JSON View Copy the Resource ID Resource Group, Region, and Association Name The Resource Group must be the same as the one where the Event Hub is deployed Provide the Azure region where the resources are deployed Define a unique name for the association Data Collection Rule (DCR) Resource ID Open the corresponding Data Collection Rule Go to the Overview page Click on JSON View Copy the Resource ID Key Note: Make sure that Each Event Hub is mapped to the correct DCR The data source and DCR template are aligned (e.g., Route 53 → Route 53 DCR) Validate End-to-End Ingestion Once this configuration is complete, we can validate the logs in the destination table, fields, and parsing accuracy. If you are building on a lake-first security architecture and running into ingestion challenges with Sentinel, feel free to share your experience in the comments or raise an issue in the GitHub repository.