investigation
315 TopicsAccelerate Agent Development: Hacks for Building with Microsoft Sentinel data lake
As a Senior Product Manager | Developer Architect on the App Assure team working to bring Microsoft Sentinel and Security Copilot solutions to market, I interact with many ISVs building agents on Microsoft Sentinel data lake for the first time. I’ve written this article to walk you through one possible approach for agent development – the process I use when building sample agents internally at Microsoft. If you have questions about this, or other methods for building your agent, App Assure offers guidance through our Sentinel Advisory Service. Throughout this post, I include screenshots and examples from Gigamon’s Security Posture Insight Agent. This article assumes you have: An existing SaaS or security product with accessible telemetry. A small ISV team (2–3 engineers + 1 PM). Focus on a single high value scenario for the first agent. The Composite Application Model (What You Are Building) When I begin designing an agent, I think end-to-end, from data ingestion requirements through agentic logic, following the Composite application model. The Composite Application Model consists of five layers: Data Sources – Your product’s raw security, audit, or operational data. Ingestion – Getting that data into Microsoft Sentinel. Sentinel data lake & Microsoft Graph – Normalization, storage, and correlation. Agent – Reasoning logic that queries data and produces outcomes. End User – Security Copilot or SaaS experiences that invoke the agent. This separation allows for evolving data ingestion and agent logic simultaneously. It also helps avoid downstream surprises that require going back and rearchitecting the entire solution. Optional Prerequisite You are enrolled in the ISV Success Program, so you can earn Azure Credits to provision Security Compute Units (SCUs) for Security Copilot Agents. Phase 1: Data Ingestion Design & Implementation Choose Your Ingestion Strategy The first choice I face when designing an agent is how the data is going to flow into my Sentinel workspace. Below I document two primary methods for ingestion. Option A: Codeless Connector Framework (CCF) This is the best option for ISVs with REST APIs. To build a CCF solution, reference our documentation for getting started. Option B: CCF Push (Public Preview) In this instance, an ISV pushes events directly to Sentinel via a CCF Push connector. Our MS Learn documentation is a great place to get started using this method. Additional Note: In the event you find that CCF does not support your needs, reach out to App Assure so we can capture your requirements for future consideration. Azure Functions remains an option if you’ve documented your CCF feature needs. Phase 2: Onboard to Microsoft Sentinel data lake Once my data is flowing into Sentinel, I onboard a single Sentinel workspace to data lake. This is a one-time action and cannot be repeated for additional workspaces. Onboarding Steps Go to the Defender portal. Follow the Sentinel Data lake onboarding instructions. Validate that tables are visible in the lake. See Running KQL Queries in data lake for additional information. Phase 3: Build and Test the Agent in Microsoft Foundry Once my data is successfully ingested into data lake, I begin the agent development process. There are multiple ways to build agents depending on your needs and tooling preferences. For this example, I chose Microsoft Foundry because it fit my needs for real-time logging, cost efficiency, and greater control. 1. Create a Microsoft Foundry Instance Foundry is used as a tool for your development environment. Reference our QuickStart guide for setting up your Foundry instance. Required Permissions: Security Reader (Entra or Subscription) Azure AI Developer at the resource group After setup, click Create Agent. 2. Design the Agent A strong first agent: Solves one narrow security problem. Has deterministic outputs. Uses explicit instructions, not vague prompts. Example agent responsibilities: To query Sentinel data lake (Sentinel data exploration tool). To summarize recent incidents. To correlate ISVs specific signals with Sentinel alerts and other ISV tables (Sentinel data exploration tool). 3. Implement Agent Instructions Well-designed agent instructions should include: Role definition ("You are a security investigation agent…"). Data sources it can access. Step by step reasoning rules. Output format expectations. Sample Instructions can be found here: Agent Instructions 4. Configure the Microsoft Model Context Protocol (MCP) tooling for your agent For your agent to query, summarize and correlate all the data your connector has sent to data lake, take the following steps: Select Tools, and under Catalog, type Sentinel, and then select Microsoft Sentinel Data Exploration. For more information about the data exploration tool collection in MCP server, see our documentation. I always test repeatedly with real data until outputs are consistent. For more information on testing and validating the agent, please reference our documentation. Phase 4: Migrate the Agent to Security Copilot Once the agent works in Foundry, I migrate it to Security Copilot. To do this: Copy the full instruction set from Foundry Provision a SCU for your Security Copilot workspace. For instructions, please reference this documentation. Make note of this process as you will be charged per hour per SCU Once you are done testing you will need to deprovision the capacity to prevent additional charges Open Security Copilot and use Create From Scratch Agent Builder as outlined here. Add Sentinel data exploration MCP tools (these are the same instructions from the Foundry agent in the previous step). For more information on linking the Sentinel MCP tools, please refer to this article. Paste and adapt instructions. At this stage, I always validate the following: Agent Permissions – I have confirmed the agent has the necessary permissions to interact with the MCP tool and read data from your data lake instance. Agent Performance – I have confirmed a successful interaction with measured latency and benchmark results. This step intentionally avoids reimplementation. I am reusing proven logic. Phase 5: Execute, Validate, and Publish After setting up my agent, I navigate to the Agents tab to manually trigger the agent. For more information on testing an agent you can refer to this article. Now that the agent has been executed successfully, I download the agent Manifest file from the environment so that it can be packaged. Click View code on the Agent under the Build tab as outlined in this documentation. Publishing to the Microsoft Security Store If I were publishing my agent to the Microsoft Security Store, these are the steps I would follow: Finalize ingestion reliability. Document required permissions. Define supported scenarios clearly. Package agent instructions and guidance (by following these instructions). Summary Based on my experience developing Security Copilot agents on Microsoft Sentinel data lake, this playbook provides a practical, repeatable framework for ISVs to accelerate their agent development and delivery while maintaining high standards of quality. This foundation enables rapid iteration—future agents can often be built in days, not weeks, by reusing the same ingestion and data lake setup. When starting on your own agent development journey, keep the following in mind: To limit initial scope. To reuse Microsoft managed infrastructure. To separate ingestion from intelligence. What Success Looks Like At the end of this development process, you will have the following: A Microsoft Sentinel data connector live in Content Hub (or in process) that provides a data ingestion path. Data visible in data lake. A tested agent running in Security Copilot. Clear documentation for customers. A key success factor I look for is clarity over completeness. A focused agent is far more likely to be adopted. Need help? If you have any issues as you work to develop your agent, please reach out to the App Assure team for support via our Sentinel Advisory Service . Or if you have any other tips, please comment below, I’d love to hear your feedback.229Views0likes0CommentsRSAC 2026: New Microsoft Sentinel Connectors Announcement
Microsoft Sentinel helps organizations detect, investigate, and respond to security threats across increasingly complex environments. With the rollout of the Microsoft Sentinel data lake in the fall, and the App Assure-backed Sentinel promise that supports it, customers now have access to long-term, cost-effective storage for security telemetry, creating a solid foundation for emerging Agentic AI experiences. Since our last announcement at Ignite 2025, the Microsoft Sentinel connector ecosystem has expanded rapidly, reflecting continued investment from software development partners building for our shared customers. These connectors bring diverse security signals together, enabling correlation at scale and delivering richer investigation context across the Sentinel platform. Below is a snapshot of Microsoft Sentinel connectors newly available or recently enhanced since our last announcement, highlighting the breadth of partner solutions contributing data into, and extending the value of, the Microsoft Sentinel ecosystem. New and notable integrations Acronis Cyber Protect Cloud Acronis Cyber Protect Cloud integrates with Microsoft Sentinel to bring data protection and security telemetry into a centralized SOC view. The connector streams alerts, events, and activity data - spanning backup, endpoint protection, and workload security - into Microsoft Sentinel for correlation with other signals. This integration helps security teams investigate ransomware and data-centric threats more effectively, leverage built-in hunting queries and detection rules, and improve visibility across managed environments without adding operational complexity. Anvilogic Anvilogic integrates with Microsoft Sentinel to help security teams operationalize detection engineering at scale. The connector streams Anvilogic alerts into Microsoft Sentinel, giving SOC analysts centralized visibility into high-fidelity detections and faster context for investigation and triage. By unifying detection workflows, reducing alert noise, and improving prioritization, this integration supports more efficient threat detection and response while helping teams extend coverage across evolving attack techniques. CyberArk Audit CyberArk Audit integrates with Microsoft Sentinel to centralize visibility into privileged identity and access activity. By streaming detailed audit logs - covering system events, user actions, and administrative activity - into Microsoft Sentinel, security teams can correlate identity-driven risks with broader security telemetry. This integration supports faster investigations, improved monitoring of privileged access, and more effective incident response through automated workflows and enriched context for SOC analysts. Cyera Cyera integrates with Microsoft Sentinel to extend AI-native data security posture management into security operations. The connector brings Cyera’s data context and actionable intelligence across multi-cloud, on-premises, and SaaS environments into Microsoft Sentinel, helping teams understand where sensitive data resides and how it is accessed, exposed, and used. Built on Sentinel’s modern framework, the integration feeds context-rich data risk signals into the Sentinel data lake, enabling more informed threat hunting, automation, and decision-making around data, user, and AI-related risk. TacitRed CrowdStrike IOC Automation Data443 TacitRed CS IOC Automation integrates with Microsoft Sentinel to streamline the operationalization of compromised credential intelligence. The solution uses Sentinel playbooks to automatically push TacitRed indicators of compromise into CrowdStrike via Sentinel playbooks, helping security teams turn identity-based threat intelligence into action. By automating IOC handling and reducing manual effort, this integration supports faster response to credential exposure and strengthens protection against account-driven attacks across the environment. TacitRed SentinelOne IOC Automation Data443 TacitRed SentinelOne IOC Automation integrates with Microsoft Sentinel to help operationalize identity-focused threat intelligence at the endpoint layer. The solution uses Sentinel playbooks to automatically consume TacitRed indicators and push curated indicators into SentinelOne via Sentinel playbooks and API-based enforcement, enabling faster enforcement of high-risk IOCs without manual handling. By automating the flow of compromised credential intelligence from Sentinel into EDR, this integration supports quicker response to identity-driven attacks and improves coordination between threat intelligence and endpoint protection workflows. TacitRed Threat Intelligence Data443 TacitRed Threat Intelligence integrates with Microsoft Sentinel to provide enhanced visibility into identity-based risks, including compromised credentials and high-risk user exposure. The solution ingests curated TacitRed intelligence directly into Sentinel, enriching incidents with context that helps SOC teams identify credential-driven threats earlier in the attack lifecycle. With built-in analytics, workbooks, and hunting queries, this integration supports proactive identity threat detection, faster triage, and more informed response across the SOC. Cyren Threat Intelligence Cyren Threat Intelligence integrates with Microsoft Sentinel to enhance detection of network-based threats using curated IP reputation and malware URL intelligence. The connector ingests Cyren threat feeds into Sentinel using the Codeless Connector Framework (CCF), transforming raw indicators into actionable insights, dashboards, and enriched investigations. By adding context to suspicious traffic and phishing infrastructure, this integration helps SOC teams improve alert accuracy, accelerate triage, and make more confident response decisions across their environments. TacitRed Defender Threat Intelligence Data443 TacitRed Defender Threat Intelligence integrates with Microsoft Sentinel to surface early indicators of credential exposure and identity-driven risk. The solution automatically ingests compromised credential intelligence from TacitRed into Sentinel and can support synchronization of validated indicators with Microsoft Defender Threat Intelligence through Sentinel workflows, helping SOC teams detect account compromise before abuse occurs. By enriching Sentinel incidents with actionable identity context, this integration supports faster triage, proactive remediation, and stronger protection against credential-based attacks. Datawiza Access Proxy (DAP) Datawiza Access Proxy integrates with Microsoft Sentinel to provide centralized visibility into application access and authentication activity. By streaming access and MFA logs from Datawiza into Sentinel, security teams can correlate identity and session-level events with broader security telemetry. This integration supports detection of anomalous access patterns, faster investigation through session traceability, and more effective response using Sentinel automation, helping organizations strengthen Zero Trust controls and meet auditing and compliance requirements. Endace Endace integrates with Microsoft Sentinel to provide deep network visibility by providing always-on, packet-level evidence. The connector enables one-click pivoting from Sentinel alerts directly to recorded packet data captured by EndaceProbes. This helps SOC and NetOps teams reconstruct events and validate threats with confidence. By combining Sentinel’s AI-driven analytics with Endace’s always-on, full-packet capture across on-premises, hybrid, and cloud environments, this integration supports faster investigations, improved forensic accuracy, and more decisive incident response. Feedly Feedly integrates with Microsoft Sentinel to ingest curated threat intelligence directly into security operations workflows. The connector automatically imports Indicators of Compromise (IoCs) from Feedly Team Boards and folders into Sentinel, enriching detections and investigations with context from the original intelligence articles. By bringing analyst‑curated threat intelligence into Sentinel in a structured, automated way, this integration helps security teams stay current on emerging threats and reduce the manual effort required to operationalize external intelligence. Gigamon Gigamon integrates with Microsoft Sentinel through a new connector that provides access to Gigamon Application Metadata Intelligence (AMI), delivering high-fidelity network-derived telemetry with rich application metadata from inspected traffic directly into Sentinel. This added context helps security teams detect suspicious activity, encrypted threats, and lateral movement faster and with greater precision. By enriching analytics without requiring full packet ingestion, organizations can reduce noise, manage SIEM costs, and extend visibility across hybrid cloud infrastructure. Halcyon Halcyon integrates with Microsoft Sentinel to provide purpose-built ransomware detection and automated containment across the Microsoft security ecosystem. The connector surfaces Halcyon ransomware alerts directly within Sentinel, enabling SOC teams to correlate ransomware behavior with Microsoft Defender and broader Microsoft telemetry. By supporting Sentinel analytics and automation workflows, this integration helps organizations detect ransomware earlier, investigate faster using native Sentinel tools, and isolate affected endpoints to prevent lateral spread and reinfection. Illumio The Illumio platform identifies and contains threats across hybrid multi-cloud environments. By integrating AI-driven insights with Microsoft Sentinel and Microsoft Graph, Illumio Insights enables SOC analysts to visualize attack paths, prioritize high-risk activity, and investigate threats with greater precision. Illumio Segmentation secures critical assets, workloads, and devices and then publishes segmentation policy back into Microsoft Sentinel to ensure compliance monitoring. Joe Sandbox Joe Sandbox integrates with Microsoft Sentinel to enrich incidents with dynamic malware and URL analysis. The connector ingests Joe Sandbox threat intelligence and automatically detonates suspicious files and URLs associated with Sentinel incidents, returning behavioral and contextual analysis results directly into investigation workflows. By adding sandbox-driven insights to indicators, alerts, and incident comments, this integration helps SOC teams validate threats faster, reduce false positives, and improve response decisions using deeper visibility into malicious behavior. Keeper Security The Keeper Security integration with Microsoft Sentinel brings advanced password and secrets management telemetry into your SIEM environment. By streaming audit logs and privileged access events from Keeper into Sentinel, security teams gain centralized visibility into credential usage and potential misuse. The connector supports custom queries and automated playbooks, helping organizations accelerate investigations, enforce Zero Trust principles, and strengthen identity security across hybrid environments. Lookout Mobile Threat Defense (MTD) Lookout Mobile Threat Defense integrates with Microsoft Sentinel to extend SOC visibility to mobile endpoints across Android, iOS, and Chrome OS. The connector streams device, threat, and audit telemetry from Lookout into Sentinel, enabling security teams to correlate mobile risk signals such as phishing, malicious apps, and device compromise, with broader enterprise security data. By incorporating mobile threat intelligence into Sentinel analytics, dashboards, and alerts, this integration helps organizations detect mobile driven attacks earlier and strengthen protection for an increasingly mobile workforce. Miro Miro integrates with Microsoft Sentinel to provide centralized visibility into collaboration activity across Miro workspaces. The connector ingests organization-wide audit logs and content activity logs into Sentinel, enabling security teams to monitor authentication events, administrative actions, and content changes alongside other enterprise signals. By bringing Miro collaboration telemetry into Sentinel analytics and dashboards, this integration helps organizations detect suspicious access patterns, support compliance and eDiscovery needs, and maintain stronger oversight of collaborative environments without disrupting productivity. Obsidian Activity Threat The Obsidian Threat and Activity Feed for Microsoft Sentinel delivers deep visibility into SaaS and AI applications, helping security teams detect account compromise and insider threats. By streaming user behavior and configuration data into Sentinel, organizations can correlate application risks with enterprise telemetry for faster investigations. Prebuilt analytics and dashboards enable proactive monitoring, while automated playbooks simplify response workflows, strengthening security posture across critical cloud apps. OneTrust for Purview DSPM OneTrust integrates with Microsoft Sentinel to bring privacy, compliance, and data governance signals into security operations workflows. The connector enriches Sentinel with privacy relevant events and risk indicators from OneTrust, helping organizations detect sensitive data exposure, oversharing, and compliance risks across cloud and non-Microsoft data sources. By unifying privacy intelligence with Sentinel analytics and automation, this integration enables security and privacy teams to respond more quickly to data risk events and support responsible data use and AI-ready governance. Pathlock Pathlock integrates with Microsoft Sentinel to bring SAP-specific threat detection and response signals into centralized security operations. The connector forwards security-relevant SAP events into Sentinel, enabling SOC teams to correlate SAP activity with broader enterprise telemetry and investigate threats using familiar SIEM workflows. By enriching Sentinel with SAP security context and focused detection logic, this integration helps organizations improve visibility into SAP landscapes, reduce noise, and accelerate detection and response for risks affecting critical business systems. Quokka Q-scout Quokka Q-scout integrates with Microsoft Sentinel to centralize mobile application risk intelligence across Microsoft Intune-managed devices. The connector automatically ingests app inventories from Intune, analyzes them using Quokka’s mobile app vetting engines, and streams security, privacy, and compliance risk findings into Sentinel. By surfacing app-level risks through Sentinel analytics and alerts, this integration helps security teams identify malicious or high-risk mobile apps, prioritize remediation, and strengthen mobile security posture without deploying agents or disrupting users. Synqly Synqly integrates with Microsoft Sentinel to simplify and scale security integrations through a unified API approach. The connector enables organizations and security vendors to establish a bi‑directional connection with Sentinel without relying on brittle, point‑to‑point integrations. By abstracting common integration challenges such as authentication handling, retries, and schema changes, Synqly helps teams orchestrate security data flows into and out of Sentinel more reliably, supporting faster onboarding of new data sources and more maintainable integrations at scale. Versasec vSEC:CMS Versasec vSEC:CMS integrates with Microsoft Sentinel to provide centralized visibility into credential lifecycle and system health events. The connector securely streams vSEC:CMS and vSEC:CLOUD alerts and status data into Sentinel using the Codeless Connector Framework (CCF), transforming credential management activity into correlation-ready security signals. By bringing smart card, token, and passkey management telemetry into Sentinel, this integration helps security teams monitor authentication infrastructure health, investigate credential-related incidents, and unify identity security operations within their SIEM workflows. VirtualMetric DataStream VirtualMetric DataStream integrates with Microsoft Sentinel to optimize how security telemetry is collected, normalized, and routed across the Microsoft security ecosystem. Acting as a high-performance telemetry pipeline, DataStream intelligently filters and enriches logs, sending high-value security data to Sentinel while routing less-critical data to Sentinel data lake or Azure Blob Storage for cost-effective retention. By reducing noise upstream and standardizing logs to Sentinel ready schemas, this integration helps organizations control ingestion costs, improve detection quality, and streamline threat hunting and compliance workflows. VMRay VMRay integrates with Microsoft Sentinel to enrich SIEM and SOAR workflows with automated sandbox analysis and high-fidelity, behavior-based threat intelligence. The connector enables suspicious files and phishing URLs to be submitted directly from Sentinel to VMRay for dynamic analysis, while validated, high-confidence indicators of compromise (IOCs) are streamed back into Sentinel’s Threat Intelligence repository for correlation and detection. By adding detailed attack-chain visibility and enriched incident context, this integration helps SOC teams reduce investigation time, improve detection accuracy, and strengthen automated response workflows across Sentinel environments. Zero Networks Segment Audit Zero Networks Segment integrates with Microsoft Sentinel to provide visibility into micro-segmentation and access-control activity across the network. The connector can collect audit logs or activities from Zero Networks Segment, enabling security teams to monitor policy changes, administrative actions, and access events related to MFA-based network segmentation. By bringing segmentation audit telemetry into Sentinel, this integration supports compliance monitoring, investigation of suspicious changes, and faster detection of attempts to bypass lateral-movement controls within enterprise environments. Zscaler Internet Access (ZIA) Zscaler Internet Access integrates with Microsoft Sentinel to centralize cloud security telemetry from web and firewall traffic. The connector enables ZIA logs to be ingested into Sentinel, allowing security teams to correlate Zscaler Internet Access signals with other enterprise data for improved threat detection, investigation, and response. By bringing ZIA web, firewall, and security events into Sentinel analytics and hunting workflows, this integration helps organizations gain broader visibility into internet-based threats and strengthen Zero Trust security operations. In addition to these solutions from our third-party partners, we are also excited to announce the following connector published by the Microsoft Sentinel team: GitHub Enterprise Audit Logs Microsoft’s Sentinel Promise For Customers Every connector in the Microsoft Sentinel ecosystem is built to work out of the box. In the unlikely event a customer encounters any issue with a connector, the App Assure team stands ready to assist. For Software Developers Software partners in need of assistance in creating or updating a Sentinel solution can also leverage Microsoft’s Sentinel Promise to support our shared customers. For developers seeking to build agentic experiences utilizing Sentinel data lake, we are excited to announce the launch of our Sentinel Advisory Service to guide developers across their Sentinel journey. Customers and developers alike can reach out to us via our intake form. Learn More Microsoft Sentinel data lake Microsoft Sentinel data lake: Unify signals, cut costs, and power agentic AI Introducing Microsoft Sentinel data lake What is Microsoft Sentinel data lake Unlocking Developer Innovation with Microsoft Sentinel data lake Microsoft Sentinel Codeless Connector Framework (CCF) Create a codeless connector for Microsoft Sentinel Public Preview Announcement: Microsoft Sentinel CCF Push What’s New in Microsoft Sentinel Monthly Blog Microsoft App Assure App Assure home page App Assure services App Assure blog App Assure Request Assistance Form App Assure Sentinel Advisory Services announcement App Assure’s promise: Migrate to Sentinel with confidence App Assure’s Sentinel promise now extends to Microsoft Sentinel data lake Ignite 2025 new Microsoft Sentinel connectors announcement Microsoft Security Microsoft’s Secure Future Initiative Microsoft Unified SecOps1.4KViews0likes0CommentsUnderstand New Sentinel Pricing Model with Sentinel Data Lake Tier
Introduction on Sentinel and its New Pricing Model Microsoft Sentinel is a cloud-native Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) platform that collects, analyzes, and correlates security data from across your environment to detect threats and automate response. Traditionally, Sentinel stored all ingested data in the Analytics tier (Log Analytics workspace), which is powerful but expensive for high-volume logs. To reduce cost and enable customers to retain all security data without compromise, Microsoft introduced a new dual-tier pricing model consisting of the Analytics tier and the Data Lake tier. The Analytics tier continues to support fast, real-time querying and analytics for core security scenarios, while the new Data Lake tier provides very low-cost storage for long-term retention and high-volume datasets. Customers can now choose where each data type lands—analytics for high-value detections and investigations, and data lake for large or archival types—allowing organizations to significantly lower cost while still retaining all their security data for analytics, compliance, and hunting. Please flow diagram depicts new sentinel pricing model: Now let's understand this new pricing model with below scenarios: Scenario 1A (PAY GO) Scenario 1B (Usage Commitment) Scenario 2 (Data Lake Tier Only) Scenario 1A (PAY GO) Requirement Suppose you need to ingest 10 GB of data per day, and you must retain that data for 2 years. However, you will only frequently use, query, and analyze the data for the first 6 months. Solution To optimize cost, you can ingest the data into the Analytics tier and retain it there for the first 6 months, where active querying and investigation happen. After that period, the remaining 18 months of retention can be shifted to the Data Lake tier, which provides low-cost storage for compliance and auditing needs. But you will be charged separately for data lake tier querying and analytics which depicted as Compute (D) in pricing flow diagram. Pricing Flow / Notes The first 10 GB/day ingested into the Analytics tier is free for 31 days under the Analytics logs plan. All data ingested into the Analytics tier is automatically mirrored to the Data Lake tier at no additional ingestion or retention cost. For the first 6 months, you pay only for Analytics tier ingestion and retention, excluding any free capacity. For the next 18 months, you pay only for Data Lake tier retention, which is significantly cheaper. Azure Pricing Calculator Equivalent Assuming no data is queried or analyzed during the 18-month Data Lake tier retention period: Although the Analytics tier retention is set to 6 months, the first 3 months of retention fall under the free retention limit, so retention charges apply only for the remaining 3 months of the analytics retention window. Azure pricing calculator will adjust accordingly. Scenario 1B (Usage Commitment) Now, suppose you are ingesting 100 GB per day. If you follow the same pay-as-you-go pricing model described above, your estimated cost would be approximately $15,204 per month. However, you can reduce this cost by choosing a Commitment Tier, where Analytics tier ingestion is billed at a discounted rate. Note that the discount applies only to Analytics tier ingestion—it does not apply to Analytics tier retention costs or to any Data Lake tier–related charges. Please refer to the pricing flow and the equivalent pricing calculator results shown below. Monthly cost savings: $15,204 – $11,184 = $4,020 per month Now the question is: What happens if your usage reaches 150 GB per day? Will the additional 50 GB be billed at the Pay-As-You-Go rate? No. The entire 150 GB/day will still be billed at the discounted rate associated with the 100 GB/day commitment tier bucket. Azure Pricing Calculator Equivalent (100 GB/ Day) Azure Pricing Calculator Equivalent (150 GB/ Day) Scenario 2 (Data Lake Tier Only) Requirement Suppose you need to store certain audit or compliance logs amounting to 10 GB per day. These logs are not used for querying, analytics, or investigations on a regular basis, but must be retained for 2 years as per your organization’s compliance or forensic policies. Solution Since these logs are not actively analyzed, you should avoid ingesting them into the Analytics tier, which is more expensive and optimized for active querying. Instead, send them directly to the Data Lake tier, where they can be retained cost-effectively for future audit, compliance, or forensic needs. Pricing Flow Because the data is ingested directly into the Data Lake tier, you pay both ingestion and retention costs there for the entire 2-year period. If, at any point in the future, you need to perform advanced analytics, querying, or search, you will incur additional compute charges, based on actual usage. Even with occasional compute charges, the cost remains significantly lower than storing the same data in the Analytics tier. Realized Savings Scenario Cost per Month Scenario 1: 10 GB/day in Analytics tier $1,520.40 Scenario 2: 10 GB/day directly into Data Lake tier $202.20 (without compute) $257.20 (with sample compute price) Savings with no compute activity: $1,520.40 – $202.20 = $1,318.20 per month Savings with some compute activity (sample value): $1,520.40 – $257.20 = $1,263.20 per month Azure calculator equivalent without compute Azure calculator equivalent with Sample Compute Conclusion The combination of the Analytics tier and the Data Lake tier in Microsoft Sentinel enables organizations to optimize cost based on how their security data is used. High-value logs that require frequent querying, real-time analytics, and investigation can be stored in the Analytics tier, which provides powerful search performance and built-in detection capabilities. At the same time, large-volume or infrequently accessed logs—such as audit, compliance, or long-term retention data—can be directed to the Data Lake tier, which offers dramatically lower storage and ingestion costs. Because all Analytics tier data is automatically mirrored to the Data Lake tier at no extra cost, customers can use the Analytics tier only for the period they actively query data, and rely on the Data Lake tier for the remaining retention. This tiered model allows different scenarios—active investigation, archival storage, compliance retention, or large-scale telemetry ingestion—to be handled at the most cost-effective layer, ultimately delivering substantial savings without sacrificing visibility, retention, or future analytical capabilities.Solved2KViews2likes5CommentsRSAC 2026: What the Sentinel Playbook Generator actually means for SOC automation
RSAC 2026 brought a wave of Sentinel announcements, but the one I keep coming back to is the playbook generator. Not because it's the flashiest, but because it touches something that's been a real operational pain point for years: the gap between what SOC teams need to automate and what they can realistically build and maintain. I want to unpack what this actually changes from an operational perspective, because I think the implications go further than "you can now vibe-code a playbook." The problem it solves If you've built and maintained Logic Apps playbooks in Sentinel at any scale, you know the friction. You need a connector for every integration. If there isn't one, you're writing custom HTTP actions with authentication handling, pagination, error handling - all inside a visual designer that wasn't built for complex branching logic. Debugging is painful. Version control is an afterthought. And when something breaks at 2am, the person on call needs to understand both the Logic Apps runtime AND the security workflow to fix it. The result in most environments I've seen: teams build a handful of playbooks for the obvious use cases (isolate host, disable account, post to Teams) and then stop. The long tail of automation - the enrichment workflows, the cross-tool correlation, the conditional response chains - stays manual because building it is too expensive relative to the time saved. What's actually different now The playbook generator produces Python. Not Logic Apps JSON, not ARM templates - actual Python code with documentation and a visual flowchart. You describe the workflow in natural language, the system proposes a plan, asks clarifying questions, and then generates the code once you approve. The Integration Profile concept is where this gets interesting. Instead of relying on predefined connectors, you define a base URL, auth method, and credentials for any service - and the generator creates dynamic API calls against it. This means you can automate against ServiceNow, Jira, Slack, your internal CMDB, or any REST API without waiting for Microsoft or a partner to ship a connector. The embedded VS Code experience with plan mode and act mode is a deliberate design choice. Plan mode lets you iterate on the workflow before any code is generated. Act mode produces the implementation. You can then validate against real alerts and refine through conversation or direct code edits. This is a meaningful improvement over the "deploy and pray" cycle most of us have with Logic Apps. Where I see the real impact For environments running Sentinel at scale, the playbook generator could unlock the automation long tail I mentioned above. The workflows that were never worth the Logic Apps development effort might now be worth a 15-minute conversation with the generator. Think: enrichment chains that pull context from three different tools before deciding on a response path, or conditional escalation workflows that factor in asset criticality, time of day, and analyst availability. There's also an interesting angle for teams that operate across Microsoft and non-Microsoft tooling. If your SOC uses Sentinel for SIEM but has Palo Alto, CrowdStrike, or other vendors in the stack, the Integration Profile approach means you can build cross-vendor response playbooks without middleware. The questions I'd genuinely like to hear about A few things that aren't clear from the documentation and that I think matter for production use: Security Copilot dependency: The prerequisites require a Security Copilot workspace with EU or US capacity. Someone in the blog comments already flagged this as a potential blocker for organizations that have Sentinel but not Security Copilot. Is this a hard requirement going forward, or will there be a path for Sentinel-only customers? Code lifecycle management: The generated Python runs... where exactly? What's the execution runtime? How do you version control, test, and promote these playbooks across dev/staging/prod? Logic Apps had ARM templates and CI/CD patterns. What's the equivalent here? Integration Profile security: You're storing credentials for potentially every tool in your security stack inside these profiles. What's the credential storage model? Is this backed by Key Vault? How do you rotate credentials without breaking running playbooks? Debugging in production: When a generated playbook fails at 2am, what does the troubleshooting experience look like? Do you get structured logs, execution traces, retry telemetry? Or are you reading Python stack traces? Coexistence with Logic Apps: Most environments won't rip and replace overnight. What's the intended coexistence model between generated Python playbooks and existing Logic Apps automation rules? I'm genuinely optimistic about this direction. Moving from a low-code visual designer to an AI-assisted coding model with transparent, editable output feels like the right architectural bet for where SOC automation needs to go. But the operational details around lifecycle, security, and debugging will determine whether this becomes a production staple or stays a demo-only feature. Would be interested to hear from anyone who's been in the preview - what's the reality like compared to the pitch?46Views0likes0CommentsAgentic Use Cases for Developers on the Microsoft Sentinel Platform
Interested in building an agent with Sentinel platform solutions but not sure where to start? This blog will help you understand some common use cases for agent development that we’ve seen across our partner ecosystem. SOC teams don’t need more alerts - they need fast, repeatable investigation and response workflows. Security Copilot agents can help orchestrate the steps analysts perform by correlating across the Sentinel data lake, executing targeted KQL queries, fetching related entities, enriching with context, and producing an evidence-backed decision without forcing analysts to switch tools. Microsoft Sentinel platform is a strong foundation for agentic experiences because it exposes a normalized security data layer, an investigation surface based on incidents and entities, and extensive automation capabilities. An agent can use these primitives to correlate identity, endpoint, cloud, and network telemetry; traverse entity relationships; and recommend remediation actions. In this blog, I will break down common agentic use cases that developers can implement on Sentinel platform, framed in buildable and repeatable patterns: Identify the investigation scenario Understand the required Sentinel data connectors and KQL queries Build enrichment and correlation logic Summarize findings with supporting evidence and recommended remediation steps Use Case 1: Identity & Access Intelligence Investigation Scenario: Is this risky sign-in part of an attack path? Signals Correlated: Identity access telemetry: Source user, IPs, target resources, MFA logs Authentication outcomes and diversity: Success vs. failure, Geographic spread Identity risk posture: User risk level/state Post-auth endpoint execution: Suspicious LOLBins Correlation Logic: An analyst receives a risky sign-in signal for a user and needs to determine whether the activity reflects expected behavior - such as travel, remote access, or MFA friction - or if it signals the early stage of an identity compromise that could escalate into privileged access and downstream workload impact. Practical Example: Silverfort Identity Threat Triage Agent, which is built on a similar framework, takes the user’s UPN as input and builds a bounded, last-24-hour investigation across authentication activity, MFA logs, user risk posture, and post-authentication endpoint behavior. Outcome: By correlating identity risk signals, MFA logs, sign-in success and failure patterns, and suspicious execution activity following authentication, the agent connects the initial risky sign-in to endpoint behavior, enabling the analyst to quickly assess compromise likelihood, identify escalation indicators, and determine appropriate remediation actions. “Our collaboration with Microsoft Sentinel and Security Copilot underscores the central role identity plays across every stage of attack path triage. By integrating Silverfort’s identity risk signals with Microsoft Entra ID and Defender for Endpoint, and sharing rich telemetry across platforms, we enable Security Copilot Agent to distinguish isolated anomalies from true identity-driven intrusions - while dramatically reducing the manual effort traditionally required for incident response and threat hunting. AI-driven agents accelerate analysis, enrich investigative context, reduce dwell time, and speed detection. Instead of relying on complex queries or deep familiarity with underlying data structures, security teams can now perform seamless, identity-centric reasoning within a single interaction.” - Frank Gasparovic, Director of Solution Architecture, Technology Alliances, Silverfort Use Case 2: Cyber Resilience, Backup & Recovery Investigation Scenario: Are the threats detected on a backup indicative of production impact and recovery risk? Signals Correlated: Backup threat telemetry: Backup threat scan alerts, risk analysis events, affected host/workload, detection timestamps Cross-vendor security alerts: Endpoint, network, and cloud security alerts for the same host/workload in the same time window Correlation Logic: The agent correlates threat signals originating from the backup environment with security telemetry associated with same host/workload to validate whether there is corroborating evidence in the production environment and whether activity aligns in time. Practical Example: Commvault Security Investigation Agent, which is built on a similar framework, takes a hostname as input and builds an investigation across Commvault Threat Scan / Risk Analysis events and third-party security telemetry. By correlating backup-originating detections with production security activity for the same host, the agent determines whether the backup threat signal aligns with observable production impact. Outcome: By correlating backup threat detections with endpoint, network, and cloud security telemetry while validating timing alignment, event spikes, and data coverage, the agent connects a backup originating threat signal to production evidence, enabling the analyst to quickly assess impact likelihood and determine appropriate actions such as containment or recovery-point validation. Use Case 3: Network, Exposure & Connectivity Investigation Scenario: Is this activity indicative of legitimate remote access, or does it demonstrate suspicious connectivity and access attempts that increase risk to private applications and internal resources. Signals Correlated: User access telemetry: Source user, source IPs/geo, device/context, destinations Auth and enforcement outcomes: Success vs. failure, MFA allow/block Behavior drift: new/rare IPs/locations, unusual destination/app diversity. Suspicious activity indicators: Risky URLs/categories, known-bad indicators, automated/bot-like patterns, repeated denied private app access attempts Correlation Logic: An analyst receives an alert for a specific user and needs to determine whether the activity reflects expected behavior such as travel, remote work, or VPN usage, or whether it signals the early stages of a compromise that could later extend into private application access. Practical Example: Zscaler ZIA ZPA Correlation Agent starts with a username and builds a bounded, last-24-hour investigation across Zscaler Internet Access and Zscaler Private Access activity. By correlating user internet behavior, access context, and private application interactions, the agent connects the initial Zscaler alert to any downstream access attempts or authentication anomalies, enabling the analyst to quickly assess risk, identify suspicious patterns, and determine whether Zscaler policy adjustments are required. Outcome: Provides a last‑24‑hour verdict on whether the activity reflects expected access patterns or escalation toward private application access, and recommends next actions—such as closing as benign drift, escalating for containment, or tuning access policy—based on correlated evidence. Use Case 4: Endpoint & Runtime Intelligence Investigation Scenario: Is this process malicious or a legitimate admin action? Signals Correlated: Execution context: Process chain, full command line, signer, unusual path Account & logon: Initiating user, logon type (RDP/service), recent risky sign-ins Tooling & TTPs: LOLBins, credential access hints, lateral movement tooling Network behavior: Suspicious connections, repeated callbacks/beaconing Correlation Logic: A PowerShell alert triggers on a production server. The agent ties the process to its parent (e.g., spawned by a web worker vs. an admin shell), validates the command-line indicators, correlates outbound connections from the same PID to a first-seen destination, and checks for immediate follow-on persistence and any adjacent runtime alerts in the same time window. Outcome: Classifies the activity as malicious vs. admin and produces an evidence pack (process tree, key command indicators, destinations, persistence/tamper artifacts) as well as the recommended containment step (isolate host and revoke/reset initiating credentials). Use Case 5: Exposure & Exploitability Investigation Scenario: What is the likelihood of exploitation and blast radius? Signals Correlated: Asset exposure: Internet-facing status, exposed services/ports, and identity or network paths required to reach the workload Exploit activity: Defender alerts on the resource, IDS/WAF hits, IOC matches, and first seen exploit or probing attempts Risk amplification signals: Internet communication, high privilege access paths, and indicators that the workload processes PII or sensitive data Blast radius: Downstream reachability to crown jewel systems (e.g., databases, key vaults) and trust relationships that could enable escalation Correlation Logic: An analyst receives a Medium/High Microsoft Defender for Cloud alert on a workload and needs to determine whether it’s a standalone detection or an exploitable exposure that can quickly progress into privilege abuse and data impact. The agent correlates exposure evidence signals such as internet reachability, high-privilege paths, and indicators that workload handles sensitive data by analyzing suspicious network connections in the same bounded time window. Outcome: Produces a resource-specific risk analysis that explains why the Defender for Cloud alert is likely to be exploited, based on asset attack surface and effective privileges, plus any supporting activity in the same 24-hour window. Use Case 6: Threat Intelligence & Adversary Context Investigation Scenario: Is this activity aligned with known attacker behavior? Signals Correlated: Behavior sequence: ordered events identity → execution → network. Technique mapping: MITRE ATT&CK technique IDs, typical progression, and required prerequisites. Threat intel match: campaign/adversary, TTPs, IOCs Correlation Logic: A chain of identity compromise, PowerShell obfuscation, and periodic outbound HTTPS is observed. The agent maps the sequence to ATT&CK techniques and correlates it with threat intel that matches a known adversary campaign. Outcome: Surfaces adversary-aligned behavioral insights and TTP context to help analysts assess intrusion likelihood and guide the next investigation steps. Summary This blog is intended to help developers better understand the key use cases for building agents with Microsoft Sentinel platform along with practical patterns to apply when designing and implementing agent scenarios. Need help? If you have any issues as you work to develop your agent, the App Assure team is available to assist via our Sentinel Advisory Service. Reach out via our intake form. Resources Learn more: For a practical overview of how ISVs can move from Sentinel data lake onboarding to building agents, see the Accelerate Agent Development blog - https://aka.ms/AppAssure_AccelerateAgentDev. Get hands-on: Explore the end-to-end journey from Sentinel data lake onboarding to a working Security Copilot agent through the accompanying lab modules available on GitHub Repo: https://github.com/suchandanreddy/Microsoft-Sentinel-Labs.704Views1like0CommentsMicrosoft Sentinel data lake FAQ
Microsoft Sentinel data lake (generally available) is a purpose‑built, cloud‑native security data lake. It centralizes all security data in an open format, serving as the foundation for agentic defense, enhanced security insights, and graph-based enrichment. It offers cost‑effective ingestion, long‑term retention, and advanced analytics. In this blog we offer answers to many of the questions we’ve heard from our customers and partners. General questions What is the Microsoft Sentinel data lake? Microsoft has expanded its industry-leading SIEM solution, Microsoft Sentinel, to include a unified, security data lake, designed to help optimize costs, simplify data management, and accelerate the adoption of AI in security operations. This modern data lake serves as the foundation for the Microsoft Sentinel platform. It has a cloud-native architecture and is purpose-built for security—bringing together all security data for greater visibility, deeper security analysis, contextual awareness and agentic defense. It provides affordable, long-term retention, allowing organizations to maintain robust security while effectively managing budgetary requirements. What are the benefits of Sentinel data lake? Microsoft Sentinel data lake is purpose built for security offering flexible analytics, cost management, and deeper security insights. Sentinel data lake: Centralizes security data delta parquet and open format for easy access. This unified data foundation accelerates threat detection, investigation, and response across hybrid and multi-cloud environments. Enables data federation by allowing customers to access data in external sources like Microsoft Fabric, ADLS and Databricks from the data lake. Federated data appears alongside native Sentinel data, enabling correlated hunting, investigation, and custom graph analysis across a broader digital estate. Offers a disaggregated storage and compute pricing model, allowing customers to store massive volumes of security data at a fraction of the cost compared to traditional SIEM solutions. Allows multiple analytics engines like Kusto, Spark, and ML to run on a single data copy, simplifying management, reducing costs, and supporting deeper security analysis. Integrates with GitHub Copilot and VS Code empowering SOC teams to automate enrichment, anomaly detection, and forensic analysis. Supports AI agents via the MCP server, allowing tools like GitHub Copilot to query and automate security tasks. The MCP Server layer brings intelligence to the data, offering Semantic Search, Query Tools, and Custom Analysis capabilities that make it easier to extract insights and automate workflows. Provides streamlined onboarding, intuitive table management, and scalable multi-tenant support, making it ideal for MSSPs and large enterprises. The Sentinel data lake is designed for security workloads, ensuring that processes from ingestion to analytics meet evolving cybersecurity requirements. Is Microsoft Sentinel SIEM going away? No. Microsoft is expanding Sentinel into an AI powered end-to-end security platform that includes SIEM and new platform capabilities - Security data lake, graph-powered analytics and MCP Server. SIEM remains a core component and will be actively developed and supported. Getting started What are the prerequisites for Sentinel data lake? To get started: Connect your Sentinel workspace to Microsoft Defender prior to onboarding to Sentinel data lake. Once in the Defender experience see data lake onboarding documentation for next steps. Note: Sentinel is moving to the Microsoft Defender portal and the Sentinel Azure portal will be retired by March 31, 2027. I am a Sentinel-only customer, and not a Defender customer. Can I use the Sentinel data lake? Yes. You must connect Sentinel to the Defender experience before onboarding to the Sentinel data lake. Microsoft Sentinel is generally available in the Microsoft Defender portal, with or without Microsoft Defender XDR or an E5 license. If you have created a log analytics workspace, enabled it for Sentinel and have the right Microsoft Entra roles (e.g. Global Administrator + Subscription Owner, Security Administrator + Sentinel Contributor), you can enable Sentinel in the Defender portal. For more details on how to connect Sentinel to Defender review these sources: Microsoft Sentinel in the Microsoft Defender portal In what regions is Sentinel data lake available? For supported regions see: Geographical availability and data residency in Microsoft Sentinel | Azure Docs. Is there an expected release date for Microsoft Sentinel data lake in GCC, GCC-H, and DoD? While the exact date is not yet finalized, we plan to expand Sentinel data lake to the US Government environments. . How will URBAC and Entra RBAC work together to manage the data lake given there is no centralized model? Entra RBAC will provide broad access to the data lake (URBAC maps the right permissions to specific Entra role holders: GA/SA/SO/GR/SR). URBAC will become a centralized pane for configuring non-global delegated access to the data lake. For today, you will use this for the “default data lake” workspace. In the future, this will be enabled for non-default Sentinel workspaces as well – meaning all workspaces in the data lake can be managed here for data lake RBAC requirements. Azure RBAC on the Log Analytics (LA) workspace in the data lake is respected through URBAC as well today. If you already hold a built-in role like log analytics reader, you will be able to run interactive queries over the tables in that workspace. Or, if you hold log analytics contributor, you can read and manage table data. For more details see: Roles and permissions in the Microsoft Sentinel platform | Microsoft Learn Data ingestion and storage How do I ingest data into the Sentinel data lake? To ingest data into the Sentinel data lake, you can use existing Sentinel data connectors or custom connectors to bring data from Microsoft and third-party sources. Data can be ingested into the analytics tier or the data lake tier. Data ingested into the analytics tier is automatically mirrored to the lake (at no additional cost). Alternatively, data that is not needed in the analytics tier can be ingested directly into the data lake. Data retention is configured directly in table management, for both analytics retention and data lake storage. Note: Certain tables do not support data lake-only ingestion via either API or data connector UI. See here for more information: Custom log tables. What is Microsoft’s guidance on when to use analytics tier vs. the data lake tier? Sentinel data lake offers flexible, built-in data tiering (analytics and data lake tiers) to effectively meet diverse business use cases and achieve cost optimization goals. Analytics tier: Is ideal for high-performance, real-time, end-to-end detections, enrichments, investigation and interactive dashboards. Typically, high-fidelity data from EDRs, email gateways, identity, SaaS and cloud logs, threat intelligence (TI) should be ingested into the analytics tier. Data in the analytics tier is best monitored proactively with scheduled alerts and scheduled analytics to enable security detections Data in this tier is retained at no cost for up to 90 days by default, extendable to 2 years. A copy of the data in this tier is automatically available in the data lake tier at no extra cost, ensuring a unified copy of security data for both tiers. Data lake tier: Is designed for cost-effective, long-term storage. High-volume logs like NetFlow logs, TLS/SSL certificate logs, firewall logs and proxy logs are best suited for data lake tier. Customers can use these logs for historical analysis, compliance and auditing, incident response (IR), forensics over historical data, build tenant baselines, TI matching and then promote resulting insights into the analytics tier. Customers can run full Kusto queries, Spark Notebooks and scheduled jobs over a single copy of their data in the data lake. Customers can also search, enrich and promote data from the data lake tier to the analytics tier for full analytics. For more details see documentation. What does it mean that a copy of all new analytics tier data will be available in the data lake? When Sentinel data lake is enabled, a copy of all new data ingested into the analytics tier is automatically duplicated into the data lake tier. This means customers don’t need to manually configure or manage this process, every new log or telemetry added to the analytics tier becomes instantly available in the data lake. This allows security teams to run advanced analytics, historical investigations, and machine learning models on a single, unified copy of data in the lake, while still using the analytics tier for real-time SOC workflows. It’s a seamless way to support both operational and long-term use cases—without duplicating effort or cost. What is the guidance for customers using data federation capability in Sentinel data lake? Starting April 1, 2026, federate data from Microsoft Fabric, ADLS, and Azure Databricks into Sentinel data lake. Use data federation when data is exploratory, infrequently accessed, or must remain at source due to governance, compliance, sovereignty, or contractual requirements. Ingest data directly into Sentinel to unlock full SIEM capabilities, always-on detections, advanced automation, and AI‑driven defense at scale. This approach lets security teams start where their data already lives — preserving governance, then progressively ingest data into Sentinel for full security value. Is there any cost for retention in the analytics tier? Analytics ingestion includes 90 days of interactive retention, at no additional cost. Simply set analytics retention to 90 days or less. Analytics retention beyond 90 days will incur a retention cost. Data can be retained longer within the data lake by using the “total retention” setting. This allows you to extend retention within the data lake for up to 12 years. While data is retained within the analytics tier, there is no charge for the mirrored data within the lake. Retaining data in the lake beyond the analytics retention period incurs additional storage costs. See documentation for more details: Manage data tiers and retention in Microsoft Sentinel | Microsoft Learn What is the guidance for Microsoft Sentinel Basic and Auxiliary Logs customers? If you previously enabled Basic or Auxiliary Logs plan in Sentinel: You can view Basic Logs in the Defender portal but manage it from the Log Analytics workspace. To manage it in the Defender portal, you must change the plan from Basic to Analytics. Once the table is transitioned to the analytics tier, if desired, it can then be transitioned to the data lake. Existing Auxiliary Log tables will be available in the data lake tier for use once the Sentinel data lake is enabled. Billing for these tables will automatically switch to the Sentinel data lake meters. Microsoft Sentinel customers are recommended to start planning their data management strategy with the data lake. While Basic and Auxiliary Logs are still available, they are not being enhanced further. Sentinel data lake offers more capabilities at a lower price point. Please plan on onboarding your security data to the Sentinel data lake. Azure Monitor customers can continue to use Basic and Auxiliary Logs for observability scenarios. What happens to customers that already have Archive logs enabled? If a customer has already configured tables for Archive retention, existing retention settings will not change and will be automatically inherited by the Sentinel data lake. All data, including existing data in archive retention will be billed using the data lake storage meter, benefiting from 6x data compression. However, the data itself will not move. Existing data in archive will continue to be accessible through Sentinel search and restore experiences: o Data will not be backfilled into the data lake. o Data will be billed using the data lake storage meter. New data ingested after enabling the data lake: o Will be automatically mirrored to the data lake and accessible through data lake explorer. o Data will be billed using the data lake storage meter. Example: If a customer has 12 months of total retention enabled on a table, 2 months after enabling ingestion into the Sentinel data lake, the customer will still have access to 10 months of archived data (through Sentinel search and restore experiences), but access to only 2 months of data in the data lake (since the data lake was enabled). Key considerations for customers that currently have Archive logs enabled: The existing archive will remain, with new data ingested into the data lake going forward; previously stored archive data will not be backfilled into the lake. Archive logs will continue to be accessible via the Search and Restore tab under Sentinel. If analytics and data lake mode are enabled on table, which is the default setting for analytics tables when Sentinel data lake is enabled, all new data will be ingested into the Sentinel data lake. There will only be one storage meter (which is data lake storage) going forward. Archive will continue to be accessible via Search and Restore. If Sentinel data lake-only mode is enabled on table, new data will be ingested only into the data lake; any data that’s not already in the Sentinel data lake won’t be migrated/backfilled. Only data that was previously ingested under the archive plan will be accessible via Search and Restore. What is the guidance for customers using Azure Data Explorer (ADX) alongside Microsoft Sentinel? Some customers might have set up ADX cluster for their DIY lake setup. Customers can choose to continue using that setup and gradually migrate to Sentinel data lake for new data that they want to manage. The lake explorer will support federation with ADX to enable the customers to migrate gradually and simplify their deployment. What happens to the Defender XDR data after enabling Sentinel data lake? By default, Defender XDR tables are available for querying in advanced hunting, with 30 days of analytics tier retention included with the XDR license. To retain data beyond this period, an explicit change to the retention setting is required, either by extending the analytics tier retention or the total retention period. You can extend the retention period of supported Defender XDR tables beyond 30 days and ingest the data into the analytics tier. For more information see Manage XDR data in Microsoft Sentinel. You can also ingest XDR data directly into the data lake tier. See here for more information. A list of XDR advanced hunting tables supported by Sentinel are documented here: Connect Microsoft Defender XDR data to Microsoft Sentinel | Microsoft Learn. KQL queries and jobs Is KQL and Notebook supported over the Sentinel data lake? Yes, via the data lake KQL query experience along with a fully managed Notebook experience which enables spark-based big data analytics over a single copy of all your security data. Customers can run queries across any time range of data in their Sentinel data lake. In the future, this will be extended to enable SQL query over lake as well. Note: Triggering a KQL job directly via an API or Logic App is not yet supported but is on the roadmap. Why are there two different places to run KQL queries in Sentinel experience? Advanced hunting queries both XDR and analytics tables, with compute cost included. Data lake explorer only queries data in the lake and incurs a separate compute cost. Consolidating advanced hunting and KQL explorer user interfaces is on the roadmap. This will provide security analysts a unified query experience across both analytics and data lake tiers. Where is the output from KQL jobs stored? KQL jobs are written into existing or new custom tables in the analytics tier. Is it possible to run KQL queries on multiple data lake tables? Yes, you can run KQL interactive queries and jobs using operators like join or union. Can KQL queries (either interactive or via KQL jobs) join data across multiple workspaces? Security teams can run multi-workspace KQL queries for broader threat correlation Pricing and billing How does a customer pay for Sentinel data lake? Billing is automatically enabled at the time of onboarding based on Azure Subscription and Resource Group selections. Customers are then charged based on the volume of data ingested, retained, and analyzed (e.g. KQL Queries and Jobs). See Sentinel pricing page for more details. 2. What are the pricing components for Sentinel data lake? Sentinel data lake offers a flexible pricing model designed to optimize security coverage and costs. At a high level, pricing is based on the volume of data ingested/processed, the volume of data retained, and the volume of data processed. For specific meter definitions, see documentation. 3. How does the business model for Sentinel SIEM change with the introduction of the data lake? There is no change to existing Sentinel analytics tier ingestion business model. Sentinel data lake has separate meters for ingestion, storage and analytics. 4. What happens to the existing Sentinel SIEM and related Azure Monitor billing meters when a customer onboards to Sentinel data lake? When a customer onboards to the Sentinel data lake, nothing changes with analytic ingestion or retention. Customers using data archive and Auxiliary Logs will automatically transition to the new data lake meters. How does data lake storage affect cost efficiency for high volume data retention? Sentinel data lake offers cost-effective, long-term storage with uniform data compression of 6:1 across all data sources, applicable only to data lake storage. Example: For 600GB of data stored, you are only billed for 100GB compressed data. This approach allows organizations to retain greater volumes of security data over extended periods cost-effectively, thereby reducing security risks without compromising their overall security posture. here How “Data Processing” billed? To support the ingestion and standardization of diverse data sources, the Data Processing feature applies a $0.10 per GB (US East) charge for all data ingested into the data lake. This feature enables a broad array of transformations like redaction, splitting, filtering and normalization. The data processing charge is applied per GB of uncompressed data Note: For regional pricing, please refer to the “Data processing” meter within the Microsoft Sentinel Pricing official documentation. Does “Data processing” meter apply to analytics tier data mirrored in the data lake? No. Data processing charge will not be applied to mirrored data. Data mirrored from the analytic tier is not subject to either data ingestion or processing charges. How is retention billed for tables that use data lake-only ingestion & retention? Sentinel data lake decouples ingestion, storage, and analytics meters. Customers have the flexibility to pay based on how data is retained and used. For tables that use data lake‑only ingestion, there is no included free retention—unlike the analytics tier, which includes 90 days of analytics retention. Retention charges begin immediately once data is stored in the data lake. Data lake storage billing is based on compressed data size rather than raw ingested volume, which significantly reduces storage costs and delivers lower overall retention spend for customers. Does data federation incur charges? Data federation does not generate any ingestion or storage fees in Sentinel data lake. Customers are billed only when they run analytics or queries on federated data, with charges based on Sentinel data lake compute and analytics meters. This means customers pay solely for actual data usage, not mere connectivity. How do I understand Sentinel data lake costs? Sentinel data lake costs driven by three primary factors: how much data is ingested, how long that data is retained, and how the data is used. Customers can flexibly choose to ingest data into the analytics tier or data lake tier, and these architectural choices directly impact cost. For example, data can be ingested into the analytics tier—where commitment tiers help optimize costs for high data volumes—or ingested data directly into the Sentinel data lake for lower‑cost ingestion, storage, and on‑demand analysis. Customers are encouraged to work with their Microsoft account team to obtain an accurate cost estimate tailored to their environment. See Sentinel pricing page to understand Sentinel pricing. How do I manage Sentinel data lake costs? Built-in cost management experiences help customers with cost predictability, billing transparency, and operational efficiency. Reports provide customers with insights into usage trends over time, enabling them to identify cost drivers and optimize data retention and processing strategies. Set usage-based alerts on specific meters to monitor and control costs. For example, receive alerts when query or notebook usage passes set limits, helping avoid unexpected expenses and manage budgets. See our Sentinel cost management documentation to learn more. If I’m an Auxiliary Logs customer, how will onboarding to the Sentinel data lake affect my billing? Once a workspace is onboarded to Sentinel data lake, all Auxiliary Logs meters will be replaced by new data lake meters. Do we charge for data lake ingestion and storage for graph experiences? Microsoft Sentinel graph-based experiences are included as part of the existing Defender and Purview licenses. However, Sentinel graph requires Sentinel data lake and specific data sources to build the underlying graph. Enabling these data sources will incur ingestion and data lake storage costs. Note: For Sentinel SIEM customers, most required data sources are free for analytics ingestion. Non-entitled sources such as Microsoft Entra ID logs will incur ingestion and data lake storage costs. How is Entra asset data and ARG data billed? Data lake ingestion charges of $0.05 per GB (US EAST) will apply to Entra asset data and ARG data. Note: This was previously not billed during public preview and is billed since data lake GA. To learn more, see: https://learn.microsoft.com/azure/sentinel/datalake/enable-data-connectors When a customer activates Sentinel data lake, what happens to tables with archive logs enabled? To simplify billing, once the data lake is enabled, all archive data will be billed using the data lake storage meter. This provides consistent long-term retention billing and includes automatic 6x data compression. For most customers, this change results in lower long‑term retention costs. However, customers who previously had discounted archive retention pricing will not automatically receive the same discounts on the new data lake storage meters. In these cases, customers should engage their Microsoft account team to review pricing implications before enabling the Sentinel data lake. Thank you Thank you to our customers and partners for your continued trust and collaboration. Your feedback drives our innovation, and we’re excited to keep evolving Microsoft Sentinel to meet your security needs. If you have any questions, please don’t hesitate to reach out—we’re here to support you every step of the way. Learn more: Get started with Sentinel data lake today: https://aka.ms/Get_started/Sentinel_datalake Microsoft Sentinel AI-ready platform: https://aka.ms/Microsoft_Sentinel Sentinel data lake videos: https://aka.ms/Sentineldatalake_videos Latest innovations and updates on Sentinel: https://aka.ms/msftsentinelblog Sentinel pricing page: https://aka.ms/MicrosoftSentinel_Pricing5KViews1like8CommentsWhy there is no Signature status for the new process in the DeviceProcessEvent table?
According to the schema, there is only field for checking the initiating (parent) process digital signature, named InitiatingProcessSignatureStatus. So we have information if the process that initiated the execution is signed. However, in many security use-cases it is important to know if the spawned (child) process is digitally signed. Let's assume that Winword.exe (signed) executed unsigned binary - this is definitely different situation than Winword.exe executing some signed binary (although both may be suspicious, or legitimate). I feel that some valuable information is not provided, and I'd like to know the reason. Is it related to the logging performance? Or some memory structures, that are present only for the already existing process?123Views0likes2CommentsEndpoint and EDR Ecosystem Connectors in Microsoft Sentinel
Most SOCs operate in mixed endpoint environments. Even if Microsoft Defender for Endpoint is your primary EDR, you may still run Cisco Secure Endpoint, WithSecure Elements, Knox, or Lookout in specific regions, subsidiaries, mobile fleets, or regulatory enclaves. The goal is not to replace any tool, but to standardize how signals become detections and response actions. This article explains an engineering-first approach: ingestion correctness, schema normalization, entity mapping, incident merging, and cross-platform response orchestration. Think of these connectors as four different lenses on endpoint risk. Two provide classic EDR detections (Cisco, WithSecure). Two provide mobile security and posture signals (Knox, Lookout). The highest-fidelity outcomes come from correlating them with Microsoft signals (Defender for Endpoint device telemetry, Entra ID sign-ins, and threat intelligence). Cisco Secure Endpoint Typical signal types include malware detections, exploit prevention events, retrospective detections, device isolation actions, and file/trajectory context. Cisco telemetry is often hash-centric (SHA256, file reputation) which makes it excellent for IOC matching and cross-EDR correlation. WithSecure Elements WithSecure Elements tends to provide strong behavioral detections and ransomware heuristics, often including process ancestry and behavioral classification. It complements hash-based detections by providing behavior and incident context that can be joined to Defender process events. Samsung Knox Asset Intelligence Knox is posture-heavy. Typical signals include compliance state, encryption status, root/jailbreak indicators, patch level, device model identifiers and policy violations. This data is extremely useful for identity correlation: it helps answer whether a successful sign-in came from a device that should be trusted. Lookout Mobile Threat Defense Lookout focuses on mobile threats such as malicious apps, phishing, risky networks (MITM), device compromise indicators, and risk scores. Lookout signals are critical for identity attack chains because mobile phishing is often the precursor to token theft or credential reuse. 2. Ingestion architecture: from vendor API to Sentinel tables Most third‑party connectors are API-based. In production, treat ingestion as a pipeline with reliability requirements. The standard pattern is vendor API → connector runtime (codeless connector or Azure Function) → DCE → DCR transform → Log Analytics table. Key engineering controls: Secrets and tokens should be stored in Azure Key Vault where supported; rotate and monitor auth failures. Use overlap windows (poll slightly more than the schedule interval) and deduplicate by stable event IDs. Use DCR transforms to normalize fields early (device/user/IP/severity) and to filter obviously low-value noise. Monitor connector health and ingestion lag; do not rely on ‘Connected’ status alone. Ingestion health checks (KQL) // Freshness & lag per connector table (adapt table names to your workspace) let lookback = 24h union isfuzzy=true (<CiscoTable> | extend Source="Cisco"), (<WithSecureTable> | extend Source="WithSecure"), (<KnoxTable> | extend Source="Knox"), (<LookoutTable> | extend Source="Lookout") | where TimeGenerated > ago(lookback) | summarize LastEvent=max(TimeGenerated), Events=count() by Source | extend IngestDelayMin = datetime_diff("minute", now(), LastEvent) | order by IngestDelayMin desc // Schema discovery (run after onboarding and after connector updates) Cisco | take 1 | getschema WithSecureTable | take 1 | getschema Knox | take 1 | getschema Lookout | take 1 | getschema 3. Normalization: make detections vendor-agnostic The most common failure mode in multi-EDR SOCs is writing separate rules per vendor. Instead, build one normalization function that outputs a stable schema. Then write rules once. Recommended canonical fields: Vendor, AlertId, EventTime, SeverityNormalized DeviceName (canonical), AccountUpn (canonical), SourceIP FileHash (when applicable), ThreatName/Category CorrelationKey (stable join key such as DeviceName + FileHash or DeviceName + AlertId) // Example NormalizeEndpoint() pattern. Replace column_ifexists(...) mappings after getschema(). let NormalizeEndpoint = () { union isfuzzy=true ( Cisco | extend Vendor="Cisco" | extend DeviceName=tostring(column_ifexists("hostname","")), AccountUpn=tostring(column_ifexists("user","")), SourceIP=tostring(column_ifexists("ip","")), FileHash=tostring(column_ifexists("sha256","")), ThreatName=tostring(column_ifexists("threat_name","")), SeverityNormalized=tolower(tostring(column_ifexists("severity",""))) ), ( WithSecure | extend Vendor="WithSecure" | extend DeviceName=tostring(column_ifexists("hostname","")), AccountUpn=tostring(column_ifexists("user","")), SourceIP=tostring(column_ifexists("ip","")), FileHash=tostring(column_ifexists("file_hash","")), ThreatName=tostring(column_ifexists("classification","")), SeverityNormalized=tolower(tostring(column_ifexists("risk_level",""))) ), ( Knox | extend Vendor="Knox" | extend DeviceName=tostring(column_ifexists("device_id","")), AccountUpn=tostring(column_ifexists("user","")), SourceIP="", FileHash="", ThreatName=strcat("Device posture: ", tostring(column_ifexists("compliance_state",""))), SeverityNormalized=tolower(tostring(column_ifexists("risk",""))) ), ( Lookout | extend Vendor="Lookout" | extend DeviceName=tostring(column_ifexists("device_id","")), AccountUpn=tostring(column_ifexists("user","")), SourceIP=tostring(column_ifexists("source_ip","")), FileHash="", ThreatName=tostring(column_ifexists("threat_type","")), SeverityNormalized=tolower(tostring(column_ifexists("risk_level",""))) ) | extend CorrelationKey = iff(isnotempty(FileHash), strcat(DeviceName, "|", FileHash), strcat(DeviceName, "|", ThreatName)) | project TimeGenerated, Vendor, DeviceName, AccountUpn, SourceIP, FileHash, ThreatName, SeverityNormalized, CorrelationKey, * } 4. Entity mapping and incident merging Sentinel’s incident experience improves dramatically when alerts include entity mapping. Map Host, Account, IP, and File (hash) where possible. Incident grouping should merge alerts by DeviceName and AccountUpn within a reasonable window (e.g., 6–24 hours) to avoid alert storms. 5. Correlation patterns that raise confidence High-confidence detections come from confirmation across independent sensors. These patterns reduce false positives while catching real compromise chains. 5.1 Multi-vendor confirmation (two EDRs agree) NormalizeEndpoint() | where TimeGenerated > ago(24h) | summarize Vendors=dcount(Vendor), VendorSet=make_set(Vendor, 10) by DeviceName | where Vendors >= 2 5.2 Third-party detection confirmed by Defender process telemetry let tp = NormalizeEndpoint() | where TimeGenerated > ago(6h) | where ThreatName has_any ("powershell","ransom","credential","exploit") | project TPTime=TimeGenerated, DeviceName, AccountUpn, Vendor, ThreatName tp | join kind=inner ( DeviceProcessEvents | where Timestamp > ago(6h) | where ProcessCommandLine has_any ("EncodedCommand","IEX","FromBase64String","rundll32","regsvr32") | project MDETime=Timestamp, DeviceName=tostring(DeviceName), Proc=ProcessCommandLine ) on DeviceName | where MDETime between (TPTime .. TPTime + 30m) | project TPTime, MDETime, DeviceName, Vendor, ThreatName, Proc 5.3 Mobile phishing signal followed by successful sign-in let mobile = NormalizeEndpoint() | where TimeGenerated > ago(24h) | where Vendor == "Lookout" and ThreatName has "phish" | project MTDTime=TimeGenerated, AccountUpn, DeviceName, SourceIP mobile | join kind=inner ( SigninLogs | where TimeGenerated > ago(24h) | where ResultType == 0 | project SigninTime=TimeGenerated, AccountUpn=tostring(UserPrincipalName), IPAddress, AppDisplayName ) on AccountUpn | where SigninTime between (MTDTime .. MTDTime + 30m) | project MTDTime, SigninTime, AccountUpn, DeviceName, SourceIP, IPAddress, AppDisplayName 5.4 Knox posture and high-risk sign-in let noncompliant = NormalizeEndpoint() | where TimeGenerated > ago(7d) | where Vendor=="Knox" and ThreatName has "NonCompliant" | project DeviceName, AccountUpn, KnoxTime=TimeGenerated noncompliant | join kind=inner ( SigninLogs | where TimeGenerated > ago(7d) | where RiskLevelDuringSignIn in ("high","medium") | project SigninTime=TimeGenerated, AccountUpn=tostring(UserPrincipalName), RiskLevelDuringSignIn, IPAddress ) on AccountUpn | where SigninTime between (KnoxTime .. KnoxTime + 2h) | project KnoxTime, SigninTime, AccountUpn, DeviceName, RiskLevelDuringSignIn, IPAddress 6. Response orchestration (SOAR) design Response should be consistent across vendors. Use a scoring model to decide whether to isolate a device, revoke tokens, or enforce Conditional Access. Prefer reversible actions, and log every automation step for audit. 6.1 Risk scoring to gate playbooks let SevScore = (s:string) { case(s=="critical",5,s=="high",4,s=="medium",2,1) } NormalizeEndpoint() | where TimeGenerated > ago(24h) | extend Score = SevScore(SeverityNormalized) | summarize RiskScore=sum(Score), Alerts=count(), Vendors=make_set(Vendor, 10) by DeviceName, AccountUpn | where RiskScore >= 8 | order by RiskScore desc High-severity playbooks typically execute: (1) isolate device via Defender (if onboarded), (2) revoke tokens in Entra ID, (3) trigger Conditional Access block, (4) notify and open ITSM ticket. Medium-severity playbooks usually tag the incident, add watchlist entries, and notify analysts.322Views8likes1CommentThreat Intelligence & Identity Ecosystem Connectors
Microsoft Sentinel’s capability can be greatly enhanced by integrating third-party threat intelligence (TI) feeds (e.g. GreyNoise, Team Cymru) with identity and access logs (e.g. OneLogin, PingOne). This article provides a detailed dive into each connector, data types, and best practices for enrichment and false-positive reduction. We cover how GreyNoise (including PureSignal/Scout), Team Cymru, OneLogin IAM, PingOne, and Keeper integrate with Sentinel – including available connectors, ingested schemas, and configuration. We then outline technical patterns for building TI-based lookup pipelines, scoring, and suppression rules to filter benign noise (e.g. GreyNoise’s known scanners), and enrich alerts with context from identity logs. We map attack chains (credential stuffing, lateral movement, account takeover) to Sentinel data, and propose KQL analytics rules and playbooks with MITRE ATT&CK mappings (e.g. T1110: Brute Force, T1595: Active Scanning). The report also includes guidance on deployment (ARM/Bicep examples), performance considerations for high-volume TI ingestion, and comparison tables of connector features. A mermaid flowchart illustrates the data flow from TI and identity sources into Sentinel analytics. All recommendations are drawn from official documentation and industry sources. Threat Intel & Identity Connectors Overview GreyNoise (TI Feed): GreyNoise provides “internet background noise” intelligence on IPs seen scanning or probing the Internet. The Sentinel GreyNoise Threat Intelligence connector (Azure Marketplace) pulls data via GreyNoise’s API into Sentinel’s ThreatIntelligenceIndicator table. It uses a daily Azure Function to fetch indicators (IP addresses and metadata like classification, noise, last_seen) and injects them as STIX-format indicators (Network IPs with provider “GreyNoise”). This feed can then be queried in KQL. Authentication requires a GreyNoise API key and a Sentinel workspace app with Contributor rights. GreyNoise’s goal is to help “filter out known opportunistic traffic” so analysts can focus on real threats. Official docs describe deploying the content pack and workbook template. Ingested data: IP-based indicators (malicious vs. benign scans), classifications (noise, riot, etc.), organization names, last-seen dates. All fields from GreyNoise’s IP lookup (e.g. classification, last_seen) appear in ThreatIntelligenceIndicator.NetworkDestinationIP, IndicatorProvider="GreyNoise", and related fields. Query: ThreatIntelligenceIndicator | where IndicatorProvider == "GreyNoise" | summarize arg_max(TimeGenerated, *) by NetworkDestinationIP This yields the latest GreyNoise record per IP. Team Cymru Scout (TI Context): Team Cymru’s PureSignal™ Scout is a TI enrichment platform. The Team Cymru Scout connector (via Azure Marketplace) ingests contextual data (not raw logs) about IPs, domains, and account usage into Sentinel custom tables. It runs via an Azure Function that, given IP or domain inputs, populates tables like Cymru_Scout_IP_Data_* and Cymru_Scout_Domain_Data_CL. For example, an IP query yields multiple tables: Cymru_Scout_IP_Data_Foundation_CL, ..._OpenPorts_CL, ..._PDNS_CL, etc., containing open ports, passive DNS history, X.509 cert info, fingerprint data, etc. This feed requires a Team Cymru account (username/password) to access the Scout API. Data types: Structured TI metadata by IP/domain. No native ThreatIndicator insertion; instead, analysts query these tables to enrich events (e.g. join on SourceIP). The Sentintel TechCommunity notes that Scout “enriches alerts with real-time context on IPs, domains, and adversary infrastructure” and can help “reduce false positives”. OneLogin IAM (Identity Logs): The OneLogin IAM solution (Microsoft Sentinel content pack) ingests OneLogin platform events and user info via OneLogin’s REST API. Using the Codeless Connector Framework, it pulls from OneLogin’s Events API and Users API, storing data in custom tables OneLoginEventsV2_CL and OneLoginUsersV2_CL. Typical events include user sign-ins, MFA actions, app accesses, admin changes, etc. Prerequisites: create an OpenID Connect app in OneLogin (for client ID/secret) and register it in Azure (Global Admin). The connector queries hourly (or on schedule), within OneLogin’s rate limit of 5000 calls/hour. Data mapping: OneLoginEventsV2_CL (CL suffix indicates custom log) holds event records (time, user, IP, event type, result, etc.); OneLoginUsersV2_CL contains user account attributes. These can be joined or used in analytics. For example, a query might look for failed login events: OneLoginEventsV2_CL | where Event_type_s == "UserSessionStart" and Result_s == "Failed" (Actual field names depend on schema.) PingOne (Identity Logs): The PingOne Audit connector ingests audit activity from the PingOne Identity platform via its REST API. It creates the table PingOne_AuditActivitiesV2_CL. This includes administrator actions, user logins, console events, etc. You configure a PingOne API client (Client ID/Secret) and set up the Codeless Connector Framework. Logs are retrieved (with attention to PingOne’s license-based rate limits) and appended to the custom table. Analysts can query, for instance, PingOne_AuditActivitiesV2_CL for events like MFA failures or profile changes. Keeper (Password Vault Logs – optional): Keeper, a password management platform, can forward security events to Sentinel via Azure Monitor. As of latest docs, logs are sent to a custom log table (commonly KeeperLogs_CL) using Azure Data Collection Rules. In Keeper’s guide, you register an Azure AD app (“KeeperLogging”) and configure Azure Monitor data collection; then in the Keeper Admin Console you specify the DCR endpoint. Keeper events (e.g. user logins, vault actions, admin changes) are ingested into the table named (e.g.) Custom-KeeperLogs_CL. Authentication uses the app’s client ID/secret and a monitor endpoint URL. This is a bulk ingest of records, rather than a scheduled pull. Data ingested: custom Keeper events with fields like user, action, timestamp. Keeper’s integration is essentially via Azure Monitor (in the older Azure Sentinel approach). Connector Configuration & Data Ingestion Authentication and Rate Limits: Most connectors require API keys or OAuth credentials. GreyNoise and Team Cymru use single keys/credentials, with the Azure Function secured by a Managed Identity. OneLogin and PingOne use client ID/secret and must respect their API limits (OneLogin ~5k calls/hour; PingOne depends on licensing). GreyNoise’s enterprise API allows bulk lookups; the community API is limited (10/day for free), so production integration requires an Enterprise plan. Sentinel Tables: Data is inserted either into built-in tables or custom tables. GreyNoise feeds the ThreatIntelligenceIndicator table, populating fields like NetworkDestinationIP and ThreatSeverity (higher if classified “malicious”). Team Cymru’s Scout connector creates many Cymru_Scout_*_CL tables. OneLogin’s solution populates OneLoginEventsV2_CL and OneLoginUsersV2_CL. PingOne yields PingOne_AuditActivitiesV2_CL. Keeper logs appear in a custom table (e.g. KeeperLogs_CL) as shown in Keeper’s guide. Note: Sentinel’s built-in identity tables (IdentityInfo, SigninLogs) are typically for Microsoft identities; third-party logs can be mapped to them via parsers or custom analytic rules but by default arrive in these custom tables. Data Types & Schema: Threat Indicators: In ThreatIntelligenceIndicator, GreyNoise IPs appear as NetworkDestinationIP with associated fields (e.g. ThreatSeverity, IndicatorProvider="GreyNoise", ConfidenceScore, etc.). (Future STIX tables may be used after 2025.) Custom CL Logs: OneLogin events may include fields such as user_id_s, user_login_s, client_ip_s, event_time, etc. (The published parser issues indicate fields like app_name_s, role_id_d, etc.) PingOne logs include eventType, user, clientIP, result. Keeper logs contain Action, UserName, etc. These raw fields can be normalized in analytic rules or parsed by data transformations. Identity Info: Although not directly ingested, identity attributes from OneLogin/PingOne (e.g. user roles, group IDs) could be periodically fetched and synced to Sentinel (via custom logic) to populate IdentityInfo records, aiding user-centric hunts. Configuration Steps : GreyNoise: In Sentinel Content Hub, install the GreyNoise ThreatIntel solution. Enter your GreyNoise API key when prompted. The solution deploys an Azure Function (requires write access to Functions) and sets up an ingestion schedule. Verify the ThreatIntelligenceIndicator table is receiving GreyNoise entries Team Cymru: From Marketplace install “Team Cymru Scout”. Provide Scout credentials. The solution creates an Azure Function app. It defines a workflow to ingest or lookup IPs/domains. (Often, analysts trigger lookups rather than scheduled ingestion, since Scout is lookup-based.) Ensure roles: the Function’s managed identity needs Sentinel contributor rights. OneLogin: Use the Data Connectors UI. Authenticate OneLogin by creating a new Sentinel Web API authentication (with OneLogin’s client ID/secret). Enable both “OneLogin Events” and “OneLogin Users”. No agent is needed. After setup, data flows into OneLoginEventsV2_CL. PingOne: Similarly, configure the PingOne connector. Use the PingOne administrative console to register an OAuth client. In Sentinel’s connector blade, enter the client ID/secret and specify desired log types (Audit, maybe IDP logs). Confirm PingOne_AuditActivitiesV2_CL populates hourly. Keeper: Register an Azure AD app (“KeeperLogging”) and assign it Monitoring roles (Publisher/Contributor) to your workspace and data collection endpoint. Create an Azure Data Collection Rule (DCR) and table (e.g. KeeperLogs_CL). In Keeper’s Admin Console (Reporting & Alerts → Azure Monitor), enter the tenant ID, client ID/secret, and the DCR endpoint URL (format: https://<DCE>/dataCollectionRules/<DCR_ID>/streams/<table>?api-version=2023-01-01). Keeper will then push logs. KQL Lookup: To enrich a Sentinel event with these feeds, you might write: OneLoginEventsV2_CL | where EventType == "UserLogin" and Result == "Success" | extend UserIP = ClientIP_s | join kind=inner ( ThreatIntelligenceIndicator | where IndicatorProvider == "GreyNoise" and ThreatSeverity >= 3 | project NetworkDestinationIP, Category ) on $left.UserIP == $right.NetworkDestinationIP This joins OneLogin sign-ins with GreyNoise’s list of malicious scanners. Enrichment & False-Positive Reduction IOC Enrichment Pipelines: A robust TI pipeline in Sentinel often uses Lookup Tables and Functions. For example, ingested TI (from GreyNoise or Team Cymru) can be stored in reference data or scheduled lookup tables to enrich incoming logs. Patterns include: - Normalization: Normalize diverse feeds into common STIX schema fields (e.g. all IPs to NetworkDestinationIP, all domains to DomainName) so rules can treat them uniformly. - Confidence Scoring: Assign a confidence score to each indicator (from vendor or based on recency/frequency). For GreyNoise, for instance, you might use classification (e.g. “malicious” vs. “benign”) and history to score IP reputation. In Sentinel’s ThreatIntelligenceIndicator.ConfidenceScore field you can set values (higher for high-confidence IOCs, lower for noisy ones). - TTL & Freshness: Some indicators (e.g. active C2 domains) expire, so setting a Time-To-Live is critical. Sentinel ingestion rules or parsers should use ExpirationDateTime or ValidUntil on indicators to avoid stale IOCs. For example, extend ValidUntil only if confidence is high. - Conflict Resolution: When the same IOC comes from multiple sources (e.g. an IP in both GreyNoise and TeamCymru), you can either merge metadata or choose the highest confidence. One approach: use the highest threat severity from any source. Sentinel’s ThreatType tags (e.g. malicious-traffic) can accommodate multiple providers. False-Positive Reduction Techniques: - GreyNoise Noise Scoring: GreyNoise’s primary utility is filtering. If an IP is labeled noise=true (i.e. just scanning, not actively malicious), rules can deprioritize alerts involving that IP. E.g. suppress an alert if its source IP appears in GreyNoise as benign scanner. - Team Cymru Reputation: Use Scout data to gauge risk; e.g. if an IP’s open port fingerprint or domain history shows no malicious tags, it may be low-risk. Conversely, known hostile IP (e.g. seen in ransomware networks) should raise alert level. Scout’s thousands of context tags help refine a binary IOC. - Contextual Identity Signals: Leverage OneLogin/PingOne context to filter alerts. For instance, if a sign-in event is associated with a high-risk location (e.g. new country) and the IP is a GreyNoise scan, flag it. If an IP is marked benign, drop or suppress. Correlate login failures: if a single IP causes many failures across multiple users, it might be credential stuffing (T1110) – but if that IP is known benign scanner, consider it low priority. - Thresholding & Suppression: Build analytic suppression rules. Example: only alert on >5 failed logins in 5 min from IP and that IP is not noise. Or ignore DNS queries to domains that TI flags as benign/whitelisted. Apply tag-based rules: some connectors allow tagging known internal assets or trusted scan ranges to avoid alerts. Use GreyNoise to suppress alerts: SecurityEvent | where EventID == 4625 and Account != "SYSTEM" | join kind=leftanti ( ThreatIntelligenceIndicator | where IndicatorProvider == "GreyNoise" and Classification == "benign" | project NetworkSourceIP ) on $left.IPAddress == $right.NetworkSourceIP This rule filters out Windows 4625 login failures originating from GreyNoise-known benign scanners. Identity Attack Chains & Detection Rules Modern account attacks often involve sequential activities. By combining identity logs with TI, we can detect advanced patterns. Below are common chains and rule ideas: Credential Stuffing (MITRE T1110): Often seen as many login failures followed by a success. Detection: Look for multiple failed OneLogin/PingOne sign-ins for the same or different accounts from a single IP, then a success. Enrich with GreyNoise: if the source IP is in GreyNoise (indicating scanning), raise severity. Rule: let SuspiciousIP = OneLoginEventsV2_CL | where EventType == "UserSessionStart" and Result == "Failed" | summarize CountFailed=count() by ClientIP_s | where CountFailed > 5; OneLoginEventsV2_CL | where EventType == "UserSessionStart" and Result == "Success" and ClientIP_s in (SuspiciousIP | project ClientIP_s) | join kind=inner ( ThreatIntelligenceIndicator | where ThreatType == "ip" | extend GreyNoiseClass = tostring(Classification) | project IP=NetworkSourceIP, GreyNoiseClass ) on $left.ClientIP_s == $right.IP | where GreyNoiseClass == "malicious" | project TimeGenerated, Account_s, ClientIP_s, GreyNoiseClass Tactics: Initial Access (T1110) – Severity: High. Account Takeover / Impossible Travel (T1198): Sign-ins from unusual geographies or devices. Detection: Compare user’s current sign-in location against historical baseline. Use OneLogin/PingOne logs: if two logins by same user occur in different countries with insufficient time to travel, trigger. Enrich: if the login IP is also known infrastructure (Team Cymru PDNS, etc.), raise alert. Rule: PingOne_AuditActivitiesV2_CL | where EventType_s == "UserLogin" | extend loc = tostring(City_s) + ", " + tostring(Country_s) | sort by TimeGenerated desc | partition by User_s ( where TimeGenerated < ago(24h) // check last day | summarize count(), min(TimeGenerated), max(TimeGenerated) ) | where max_TimeGenerated - min_TimeGenerated < 1h and count_>1 and (range(loc) contains ",") | project User_s, TimeGenerated, loc (This pseudo-query checks multiple locations in <1 hour.) Tactics: Reconnaissance / Initial Access – Severity: Medium. Lateral Movement (T1021): Use of an account on multiple systems/apps. Detection: Two or more distinct application/service authentications by same user within a short time. Use OneLogin app-id fields or audit logs for access. If these are followed by suspicious network activity (e.g. contacting C2 via GreyNoise), escalate. Tactics: Lateral Movement – Severity: High. Privilege Escalation (T1098): If an admin account is changed or MFA factors reset in OneLogin/PingOne, especially after anomalous login. Detection: Monitor OneLogin admin events (“User updated”, “MFA enrolled/removed”). Cross-check the actor’s IP against threat feeds. Tactics: Credential Access – Severity: High. Analytics Rules (KQL) Below are six illustrative Sentinel analytics rules combining TI and identity logs. Each rule shows logic, tactics, severity, and MITRE IDs. (Adjust field names per your schemas and normalize CL tables as needed.) Multiple Failed Logins from Malicious Scanner (T1110) – High severity. Detect credential stuffing by identifying >5 failed login attempts from the same IP, where that IP is classified as malicious by GreyNoise. let BadIP = OneLoginEventsV2_CL | where EventType == "UserSessionStart" and Result == "Failed" | summarize attempts=count() by SourceIP_s | where attempts >= 5; OneLoginEventsV2_CL | where EventType == "UserSessionStart" and Result == "Success" and SourceIP_s in (BadIP | project SourceIP_s) | join ( ThreatIntelligenceIndicator | where IndicatorProvider == "GreyNoise" and ThreatSeverity >= 4 | project MaliciousIP=NetworkDestinationIP ) on $left.SourceIP_s == $right.MaliciousIP | extend AttackFlow="CredentialStuffing", MITRE="T1110" | project TimeGenerated, UserName_s, SourceIP_s, MaliciousIP Logic: Correlate failed-then-success login from same IP plus GreyNoise-malign classification. Impossible Travel / Anomalous Geo (T1198) – Medium severity. A user signs in from two distant locations within an hour. // Get last two logins per user let lastLogins = PingOne_AuditActivitiesV2_CL | where EventType_s == "UserLogin" and Outcome_s == "Success" | sort by TimeGenerated desc | summarize first_place=arg_max(TimeGenerated, City_s, Country_s, SourceIP_s, TimeGenerated) by User_s; let prevLogins = PingOne_AuditActivitiesV2_CL | where EventType_s == "UserLogin" and Outcome_s == "Success" | sort by TimeGenerated desc | summarize last_place=arg_min(TimeGenerated, City_s, Country_s, SourceIP_s, TimeGenerated) by User_s; lastLogins | join kind=inner prevLogins on User_s | extend dist=geo_distance_2points(first_place_City_s, first_place_Country_s, last_place_City_s, last_place_Country_s) | where dist > 1000 and (first_place_TimeGenerated - last_place_TimeGenerated) < 1h | project Time=first_place_TimeGenerated, User=User_s, From=last_place_Country_s, To=first_place_Country_s, MITRE="T1198" Logic: Compute geographic distance between last two logins; flag if too far too fast. Suspicious Admin Change (T1098) – High severity. Detect a change to admin settings (like role assign or MFA reset) via PingOne, from a high-risk IP (Team Cymru or GreyNoise) or after failed logins. PingOne_AuditActivitiesV2_CL | where EventType_s in ("UserMFAReset", "UserRoleChange") // example admin events | extend ActorIP = tostring(InitiatingIP_s) | join ( ThreatIntelligenceIndicator | where ThreatSeverity >= 3 | project BadIP=NetworkDestinationIP ) on $left.ActorIP == $right.BadIP | extend MITRE="T1098" | project TimeGenerated, ActorUser_s, Action=EventType_s, ActorIP Logic: Raise if an admin action originates from known bad IP. Malicious Domain Access (T1498): Medium severity. Internal logs (e.g. DNS or Web proxy) show access to a domain listed by Team Cymru Scout as C2 or reconnaissance. DeviceDnsEvents | where QueryType == "A" | join kind=inner ( Cymru_Scout_Domain_Data_CL | where ThreatTag_s == "Command-and-Control" | project DomainName_s ) on $left.QueryText == $right.DomainName_s | extend MITRE="T1498" | project TimeGenerated, DeviceName, QueryText Logic: Correlate internal DNS queries with Scout’s flagged C2 domains. (Requires that domain data is ingested or synced.) Brute-Force Firewall Blocked IP (T1110): Low to Medium severity. Firewall logs show an IP blocked for many attempts, and that IP is not noise per GreyNoise (i.e., malicious scanner). AzureDiagnostics | where Category == "NetworkSecurityGroupFlowEvent" and msg_s contains "DIRECTION=Inbound" and Action_s == "Deny" | summarize attemptCount=count() by IP = SourceIp_s, FlowTime=bin(TimeGenerated, 1h) | where attemptCount > 50 | join kind=leftanti ( ThreatIntelligenceIndicator | where IndicatorProvider == "GreyNoise" and Classification == "benign" | project NoiseIP=NetworkDestinationIP ) on $left.IP == $right.NoiseIP | extend MITRE="T1110" | project IP, attemptCount, FlowTime Logic: Many inbound denies (possible brute force) from an IP not whitelisted by GreyNoise. New Device Enrolled (T1078): Low severity. A user enrolls a new device or location for MFA after unusual login. OneLoginEventsV2_CL | where EventType == "NewDeviceEnrollment" | join kind=inner ( OneLoginEventsV2_CL | where EventType == "UserSessionStart" and Result == "Success" | top 1 by TimeGenerated asc // assume prior login | project User_s, loginTime=TimeGenerated, loginIP=ClientIP_s ) on User_s | where loginIP != DeviceIP_s | extend MITRE="T1078" | project TimeGenerated, User_s, DeviceIP_s, loginIP Logic: Flag if new device added (strong evidence of account compromise). Note: The above rules are illustrative. Tune threshold values (e.g. attempt counts) to your environment. Map the event fields (EventType, Result, etc.) to your actual schema. Use Severity mapping in rule configs as indicated and tag with MITRE IDs for context. TI-Driven Playbooks and Automation Automated response can amplify TI. Patterns include: - IOC Blocking: On alert (e.g. suspicious IP login), an automation runbook can call Azure Firewall, Azure Defender, or external firewall APIs to block the offending IP. For instance, a Logic App could trigger on the analytic alert, use the TI feed IP, and call AzFWNetworkRule PowerShell to add a deny rule. - Enrichment Workflow: After an alert triggers, an Azure Logic App playbook can enrich the incident by querying TI APIs. E.g., given an IP from the alert, call GreyNoise API or Team Cymru Scout API in real-time (via HTTP action), add the classification into incident details, and tag the incident accordingly (e.g. GreyNoiseStatus: malicious). This adds context for the analyst. - Alert Suppression: Implement playbook-driven suppression. For example, an alert triggered by an external IP can invoke a playbook that checks GreyNoise; if the IP is benign, the playbook can auto-close the alert or mark as false-positive, reducing analyst load. - Automated TI Feed Updates: Periodically fetch open-source or commercial TI and use a playbook to push new indicators into Sentinel’s TI store via the Graph API. - Incident Enrichment: On incident creation, a playbook could query OneLogin/PingOne for additional user details (like department or location via their APIs) and add as note in the incident. Performance, Scalability & Costs TI feeds and identity logs can be high-volume. Key considerations: - Data Ingestion Costs: Every log and TI indicator ingested into Sentinel is billable by the GB. Bulk TI indicator ingestion (like GreyNoise pulling thousands of IPs/day) can add storage costs. Use Sentinel’s Data Collection Rules (DCR) to apply ingestion-time filters (e.g. only store indicators above a confidence threshold) to reduce volume. GreyNoise feed is typically modest (since it’s daily, maybe thousands of IPs). Identity logs (OneLogin/PingOne) depend on org size – could be megabytes per day. Use sentinel ingestion sl analytic filters to drop low-value logs. - Query Performance: Custom log tables (OneLogin, PingOne, KeeperLogs_CL) can grow large. Periodically archive old data (e.g. export >90 days to storage, then purge). Use materialized views or scheduled summary tables for heavy queries (e.g. pre-aggregate hourly login counts). For threat indicator tables, leverage built-in indices on IndicatorId and NetworkIP for fast joins. Use project-away _* to remove metadata from large join queries. - Retention & Storage: Configure retention per table. If historical TI is less needed, set shorter retention. Use Azure Monitor’s tiering/Archive for seldom-used data. For large TI volumes (e.g. feeding multiple TIPs), consider using Sentinel Data Lake (or connecting Log Analytics to ADLS Gen2) to offload raw ingest cheaply. - Scale-Out Architecture: For very large environments, use multiple Sentinel workspaces (e.g. regional) and aggregate logs via Azure Lighthouse or Sentinel Fusion. TI feeds can be shared: one workspace collects TI, then distribute to others via Azure Sentinel’s TI management (feeds can be published and shared cross-workspaces). - Connector Limits: API rate limits dictate update frequency. Schedule connectors accordingly (e.g. daily for TI, hourly for identity events). Avoid hourly pulls of already static data (users list can be daily). For OneLogin/PingOne, use incremental tokens or webhooks if possible to reduce load. - Monitoring Health: Use Sentinel’s Log Analytics and Monitor metrics to track ingestion volume and connector errors. For example, monitor the Functions running GreyNoise/Scout for failures or throttling. Deployment Checklist & Guide Prepare Sentinel Workspace: Ensure a Log Analytics workspace with Sentinel enabled. Record the workspace ID and region. Register Applications: In Azure AD, create and note any Service Principal needed for functions or connectors (e.g. a Sentinel-managed identity for Functions). In each vendor portal, register API apps and credentials (OneLogin OIDC App, PingOne API client, Keeper AD app). Network & Security: If needed, configure firewall rules to allow outbound to vendor APIs. Install Connectors: In Sentinel Content Hub or Marketplace, install the solutions for GreyNoise TI, Team Cymru Scout, OneLogin IAM, PingOne. Follow each wizard to input credentials. Verify the “Data Types” (Logs, Alerts, etc.) are enabled. Create Tables & Parsers (if manual): For Keeper or unsupported logs, manually create custom tables (via DCR in Azure portal). Import JSON to define fields as shown in Keeper’s docs Test Data Flow: After each setup, wait 1–24 hours and run a simple query on the destination table (e.g. OneLoginEventsV2_CL | take 5) to confirm ingestion. Deploy Ingestion Rules: Use Sentinel Threat intelligence ingestion rules to fine-tune feeds (e.g. mark high-confidence feeds to extend expiration). Optionally tag/whitelist known good. Configure Analytics: Enable or create rules using the KQL above. Place them in the correct threat hunting or incident rule categories (Credential Access, Lateral Movement, etc.). Assign appropriate alert severity. Set up Playbooks: For automated actions (alert enrichment, IOC blocking), create Logic App playbooks. Test with mock alerts (dry run) to ensure correct API calls. Tuning & Baseline: After initial alerts, tune queries (thresholds, whitelists) to reduce noise. Maintain suppression lists (e.g. internal pentest IPs). Use the MITRE mapping in rule details for clarity. Documentation & Training: Document field mappings (e.g. OneLoginEvents fields), and train SOC staff on new TI-enriched alert fields. Connectors Comparison Connector Data Sources Sent. Tables Update Freq. Auth Method Key Fields Enriched Limits/Cost Pros/Cons GreyNoise IP intelligence (scanners) ThreatIntelligenceIndicator Daily (scheduled pull) API Key IP classification, noise, classification API key required; paid license for large usage Pros: Filters benign scans, broad scan visibility Con: Only IP-based (no domain/file). Team Cymru Scout Global IP/domain telemetry Cymru_Scout_*_CL (custom tables) On-demand or daily Account credentials Detailed IP/domain context (ports, PDNS, ASN, etc.) Requires Team Cymru subscription. Potentially high cost for feed. Pros: Rich context (open ports, DNS, certs); great for IOC enrichment. Con: Complex setup, data in custom tables only. OneLogin IAM OneLogin user/auth logs OneLoginEventsV2_CL, OneLoginUsersV2_CL Polls hourly OAuth2 (client ID/secret) User, app, IP, event type (login, MFA, etc.) OneLogin API: 5K calls/hour. Data volume moderate. Pros: Direct insight into cloud identity use; built-in parser available. Cons: Limited to OneLogin environment only. PingOne Audit PingOne audit logs PingOne_AuditActivitiesV2_CL Polls hourly OAuth2 (client ID/secret) User actions, admin events, MFA logs Rate limited by Ping license. Data volume moderate. Pros: Captures critical identity events; widely used product. Cons: Requires PingOne Advanced license for audit logs. Keeper (custom) Keeper security events KeeperLogs_CL (custom) Push (continuous) OAuth2 (client ID/secret) + Azure DCR Vault logins, record accesses, admin changes None (push model); storage costs. Pros: Visibility into password vault activity (often blind spot). Cons: More manual setup; custom logs not parsed by default. Data Flow Diagram This flowchart shows GreyNoise (GN) feeding the Threat Intelligence table, Team Cymru feeding enrichment tables, and identity sources pushing logs. All data converges into Sentinel, where enrichment lookups inform analytics and automated responses.269Views8likes0Comments