Detecting and Alerting on MDE Sensor Health Transitions Using KQL and Logic Apps
Introduction

Maintaining the health of Microsoft Defender for Endpoint (MDE) sensors is essential for continuous security visibility across your virtual machine (VM) infrastructure. When a sensor transitions from an "Active" to an "Inactive" state, the device stops sending telemetry, potentially creating blind spots in your security posture. To proactively address this risk, it's important to detect these transitions promptly and alert your security team for timely remediation.

This guide walks you through a practical approach to automating this process: a Kusto Query Language (KQL) query identifies sensor health state changes, and an Azure Logic App triggers email alerts. By the end, you'll have a fully functional, automated monitoring solution that enhances your security operations with minimal manual effort.

Why Monitoring MDE Sensor Health Transitions is Important

- Ensures Continuous Security Visibility: MDE sensors provide critical telemetry data from endpoints. If a sensor becomes inactive, that device stops reporting, creating a blind spot in your security monitoring.
- Prevents Delayed Threat Detection: Inactive sensors can delay the identification of malicious activity, giving attackers more time to operate undetected within your environment.
- Supports Effective Incident Response: Without telemetry, incident investigations become harder and slower, reducing your ability to respond quickly and accurately to threats.
- Identifies Root Causes Early: Monitoring transitions helps uncover underlying issues such as service disruptions, misconfigurations, or agent failures that may otherwise go unnoticed.
- Closes Security Gaps Proactively: Early detection of inactive sensors allows teams to take corrective action before adversaries exploit the lapse in coverage.
- Enables Automation and Scalability: Using KQL and Logic Apps automates the detection and alerting process, reducing manual effort and ensuring consistent monitoring across large environments.
- Improves Operational Efficiency: Automated alerts reduce the need for manual checks, freeing up security teams to focus on higher-priority tasks.
- Strengthens Overall Security Posture: Proactive monitoring and fast remediation contribute to a more resilient and secure infrastructure.

Prerequisites

- MDE Enabled: Defender for Endpoint must be active and reporting on all relevant devices.
- DeviceInfo Table Streaming: Stream the DeviceInfo table (from the Defender XDR connector) into your Microsoft Sentinel workspace; this is required to run the KQL queries and manage alerts.
- Log Analytics Workspace: To run the KQL query.
- Azure Subscription: Needed to create and manage Logic Apps.
- Permissions: Sufficient RBAC access to Logic Apps, Log Analytics, and email connectors.
- Email Connector Setup: Outlook, SendGrid, or a similar connector must be configured in Logic Apps.
- Basic Knowledge: Familiarity with KQL and Logic App workflows is helpful.

High-level summary of the Logic Apps flow for monitoring MDE sensor health transitions:

1. Trigger: Recurrence. The Logic App starts on a scheduled basis (e.g., hourly, daily, or weekly) using a recurrence trigger.
2. Action: Run KQL Query. Executes a Kusto query against the Log Analytics workspace to detect devices where the MDE sensor transitioned from Active to Inactive in the last 7 days (see the deduplication sketch after this list).
3. Condition (Optional): Check for Results. Optionally checks whether the query returned any results, to avoid sending empty alerts.
4. Action: Send Email Notification. If results are found, an email is sent to the security team with details of the affected devices, using dynamic content from the query output.
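If the recurrence interval is shorter than the query's lookback window, the same device can be re-alerted on every run. Below is a minimal sketch of one way to handle that, assuming a daily schedule: it still scans 7 days of history for context but only alerts on transitions first observed in the last 24 hours. The window values are illustrative; tune them to match your recurrence settings.

```kusto
// Sketch: avoid re-alerting on the same transition when the Logic App runs daily.
DeviceInfo
| where Timestamp >= ago(7d)
| project DeviceName, DeviceId, Timestamp, SensorHealthState
| sort by DeviceId asc, Timestamp asc
| serialize
| extend PrevState = prev(SensorHealthState), PrevDeviceId = prev(DeviceId)
// Only compare consecutive records belonging to the same device
| where PrevDeviceId == DeviceId and PrevState == "Active" and SensorHealthState == "Inactive"
| summarize FirstInactiveTime = min(Timestamp) by DeviceName, DeviceId
// Keep only transitions first seen since the previous daily run
| where FirstInactiveTime >= ago(1d)
```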
Logic Apps Flow

KQL Query to Detect Sensor Transitions

Use the following KQL query in Microsoft Defender XDR or Microsoft Sentinel to identify VMs where the sensor health state changed from Active to Inactive in the last 7 days:

```kusto
DeviceInfo
| where Timestamp >= ago(7d)
| project DeviceName, DeviceId, Timestamp, SensorHealthState
| sort by DeviceId asc, Timestamp asc
| serialize
| extend PrevState = prev(SensorHealthState), PrevDeviceId = prev(DeviceId)
// Guard against comparing consecutive records that belong to different devices
| where PrevDeviceId == DeviceId and PrevState == "Active" and SensorHealthState == "Inactive"
| summarize FirstInactiveTime = min(Timestamp) by DeviceName, DeviceId
| extend DaysInactive = datetime_diff('day', now(), FirstInactiveTime)
| order by FirstInactiveTime desc
```

This KQL query does the following:

- Detects devices whose sensors stopped functioning (changed from Active to Inactive) in the past 7 days.
- Returns the first time this happened for each affected device.
- Calculates how long each device has been inactive (DaysInactive).
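One caveat: a device may transition to Inactive and later recover. If you only want to alert on devices that are still inactive at alert time (an assumption about your use case, not part of the query above), a hedged refinement is to cross-check each flagged device against its latest reported state, for example:

```kusto
// Sketch: list devices whose most recent record is still "Inactive".
DeviceInfo
| where Timestamp >= ago(7d)
| summarize arg_max(Timestamp, SensorHealthState) by DeviceId
| where SensorHealthState == "Inactive"
| project DeviceId, LastSeen = Timestamp
```

This result set could then be joined (kind=inner, on DeviceId) to the transition results so recovered devices drop out of the alert.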
Sample Email for reference

How This Helps the Security Team

- Maintains Endpoint Visibility: Detects when devices stop reporting telemetry, helping prevent blind spots in threat detection.
- Enables Proactive Threat Management: Identifies sensor health issues before they become security incidents, allowing early intervention.
- Reduces Manual Monitoring Effort: Automates the detection and alerting process, freeing up analysts to focus on higher-priority tasks.
- Improves Incident Response Readiness: Ensures all endpoints are actively monitored, which is critical for timely and accurate incident investigations.
- Supports Compliance and Audit Readiness: Demonstrates continuous monitoring and control over endpoint health, which is often required for regulatory compliance.
- Prioritizes Remediation Efforts: Provides a clear list of affected devices, helping teams focus on the most recent or longest-inactive endpoints.
- Integrates with Existing Workflows: Can be extended to trigger ticketing systems, remediation scripts, or SIEM alerts, enhancing operational efficiency.

Conclusion

By combining KQL analytics with Azure Logic Apps, you can automate the detection and notification of sensor health issues in your VM fleet, ensuring continuous security coverage and rapid response to potential risks.

Optimize Azure Log Costs: Split Tables and Use the Auxiliary Tier with DCR

This blog is a continuation of my previous blog, where I discussed saving ingestion costs by splitting logs into multiple tables and opting for the Basic tier. Now that the transformation feature for Auxiliary logs has entered Public Preview, I'll take a deeper dive, showing how to implement transformations to split logs across tables and route some of them to the Auxiliary tier.

A quick refresher: Azure Monitor offers several log plans that customers can opt for depending on their use cases:

- Analytics Logs: Designed for frequent, concurrent access and interactive usage by multiple users. This plan drives the features in Azure Monitor Insights and powers Microsoft Sentinel. It is designed to manage critical and frequently accessed logs optimized for dashboards, alerts, and advanced business queries.
- Basic Logs: Improved to support even richer troubleshooting and incident response with fast queries while saving costs. Now available with a longer retention period and additional KQL operators for aggregation and lookup.
- Auxiliary Logs: A new, inexpensive log plan for ingesting and managing verbose logs needed for auditing and compliance scenarios. These may be queried with KQL on an infrequent basis and used to generate summaries.

The following diagram provides detailed information about the log plans and their use cases. More details about Azure Monitor Logs can be found here: Azure Monitor Logs - Azure Monitor | Microsoft Learn

Note: This blog focuses on switching to Auxiliary logs only. I recommend going through our public documentation for a detailed feature-wise comparison of the log plans, which should help you choose the correct plan.

At this stage, I assume you're aware of the different log tiers Azure Monitor offers and have decided to move high-volume, low-fidelity logs to the Auxiliary tier. Here is the high-level approach we're going to follow:

1. Review the relevant tables and determine which portion of the logs can move to the Auxiliary tier.
2. Create a DCR-based custom table with the same schema as the original table. For example, to split the Syslog table and ingest a portion of it into the Auxiliary tier, create a DCR-based custom table with the same schema as Syslog. Switching the table plan via the UI is not currently possible, so I recommend using a PowerShell script to create the DCR-based custom table.
3. Once the DCR-based custom table is created, implement a DCR transformation to split the table.
4. Configure the total retention period of the Auxiliary table (this is done while creating the table).

Let's get started.

Use Case: In this demo, I'll split the Syslog table and route "Informational" logs to the Auxiliary table.
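Before committing to the split, it can help to gauge how much data would actually move. Here is a rough sizing sketch, assuming (as in this demo) that informational events arrive with SeverityLevel "info":

```kusto
// Sketch: estimate what share of recent Syslog volume would move to the Auxiliary tier.
Syslog
| where TimeGenerated > ago(7d)
| summarize Events = count(),
            ApproxMB = round(sum(estimate_data_size(*)) / 1024.0 / 1024.0, 2)
    by IsInformational = (SeverityLevel == "info")
```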
Creating a DCR-based custom table

Previously a complex task, creating custom tables is now easy, thanks to a PowerShell script by MarkoLauren. Simply input the name of an existing table, and the script creates a DCR-based custom table with the same schema. Let's see it in action:

1. Download the script locally.
2. Update the resourceID details in the script and save it.
3. Upload the updated script to Azure Cloud Shell.
4. Load the file and enter the name of the table whose schema you wish to copy; in my case, the "Syslog" table.
5. Enter the new table name, table type, and total retention period, as shown below.

Note: We highly recommend you review the PowerShell script thoroughly and test it properly before executing it in production. We don't take any responsibility for the script.

As you can see, the Aux_Syslog_CL table has been created. Let's validate it in the Log Analytics workspace under the Tables section.

The next step is to update the Data Collection Rule template to split the logs

Now that the Auxiliary table exists, we implement the transformation logic at the Data Collection Rule level: a transformation that splits the Syslog stream and routes logs with SeverityLevel "info" to the Auxiliary table. Let's see how it works:

1. Browse to the Data Collection Rule blade.
2. Open the DCR for the Syslog table and click Export template > Deploy > Edit Template, as shown below.
3. In the dataFlows section, create two streams to split the logs (see the sketch after these steps):
   - Stream 1 drops the Syslog messages where SeverityLevel is "info" and sends the rest of the logs to the Syslog table.
   - Stream 2 captures all Syslog messages where SeverityLevel is "info" and sends those logs to the Aux_Syslog_CL table.
4. Save and deploy the updated template.
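For reference, here is a minimal sketch of the two transformKql expressions that would sit in the two dataFlows entries. Only the filter logic is shown (the surrounding JSON structure of your DCR will differ), and the "info" literal assumes severities arrive as lowercase strings, as in this demo:

```kusto
// Stream 1 - keep everything except informational messages in the Syslog table:
source
| where SeverityLevel != "info"

// Stream 2 - route informational messages to the Aux_Syslog_CL table:
source
| where SeverityLevel == "info"
```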
Let's see if it works as expected. Browse to Azure > Microsoft Sentinel > Logs, and query the Auxiliary table to confirm that data is being ingested into it. As we can see, the logs where SeverityLevel is "info" are being ingested into the Aux_Syslog_CL table, and the rest of the logs are flowing into the Syslog table. Some nice cost savings are coming your way. Hope this helps!

Using Microsoft Sentinel MCP Server with GitHub Copilot for AI-Powered Threat Hunting

Introduction

This post walks through how to get started with the Microsoft Sentinel MCP Server and showcases a hands-on demo integrating it with Visual Studio Code and GitHub Copilot. Using the MCP server, you can run natural language queries against Microsoft Sentinel's security data lake, enabling faster investigations and simplified threat hunting with tools you already know. This blog includes a real-world prompt you can use in your own environment and highlights the power of AI-assisted security workflows.

What is the Microsoft Sentinel MCP Server?

The Model Context Protocol (MCP) allows AI models to access structured security data in a standard, context-aware way. The Sentinel MCP server connects to your Microsoft Sentinel data lake and enables tools like GitHub Copilot or Security Copilot to:

- Search security data using natural language
- Summarize findings and explain risks
- Build intelligent agents for security operations

Prerequisites

Make sure you have the following in place:

- Onboarded to the Microsoft Sentinel data lake
- Assigned the Security Reader role
- Installed: Visual Studio Code with the GitHub Copilot extension
- (Optional) Security Copilot plugin if building agents

Setting Up the MCP Server in VS Code

Step 1: Add the MCP Server

1. In VS Code, press Ctrl + Shift + P
2. Search for: MCP: Add Server
3. Choose HTTP or Server-Sent Events
4. Enter one of the following MCP endpoints:
   - Data Exploration: https://sentinel.microsoft.com/mcp/data-exploration
   - Agent Creation: https://sentinel.microsoft.com/mcp/security-copilot-agent-creation
5. Give the server a friendly name (e.g., Sentinel MCP Server)
6. Choose whether to apply it to all workspaces or just the current one
7. When prompted, allow authentication using an account with Security Reader access

Step 2: Verify the Connection

1. Open Chat: View > Chat or Ctrl + Alt + I
2. Switch to Agent Mode
3. Click the Configure Tools icon to ensure MCP tools are active

Using GitHub Copilot + Sentinel MCP

Once connected, you can use natural language prompts to pull insights from your Sentinel data lake without writing any KQL.

Demo Prompt: 🔍 "Find the top three users that are at risk and explain why they are at risk."

This prompt is designed to:

- Identify the highest-risk users in your environment
- Explain the reasoning behind each user's risk status
- Help prioritize investigation and response efforts

You can enter this prompt in either the VS Code Chat window (Agent Mode) or the Copilot inline prompt area.

Expected Behavior

The MCP server will:

- Query multiple Microsoft Sentinel sources (Identity Protection, Defender for Identity, sign-in logs)
- Correlate risk events (e.g., risky sign-ins, alerts, anomalies)
- Return a structured response with the top users and risk explanations

Sample Output from My Tenant

Results Found:

- User 1: 233 risk score; 53 failed attempts from suspicious IPs
- User 2: 100% failure rate, indicating service account compromise
- User 3: Admin account under targeted brute-force attack
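For context, the query below is a rough, hand-written KQL approximation of what this prompt asks for. It is only a sketch: it assumes the Entra ID Identity Protection connector is streaming the AADUserRiskEvents table into your workspace, and the MCP server's actual correlation spans more sources than this single table.

```kusto
// Sketch: top three users by volume of identity risk events over the last week.
AADUserRiskEvents
| where TimeGenerated > ago(7d)
| summarize RiskEvents = count(),
            RiskLevels = make_set(RiskLevel),
            RiskTypes = make_set(RiskEventType)
    by UserPrincipalName
| top 3 by RiskEvents desc
```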
This demo shows how the integration of the Microsoft Sentinel MCP Server with GitHub Copilot and VS Code transforms complex security investigations into simple, conversational workflows. By leveraging natural language and AI-driven context, we can surface high-risk users, understand the underlying threats, and take action, all within a familiar development environment and without writing a single line of KQL.

More details here:

- What is Microsoft Sentinel's support for MCP? (preview) - Microsoft Security | Microsoft Learn
- Get started with Microsoft Sentinel MCP server - Microsoft Security | Microsoft Learn
- Data exploration tool collection in Microsoft Sentinel MCP server - Microsoft Security | Microsoft Learn

From Healthy to Unhealthy: Alerting on Defender for Cloud Recommendations with Logic Apps
In today's cloud-first environments, maintaining a strong security posture requires not just visibility but real-time awareness of changes. This blog walks you through a practical solution for monitoring and alerting on Microsoft Defender for Cloud recommendations that transition from Healthy to Unhealthy status. By combining the power of Kusto Query Language (KQL) with the automation capabilities of Azure Logic Apps, you'll learn how to:

- Query historical and current security recommendation states using KQL
- Detect resources whose compliance has degraded over the past 14 days
- Send automatic email alerts when issues are detected
- Customize the email content with HTML tables for easy readability
- Handle edge cases, like sending a "no issues found" email when nothing changes

Whether you're a security engineer, cloud architect, or DevOps practitioner, this solution helps you close the gap between detection and response and ensures that no security regressions go unnoticed.

Prerequisites

Before implementing the monitoring and alerting solution described in this blog, ensure the following prerequisites are met:

1. Microsoft Defender for Cloud is enabled
   - Defender for Cloud must be enabled on the target Azure subscriptions/management group.
   - It should be actively monitoring your resources (VMs, SQL, App Services, etc.).
   - Make sure recommendations are being generated.
2. Continuous export is enabled for security recommendations
   - Continuous export should be configured to send security recommendations to a Log Analytics workspace. This enables you to query historical recommendation state using KQL (a quick validation sketch follows this list).
   - You can configure continuous export by going to: Defender for Cloud → Environment settings → Select Subscription → Continuous Export, then enabling export of Security Recommendations to your chosen Log Analytics workspace.
   - Detailed guidance on setting up continuous export can be found here: Set up continuous export in the Azure portal - Microsoft Defender for Cloud | Microsoft Learn
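Once export is configured, a quick sanity check (a sketch; adjust the lookback to when you enabled export) confirms that recommendation records are landing in the workspace before you build alerting on top of them:

```kusto
// Sketch: verify continuous export is populating the SecurityRecommendation table.
SecurityRecommendation
| where TimeGenerated > ago(1d)
| summarize Records = count(),
            Resources = dcount(AssessedResourceId),
            Latest = max(TimeGenerated)
```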
High-Level Summary of the Automation Flow

This solution provides a fully automated way to track and alert on security posture regressions in Microsoft Defender for Cloud. By integrating KQL queries with Azure Logic Apps, you can stay informed whenever a resource's security recommendation changes from Healthy to Unhealthy. Here's how the flow works:

1. Microsoft Defender for Cloud evaluates Azure resources and generates security recommendations based on best practices and potential vulnerabilities.
2. These recommendations are continuously exported to a Log Analytics workspace, enabling historical analysis over time.
3. A scheduled Logic App runs a KQL query that compares recommendations from ~14 days ago (the baseline) with those from the last 7 days (the current state).
4. If any resources are found to have shifted from Healthy to Unhealthy, the Logic App formats the data into an HTML table and sends an email alert with the affected resource details and recommendation metadata.
5. If no such changes are found, an optional email can be sent stating that all monitored resources remain compliant, providing peace of mind and audit trail coverage.

This approach enables teams to proactively monitor security drift, reduce manual oversight, and ensure timely remediation of emerging security issues.

Logic Apps Flow

This Logic App is scheduled to trigger daily. It runs a KQL query against a Log Analytics workspace to identify resources that have changed from Healthy to Unhealthy status over the past two weeks. If such changes are detected, the results are formatted into an HTML table and emailed to the security team for review and action.

KQL query used here:

```kusto
// Get resources that are currently unhealthy within the last 7 days
let now_unhealthy = SecurityRecommendation
| where TimeGenerated > ago(7d)
| where RecommendationState == "Unhealthy"
// For each resource and recommendation, get the latest record
| summarize arg_max(TimeGenerated, *) by AssessedResourceId, RecommendationDisplayName;
// Get resources that were healthy approximately 14 days ago (between 12 and 14 days ago)
let past_healthy = SecurityRecommendation
| where TimeGenerated between (ago(14d) .. ago(12d))
| where RecommendationState == "Healthy"
// For each resource and recommendation, get the latest record in that time window
| summarize arg_max(TimeGenerated, *) by AssessedResourceId, RecommendationDisplayName;
// Join current unhealthy resources with their healthy state 14 days ago
now_unhealthy
| join kind=inner past_healthy on AssessedResourceId, RecommendationDisplayName
| project
    AssessedResourceId,                      // Unique ID of the assessed resource
    RecommendationDisplayName,               // Name of the security recommendation
    RecommendationSeverity,                  // Severity level of the recommendation
    Description,                             // Description explaining the recommendation
    State_14DaysAgo = RecommendationState1,  // Resource state about 14 days ago (should be "Healthy")
    State_Recent = RecommendationState,      // Current resource state (should be "Unhealthy")
    Timestamp_14DaysAgo = TimeGenerated1,    // Timestamp from ~14 days ago
    Timestamp_Recent = TimeGenerated         // Most recent timestamp
```

Once this Logic App executes successfully, you'll get an email per your configuration. This email includes:

- A brief introduction explaining the situation.
- The number of affected recommendations.
- A formatted HTML table with detailed information:
  - AssessedResourceId: The full Azure resource ID.
  - RecommendationDisplayName: What Defender recommends (e.g., "Enable MFA").
  - Severity: Low, Medium, or High.
  - Description: What the recommendation means and why it matters.
  - State_14DaysAgo: The previous (Healthy) state.
  - State_Recent: The current (Unhealthy) state.
  - Timestamps: When the states were recorded.

Sample Email for reference:

What the Security Team Can Do with It

- Review the Impact: Quickly identify which resources have degraded in security posture, and assess whether the changes are critical (e.g., exposed VMs, missing patching).
- Prioritize Remediation: Use the severity level to triage what needs immediate attention (see the sketch after this list), and assign tasks to the right teams: infrastructure, app owners, etc.
- Correlate with Other Alerts: Cross-check with Microsoft Sentinel, vulnerability scanners, or SIEM rules, and investigate whether these changes are expected, neglected, or malicious.
- Track and Document: Use the email as a record of change in security posture, and log it in ticketing systems (like Jira or ServiceNow) manually or via integration.
- Optional Step: Initiate Remediation Playbooks. Based on the resource type and issue, teams may enable security agents, update configurations, apply missing patches, or isolate the resource (if necessary).
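If alert volume becomes noisy, one optional tweak (a sketch only, not part of the deployment template linked below) is to aggregate the regressions by recommendation and severity so the email can lead with the highest-impact items; adding a filter such as `| where RecommendationSeverity == "High"` before the summarize would restrict alerts to high-severity regressions only.

```kusto
// Sketch: count regressed resources per recommendation, highest impact first.
let now_unhealthy = SecurityRecommendation
    | where TimeGenerated > ago(7d) and RecommendationState == "Unhealthy"
    | summarize arg_max(TimeGenerated, *) by AssessedResourceId, RecommendationDisplayName;
let past_healthy = SecurityRecommendation
    | where TimeGenerated between (ago(14d) .. ago(12d)) and RecommendationState == "Healthy"
    | summarize arg_max(TimeGenerated, *) by AssessedResourceId, RecommendationDisplayName;
now_unhealthy
| join kind=inner past_healthy on AssessedResourceId, RecommendationDisplayName
| summarize AffectedResources = count() by RecommendationDisplayName, RecommendationSeverity
| sort by AffectedResources desc
```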
Automating alerts for resources that go from Healthy to Unhealthy in Defender for Cloud makes life a lot easier for security teams. It helps you catch issues early, act faster, and keep your cloud environment safe without constantly watching dashboards. Give this Logic App a try and see how much smoother your security monitoring and response can be!

Access the JSON deployment file for this Logic App here: Microsoft-Unified-Security-Operations-Platform/Microsoft Defender for Cloud/ResourcesMovingFromHealthytoUnhealthyState/ARMTemplate-HealthytoUnhealthyResources(MDC).json at main · Abhishek-Sharan/Microsoft-Unified-Security-Operations-Platform