
Microsoft Sentinel Blog

Monitoring Microsoft Sentinel Analytical Rules – Push Health Notifications

Sreedhar_Ande
Sep 29, 2021

Microsoft Sentinel Analytical rules help security teams discover threats and anomalous behaviors, helping to ensure full security coverage for your environment.

 

After connecting your data sources to Microsoft Sentinel, the first step is to enable Analytical rules. Each data source comes with built-in, out-of-the-box templates for creating threat detection rules.

 

Analytics rules search for specific events or sets of events across your environment, alert you when certain event thresholds or conditions are reached, generate incidents for SOC to triage and investigate, and respond to threats with automated tracking and remediation processes.

 

Scenario: A scheduled rule failed to execute, or appears with AUTO DISABLED added to the name

It's rare for a scheduled query rule to fail to run, but it can happen. As shown in the image below, a customer found several scheduled analytics rules in their environment that had been auto-disabled.

 

 

Microsoft Sentinel classifies failures up front as either transient or permanent, based on the specific type of the failure and the circumstances that led to it.

 

Transient failure

 

A transient failure occurs due to a circumstance which is temporary and will soon return to normal, at which point the rule execution will succeed. Some examples of failures that Microsoft Sentinel classifies as transient are:

  • A rule query takes too long to run and times out.
  • Connectivity issues between data sources and Log Analytics, or between Log Analytics and Microsoft Sentinel.
  • Any other new and unknown failure is considered transient.

In the event of a transient failure, Microsoft Sentinel continues trying to execute the rule again after predetermined and ever-increasing intervals, up to a point. After that, the rule will run again only at its next scheduled time. A rule will never be auto-disabled due to a transient failure.
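The retry behavior described above can be pictured as a backoff schedule. The sketch below is illustrative only: this post does not state the real intervals or the retry limit, so `base_minutes`, `factor`, and `max_retries` are assumed values, and `retry_delays` is a hypothetical helper, not a Microsoft Sentinel API.

```python
from typing import List


def retry_delays(base_minutes: float = 1.0, factor: float = 2.0, max_retries: int = 5) -> List[float]:
    """Return ever-increasing retry delays in minutes (all defaults are assumptions)."""
    # Each retry waits `factor` times longer than the previous one, up to max_retries.
    return [base_minutes * factor ** attempt for attempt in range(max_retries)]
```

With these assumed defaults, the delays grow as 1, 2, 4, 8, and 16 minutes. After the last retry, the rule would simply wait for its next scheduled run.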

 

Permanent failure

 

A permanent failure occurs due to a change in the conditions that allow the rule to run, which without human intervention will not return to their former status. The following are some examples of failures that are classified as permanent:

  • The target workspace (on which the rule query operated) has been deleted.
  • The target table (on which the rule query operated) has been deleted.
  • Microsoft Sentinel had been removed from the target workspace.
  • A function used by the rule query is no longer valid; it has been either modified or removed.
  • Permissions to one of the data sources of the rule query were changed.
  • One of the data sources of the rule query was deleted or disconnected.

In the event of a predetermined number of consecutive permanent failures, of the same type and on the same rule, Microsoft Sentinel stops trying to execute the rule, and takes the following steps:

  • Disables the rule.
  • Adds the words "AUTO DISABLED" to the beginning of the rule's name.
  • Adds the reason for the failure (and the disabling) to the rule's description.
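The steps above can be sketched as a small state update. This is a simplified model, not Microsoft Sentinel's implementation: the `Rule` class and `record_permanent_failure` function are hypothetical, and `FAILURE_THRESHOLD` is an assumed value because the post only says "a predetermined number".

```python
from dataclasses import dataclass
from typing import Optional

FAILURE_THRESHOLD = 3  # assumption: the real threshold is not given in this post


@dataclass
class Rule:
    name: str
    description: str = ""
    enabled: bool = True
    last_failure: Optional[str] = None
    consecutive_failures: int = 0


def record_permanent_failure(rule: Rule, reason: str) -> Rule:
    """Count consecutive permanent failures of the same type; auto-disable at the threshold."""
    if not rule.enabled:
        return rule
    if reason == rule.last_failure:
        rule.consecutive_failures += 1
    else:
        # A different failure type restarts the count.
        rule.last_failure = reason
        rule.consecutive_failures = 1
    if rule.consecutive_failures >= FAILURE_THRESHOLD:
        rule.enabled = False                                  # disable the rule
        rule.name = f"AUTO DISABLED {rule.name}"              # prefix the name
        rule.description = f"{rule.description} (Auto-disabled: {reason})".strip()  # record the reason
    return rule
```

The model omits resetting the counter after a successful run, which a fuller version would need.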

It's rare for a scheduled query rule to be auto-disabled, but it can happen. When it does, the SOC's ability to triage, investigate, and respond to threats is affected in the following ways:

  • Alerts/Incidents will not be generated.
  • Automated threat responses (Automation Rules/Playbooks) for your rules will not be triggered.

As of today, SOC managers and analysts have to check the rule list manually and regularly for auto-disabled rules. There is no easy way to detect them automatically.

 

There is a need for a solution that notifies SOC managers and analysts when a scheduled analytics rule has been auto-disabled. This blog details a playbook that checks Microsoft Sentinel analytics rules periodically and notifies the SOC team by email or Teams post as soon as it finds a rule that has been auto-disabled.
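The core check behind such a notification can be sketched in a few lines. This is not the playbook's actual logic, which is not shown in this post. It assumes each rule is available as a dictionary with `displayName` and `enabled` keys (an assumed shape), and `find_auto_disabled` is a hypothetical helper.

```python
from typing import Dict, List

AUTO_DISABLED_PREFIX = "AUTO DISABLED"


def find_auto_disabled(rules: List[Dict[str, object]]) -> List[str]:
    """Return display names of disabled rules whose name starts with AUTO DISABLED."""
    return [
        str(rule["displayName"])
        for rule in rules
        if not rule.get("enabled", True)
        and str(rule.get("displayName", "")).upper().startswith(AUTO_DISABLED_PREFIX)
    ]
```

Checking both the disabled state and the name prefix avoids flagging rules that an analyst disabled on purpose.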

Deployment

This section explains how to use the ARM template to deploy the playbook so that you get notified when a Microsoft Sentinel Analytic rule gets auto-disabled.

To access the ARM template, navigate to this Playbook

  1. Click the Deploy to Azure/Deploy to Azure Gov button.
  2. Enter values for the following parameters.
    • "Microsoft Sentinel Workspace Name": Azure Log Analytics Workspace Name
    • "Microsoft Sentinel Workspace Resource Group": Microsoft Sentinel Workspace Resource Group Name
    • "Mailing List": Email IDs separated by semicolons (;)
    • "Teams Id": Microsoft Teams Id
    • "Channel Id": Microsoft Teams Channel Id
  3. Click "Review & Create" and, after successful validation, click "Create".
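Since the "Mailing List" parameter takes semicolon-separated addresses, it can help to check the value before deploying. The helper below is a hypothetical convenience, not part of the ARM template, and the addresses in the usage note are placeholders.

```python
from typing import List


def parse_mailing_list(raw: str) -> List[str]:
    """Split a semicolon-separated address string, dropping blanks and surrounding whitespace."""
    return [part.strip() for part in raw.split(";") if part.strip()]
```

For example, `parse_mailing_list("soc@example.com; lead@example.com;")` yields two addresses, which makes a stray trailing semicolon or extra space easy to spot.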

Playbook Components

This section explains trigger and actions inside the workflow:

Post Deployment

This section explains steps to perform after successful deployment:

1. Authorize API Connections - used to connect Logic Apps to SaaS services, such as Office 365 & Teams

2. This playbook uses a managed identity, which is granted permissions through Azure role-based access control (Azure RBAC). The managed identity is authenticated with Azure AD, so you don't have to store any credentials in code.

 

 

Video Tutorial

 

 

Conclusion

With this playbook, security teams can discover auto-disabled rules around the clock. It provides near real-time visibility through email or Teams notifications. This is handy for monitoring the health of Microsoft Sentinel Analytical rules and for avoiding interruptions in threat discovery, anomaly detection, and remediation across your connected data sources and logs. Try it out, and let us know what you think!

 

Thanks to @Yuri Diogenes and @Cristhofer Munoz for their input into this blog post.

Updated Nov 02, 2021
Version 8.0
  • gernotb The playbook triggers every 30 minutes; you can change this schedule in the playbook if you want. If an Analytical rule gets "Auto Disabled", the playbook sends a notification.

    To monitor Analytical rule query failures, there is a preview of Sentinel health, which monitors the health of your Sentinel resources such as Data Connectors, Analytic Rules, and more.


    KQL Query:
    SentinelHealth
    | where TimeGenerated > ago(30d)
    | where SentinelResourceType == "Analytics Rule"
    | where Status !~ "Success"
    | where Status !~ "PartialSuccess"
    | where Status !~ "Informational"

     

    Please let me know if you need any additional information.

     

    Thanks

    Sreedhar Ande

  • gernotb

    Sreedhar_Ande Thank you for the post. We ran into this issue and spent a long time trying to find any supporting log data in Azure Sentinel on why this happens and how we can monitor it.

    I have set up a test environment to try this playbook, in which I removed access to the Azure Sentinel workspace used in my Analytical queries, but 15 hours later there is still no sign of any "Auto Disabled" rules.

    I am seeing permission errors when trying to validate the query, which matches what is listed under permanent failure conditions.

    Could you let me know what the timeframe or threshold is for it to trigger?

    Is there no other way to see these query failures and changes in any audit log?

     

    Monitoring Analytical Rule queries for failure is critical for everyone, especially service providers.

     

    Thanks

     

    Gernot