Forum Discussion
Data Connector - Analytics Rule
Hi everyone,
I want to have an analytics rule / automation rule so that every time a certain connector (e.g. some firewall connector) is down, I receive an alert in Sentinel.
I've been looking at various alternatives, but so far I can't find anything that I can put to work in my organization.
Does anyone have a suggestion based on something you implemented before that is working right now?
Thank you.
- Clive_Watson (Bronze Contributor)
There are lots of scenarios for this. The most common solution is to monitor for a time delay - so if there is no data in, say, 15 minutes, then it's probably down. However, it could just as easily not have sent any data in that period, so you may also have to check back to the same period the day or week before to see if that is uncommon. You may need different thresholds for each connector/table - so a watchlist can help.
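A minimal sketch of that approach, assuming a watchlist named ConnectorThresholds with TableName and MaxMinutesSilent columns (the watchlist name, its columns, and the example tables are all placeholders to adapt):

let Thresholds = _GetWatchlist('ConnectorThresholds')
    | project TableName = tostring(TableName), MaxMinutesSilent = toint(MaxMinutesSilent);
// Take the freshest record per table and flag tables silent longer than their allowed gap.
// Note: a table with no rows at all inside the rule's query period will not appear in the union.
union withsource = TableName CommonSecurityLog, Syslog, SecurityEvent
| summarize LastRecord = max(TimeGenerated) by TableName
| join kind=inner Thresholds on TableName
| extend MinutesSilent = datetime_diff('minute', now(), LastRecord)
| where MinutesSilent > MaxMinutesSilent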
Anomaly detection can help here as well - look at series_decompose_anomalies(). However, in a rule you are limited to a 14-day lookback, which often isn't enough to detect seasonal patterns.
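For reference, a minimal anomaly-detection sketch along those lines - the table, bin size, and score threshold are just assumptions to tune:

CommonSecurityLog
| where TimeGenerated > ago(14d)
| make-series EventCount = count() default = 0 on TimeGenerated from ago(14d) to now() step 1h
| extend (Anomalies, Score, Baseline) = series_decompose_anomalies(EventCount, 1.5, -1, 'linefit')
| mv-expand TimeGenerated to typeof(datetime), EventCount to typeof(long), Anomalies to typeof(double), Baseline to typeof(double)
| where Anomalies == -1   // -1 marks hours that dip below the expected baseline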
If the data is from Syslog / CommonSecurityLog, you may actually want to monitor the log collector server(s) using the Heartbeat table, so if, for example, one server out of 4 fails, you still have 75% online capacity - if you just monitored the connector/table, then all 4 would have to fail (or not send data).
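Something like this could work for the collector side (the computer names and the 15-minute cutoff are only placeholders):

Heartbeat
| where Computer in ("logcollector01", "logcollector02", "logcollector03", "logcollector04")
| summarize LastHeartbeat = max(TimeGenerated) by Computer
| extend MinutesSinceHeartbeat = datetime_diff('minute', now(), LastHeartbeat)
| where MinutesSinceHeartbeat > 15   // flag any collector silent for more than 15 minutes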
There are some basic examples in the Queries pane for Heartbeat.
- miguelfac (Copper Contributor)
Thank you a lot! I will try the recommended approach and will get back soon 🙂
- Kaaamil (Copper Contributor)
Did you manage to make it work?
I did it in a different way. I have an analytics rule like this:
CommonSecurityLog
| summarize Events = count()
| where Events == 0
Then I have an automation rule that is triggered by this analytics rule. The automation rule runs a playbook that sends an email / SMS 🙂
- JBUB_Accelerynt (Brass Contributor)
I dug this up from when I was a KQL beginner back in 2020. It still works for many of our use cases, though. I made a logging threshold because for some log sources I would still get heartbeats, or something else was "just wrong" with the log source. Alerting on zero logs is easy.
I'm not sure if this is elegant or a mess, but it works! 😃
let CurrentLog_lookBack = 1h;
let MinimumThresh_lookBack = 1d;
let HistoricalLog_lookBack = 1d;
CommonSecurityLog
| where DeviceVendor == "YourVendorHere"
// Change the 0.03 below to 0.06 to raise AverageHourlyLogThreshold and make the alert easier to trigger while testing.
| summarize
    Total24HRcount = countif(TimeGenerated > ago(HistoricalLog_lookBack)),
    CurrentHRCount = countif(TimeGenerated > ago(CurrentLog_lookBack)),
    AverageHourlyLogThreshold = countif(TimeGenerated > ago(MinimumThresh_lookBack)) * 0.03
| extend Percentofaverage = iif(CurrentHRCount < AverageHourlyLogThreshold, "Logging has dropped below threshold - Check Log Source", "Logging Normal")
| extend Code = iif(CurrentHRCount < AverageHourlyLogThreshold, "1", "")
| project CurrentHRCount, Total24HRcount, Percentofaverage, Code, AverageHourlyLogThreshold
Change "YourVendorHere" to your vendor in your logs. The "code" is null if logs are above the set thresh hold and 1 if they fall below. You can use the to generate an alert with a playbook or however you like.
[Screenshot: query result when logging is normal]
[Screenshot: query result below the threshold (I didn't have a sample, so I just changed the threshold for the example)]
Here is the ChatGPT explanation of how it works 😃
1. `let CurrentLog_lookBack = 1h; let MinimumThresh_lookBack = 1d; let HistoricalLog_lookBack = 1d;`: These are variable declarations. The `let` keyword in KQL allows you to create a variable and assign it a value. `CurrentLog_lookBack` is set to 1 hour, `MinimumThresh_lookBack` is set to 1 day, and `HistoricalLog_lookBack` is also set to 1 day. These are used to set the time frames for the queries.
2. `CommonSecurityLog | where DeviceVendor == "YourVendorHere"`: This line queries logs from the `CommonSecurityLog` table, specifically filtering to only include logs where the `DeviceVendor` is "YourVendorHere".
3. `| summarize Total24HRcount=countif(TimeGenerated > ago(HistoricalLog_lookBack)), CurrentHRCount=countif(TimeGenerated > ago(CurrentLog_lookBack)), AverageHourlyLogThreshold=countif(TimeGenerated > ago(MinimumThresh_lookBack)) * 0.03`: This line summarizes the data in a few ways. It gets a count of the logs in the past 24 hours (`Total24HRcount`), the past hour (`CurrentHRCount`), and the average hourly log threshold (`AverageHourlyLogThreshold`), which is calculated as the count of logs over the past day multiplied by 0.03.
4. `| extend Percentofaverage = iif( CurrentHRCount < AverageHourlyLogThreshold, "Logging has dropped below threshold - Check Log Source", "Logging Normal" )`: This line is creating a new column (`Percentofaverage`) that contains a message about whether the current hour's log count has dropped below the average hourly log threshold. If it has, the message is "Logging has dropped below threshold - Check Log Source"; otherwise, it's "Logging Normal".
5. `| extend Code = iif( CurrentHRCount < AverageHourlyLogThreshold, "1", "" )`: This line is creating another new column (`Code`) that contains "1" if the current hour's log count has dropped below the average hourly log threshold, and an empty string otherwise.
6. `| project CurrentHRCount, Total24HRcount, Percentofaverage, Code, AverageHourlyLogThreshold`: This line is limiting the output of the query to just the columns specified: `CurrentHRCount`, `Total24HRcount`, `Percentofaverage`, `Code`, and `AverageHourlyLogThreshold`.
In summary, the script checks whether the number of logs from a "YourVendorHere" device in the past hour has fallen below a certain threshold (3% of the number of logs in the past day). If it has, a warning message and code are generated. The final output includes the counts of logs in the past hour and day, the threshold, and the warning message and code.
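As a rough worked example (the numbers are made up): if the vendor sent 24,000 logs in the past day, the threshold is 24,000 × 0.03 = 720; if the current hour then has fewer than 720 logs, Code is set to "1" and the warning message is emitted.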
- JBUB_Accelerynt (Brass Contributor)
I made this simpler; that old thing was such a mess lol. It does basically the same thing with the same result: it alerts if the logs in the past hour fall below 1% of the prior 24-hour count. You can change the percentage from 1% to 5% by changing the 0.01 to 0.05 to fit your needs. Have fun!
let averageCount = toscalar(
    CommonSecurityLog
    | where TimeGenerated >= ago(24h)
    | summarize count()
);
CommonSecurityLog
| where TimeGenerated >= ago(1h)
| summarize LogCount = count()
| extend isBelowThreshold = iff(LogCount < averageCount * 0.01, 1, 0)
| where isBelowThreshold == 1
- miguelfac (Copper Contributor)
I see, but in the new query, where is the "YourVendorHere" filter placed?
- miguelfac (Copper Contributor)
Oh, thank you a lot! It looks really nice! And I can just put this code in an analytics rule, so I will try it! I just have to figure out what rule threshold I have to set in this analytic so that it generates an alert in my SIEM. Do you have an idea?
- JBUB_Accelerynt (Brass Contributor)
miguelfac Thanks!
Add this additional line to the query.
| where Code == "1"
That makes it so it only returns a result if the code is 1, which is when your logs are below the threshold.
Then just set the alert threshold in your analytics rule to "Is greater than" 0 or "Is equal to" 1.