SOLVED

Data Connector - Analytics Rule

Copper Contributor

Hi everyone,

 

I want an analytics rule or automation rule that raises an alert in Sentinel every time a certain connector (e.g. some firewall connector) goes down.

I've been searching for alternatives, but so far I can't find anything that I can get working in my organization.

Does anyone have a suggestion, ideally something you have implemented before and that is working right now?

 

Thank you.

18 Replies
best response confirmed by miguelfac (Copper Contributor)
Solution

@miguelfac 

 

There are lots of scenarios for this. The most common solution is to monitor for a time delay: if there is no data in, say, 15 minutes, the connector is probably down. However, the source may simply have had nothing to send in that period, so you may also have to check the same period the day or week before to see whether the silence is unusual. You may need different thresholds for each connector/table, so a watchlist can help.
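A minimal sketch of that silence check with per-source thresholds. The vendor names and limits in the inline table are placeholders, not values from this thread; in practice the table could be read from a watchlist with _GetWatchlist() instead of datatable.

```kql
// Hypothetical per-vendor silence limits; replace with your own sources.
let thresholds = datatable(DeviceVendor:string, MaxSilence:timespan)
[
    "VendorA", timespan(15m),
    "VendorB", timespan(1h)
];
thresholds
| join kind=leftouter (
    CommonSecurityLog
    | where TimeGenerated > ago(1d)
    | summarize LastEvent = max(TimeGenerated) by DeviceVendor
) on DeviceVendor
// Keep vendors that sent nothing in the last day, or whose newest event is too old.
| where isnull(LastEvent) or now() - LastEvent > MaxSilence
| project DeviceVendor, LastEvent, MaxSilence
```

Because the query only returns rows for silent sources, the rule can trigger when the number of results is greater than 0.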
Anomaly detection can help here as well; look at series_decompose_anomalies(). However, in a rule you are limited to a 14-day lookback, which often isn't enough to detect seasonal patterns.
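As a rough sketch of the anomaly approach, assuming hourly buckets over the full 14 days and the function's default threshold of 1.5 (both are choices made for illustration, not tuned values):

```kql
// Hourly event counts over the 14-day window a scheduled rule allows.
CommonSecurityLog
| where TimeGenerated > ago(14d)
| make-series Events = count() default = 0 on TimeGenerated from ago(14d) to now() step 1h
// -1 asks the function to auto-detect seasonality; 'linefit' removes a linear trend.
| extend (Flag, Score, Baseline) = series_decompose_anomalies(Events, 1.5, -1, 'linefit')
| mv-expand TimeGenerated to typeof(datetime), Events to typeof(long), Flag to typeof(double), Baseline to typeof(double)
// Look only at the most recent hour and keep it if the volume is anomalously low (-1).
| top 1 by TimeGenerated desc
| where Flag == -1
| project TimeGenerated, Events, Baseline
```

Ingestion delay can make the newest bucket look low, so it may be worth evaluating a slightly older bucket instead.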
 

If the data is from Syslog/CommonSecurityLog, you may actually want to monitor the log collector server(s) using the Heartbeat table. If, for example, one server out of four fails, you still have 75% of capacity online, and Heartbeat shows the failure; if you only monitored the connector/table, all four would have to fail (or stop sending data) before you noticed.
There are some basic examples in the Queries pane for Heartbeat. 
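A sketch of a Heartbeat check along those lines. The collector host names and the 15-minute limit are hypothetical; the Heartbeat table and its Computer column are standard.

```kql
// Hypothetical collector host names; replace with your log forwarders.
let collectors = dynamic(["collector01", "collector02", "collector03", "collector04"]);
Heartbeat
| where TimeGenerated > ago(1d)
| where Computer in~ (collectors)
| summarize LastHeartbeat = max(TimeGenerated) by Computer
// Agents normally report about once a minute, so 15 minutes of silence is suspicious.
| where LastHeartbeat < ago(15m)
```

A collector that has been silent for longer than the 1-day window drops out of the result entirely, so a longer window or a join against the expected host list may be needed.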

Clive_Watson_0-1687385522643.png

 



Thank you a lot! I will try what you recommended and get back to you soon 🙂
Did you manage to make it work?

I did it in a different way. I have an analytics rule like this:

CommonSecurityLog
|summarize Events = count()
|where Events ==0

Then I have an automation rule that is triggered by this rule. The automation rule triggers a playbook that sends an email/SMS 🙂

@Kaaamil Not quite like that, still trying to figure it out..

I'm using this query:

let Now = now();
let queryResult = range TimeGenerated from (Now - 1d) to (Now - 4h) step 4h
| extend Count = 0
| union isfuzzy=true
(CommonSecurityLog
| where DeviceVendor == "connector_name_here"
| summarize Count = count() by bin(TimeGenerated, 8h))
| union (
range x from (Now - 1d) to (Now - 4h) step 8h
| project TimeGenerated = x, Count = 0
)
| summarize Count = max(Count) by bin(TimeGenerated, 8h)
| sort by TimeGenerated
| project Value = iff(isnull(Count), 0, Count), Time = TimeGenerated, Legend = "connector_name_here";
queryResult

 

I'm trying something like this, with the alert threshold set to "is equal to 0".

But it isn't working: the connector is returning 0 values and no alert is opened.

@miguelfac 

 

Try this one - very basic but does the work 🙂 

 

{
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "workspace": {
            "type": "String"
        }
    },
    "resources": [
        {
            "id": "[concat(resourceId('Microsoft.OperationalInsights/workspaces/providers', parameters('workspace'), 'Microsoft.SecurityInsights'),'/alertRules/8c6e05a5-26ad-49ae-9cd6-a3e0f9df305b')]",
            "name": "[concat(parameters('workspace'),'/Microsoft.SecurityInsights/8c6e05a5-26ad-49ae-9cd6-a3e0f9df305b')]",
            "type": "Microsoft.OperationalInsights/workspaces/providers/alertRules",
            "kind": "Scheduled",
            "apiVersion": "2022-11-01-preview",
            "properties": {
                "displayName": "No logs from CommonSecuritylog from last 1 hour",
                "description": "Rule triggers when Sentinel doesn't receive commonsecurity logs",
                "severity": "High",
                "enabled": true,
                "query": "CommonSecurityLog\r\n|summarize Events = count()\r\n|where Events ==0",
                "queryFrequency": "PT1H",
                "queryPeriod": "PT1H",
                "triggerOperator": "GreaterThan",
                "triggerThreshold": 0,
                "suppressionDuration": "PT5H",
                "suppressionEnabled": false,
                "startTimeUtc": null,
                "tactics": [],
                "techniques": [],
                "alertRuleTemplateName": null,
                "incidentConfiguration": {
                    "createIncident": true,
                    "groupingConfiguration": {
                        "enabled": false,
                        "reopenClosedIncident": false,
                        "lookbackDuration": "PT5H",
                        "matchingMethod": "AllEntities",
                        "groupByEntities": [],
                        "groupByAlertDetails": [],
                        "groupByCustomDetails": []
                    }
                },
                "eventGroupingSettings": {
                    "aggregationKind": "SingleAlert"
                },
                "alertDetailsOverride": null,
                "customDetails": null,
                "entityMappings": null,
                "sentinelEntitiesMappings": null,
                "templateVersion": null
            }
        }
    ]
}

 

Sorry for the late response, I have been out for a couple of days.
Do I need to put this query in the analytics rule, with the same threshold settings?
It's a ready-made template. Try importing it as a new rule (change the table if yours is different) and check whether it works.
I'm still not getting it 😧 I don't think it's working; at least I'm not receiving anything in Sentinel, even though that connector is returning 0 values. Or did I not deploy it correctly?

@miguelfac 

 

You can import the JSON file:
Import and export Microsoft Sentinel analytics rules | Microsoft Learn

Let me explain this logic in more detail.

I want to check whether the CommonSecurityLog table has no logs:

 

CommonSecurityLog
|summarize Events = count()
|where Events ==0

 

If the query returns no results, CommonSecurityLog received at least one log in the last X amount of time, so nothing is wrong. It returns a row only when the count is 0.

Look how many log entries I have for the last 30 minutes:

Kaaamil_0-1688714673532.png



So let's check whether we have 0 logs for the last 30 minutes:

Kaaamil_1-1688714718068.png

 

Events == 0 is false, so the rule won't be triggered. If it were true, it would mean no logs for the last 30 minutes, and an incident would be triggered 🙂
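The reason this pattern can fire at all is that summarize without a by clause still emits one row when its input is empty. A small sketch that shows this with an empty inline table standing in for a silent CommonSecurityLog (the datatable is illustrative only):

```kql
// An empty table plays the role of a source that has sent nothing.
datatable(TimeGenerated:datetime) []
| summarize Events = count()
// count() over no rows yields a single row with Events = 0, so this filter keeps it.
| where Events == 0
```

With a rule threshold of "greater than 0" results, that single row is what raises the alert. Adding a by clause would change the behaviour: an empty input then produces no rows at all.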

Alright, I just imported it. I'll test whether it generates an alert when my connector (for example the Check Point one) gets 0 values.
I can test this by turning rsyslog off for a few hours.

@miguelfac 

 

I dug this up from when I was a KQL beginner back in 2020. It still works for many of our use cases, though. I added a logging threshold because with some log sources I would still get heartbeats, or something else was "just wrong" with the log source. Alerting on zero logs is easy.

 

I'm not sure if this is elegant or a mess, but it works! 😃

 

let CurrentLog_lookBack = 1h;
let MinimumThresh_lookBack = 1d;
let HistoricalLog_lookBack = 1d;
CommonSecurityLog 
| where DeviceVendor == "YourVendorHere"
// Threshold = 3% of the last day's log count. Raise 0.03 to 0.06 on the line below to push AverageHourlyLogThreshold above normal volume for testing.
| summarize Total24HRcount=countif(TimeGenerated > ago(HistoricalLog_lookBack)), CurrentHRCount=countif(TimeGenerated > ago(CurrentLog_lookBack)), AverageHourlyLogThreshold=countif(TimeGenerated > ago(MinimumThresh_lookBack)) * 0.03
| extend Percentofaverage = iif( CurrentHRCount < AverageHourlyLogThreshold, "Logging has dropped below threshold - Check Log Source", "Logging Normal" )
| extend Code = iif( CurrentHRCount < AverageHourlyLogThreshold, "1", "" )
| project CurrentHRCount, Total24HRcount, Percentofaverage, Code, AverageHourlyLogThreshold



Change "YourVendorHere" to the vendor in your logs. The "Code" column is empty if logs are above the set threshold and "1" if they fall below it. You can use that to generate an alert with a playbook or however you like.



Normal

JBUB_Accelerynt_0-1689187526592.png

Below threshold (I didn't have a sample, so I just changed the threshold for an example)

JBUB_Accelerynt_1-1689187630990.png



Here is the ChatGPT explanation of how it works 😃



1. `let CurrentLog_lookBack = 1h; let MinimumThresh_lookBack = 1d; let HistoricalLog_lookBack = 1d;`: These are variable declarations. The `let` keyword in KQL allows you to create a variable and assign it a value. `CurrentLog_lookBack` is set to 1 hour, `MinimumThresh_lookBack` is set to 1 day, and `HistoricalLog_lookBack` is also set to 1 day. These are used to set the time frames for the queries.

2. `CommonSecurityLog | where DeviceVendor == "YourVendorHere"`: This line queries logs from the `CommonSecurityLog` table, filtering to only include logs where the `DeviceVendor` is "YourVendorHere".

3. `| summarize Total24HRcount=countif(TimeGenerated > ago(HistoricalLog_lookBack)), CurrentHRCount=countif(TimeGenerated > ago(CurrentLog_lookBack)), AverageHourlyLogThreshold=countif(TimeGenerated > ago(MinimumThresh_lookBack)) * 0.03`: This line summarizes the data in a few ways. It gets a count of the logs in the past 24 hours (`Total24HRcount`), the past hour (`CurrentHRCount`), and the average hourly log threshold (`AverageHourlyLogThreshold`), which is calculated as the count of logs over the past day multiplied by 0.03.

4. `| extend Percentofaverage = iif( CurrentHRCount < AverageHourlyLogThreshold, "Logging has dropped below threshold - Check Log Source", "Logging Normal" )`: This line creates a new column (`Percentofaverage`) that contains a message about whether the current hour's log count has dropped below the average hourly log threshold. If it has, the message is "Logging has dropped below threshold - Check Log Source"; otherwise, it's "Logging Normal".

5. `| extend Code = iif( CurrentHRCount < AverageHourlyLogThreshold, "1", "" )`: This line creates another new column (`Code`) that contains "1" if the current hour's log count has dropped below the average hourly log threshold, and an empty string otherwise.

6. `| project CurrentHRCount, Total24HRcount, Percentofaverage, Code, AverageHourlyLogThreshold`: This line limits the output of the query to just the columns specified: `CurrentHRCount`, `Total24HRcount`, `Percentofaverage`, `Code`, and `AverageHourlyLogThreshold`.

In summary, the script checks whether the number of logs from a "YourVendorHere" device in the past hour has fallen below a certain threshold (3% of the number of logs in the past day). If it has, a warning message and code are generated. The final output includes the counts of logs in the past hour and day, the threshold, and the warning message and code.

@JBUB_Accelerynt 

 

Oh, thank you a lot! It looks really nice! And I can just put this code in an analytics rule; I will try it! I just have to figure out what rule threshold I need to set in this rule so that it generates an alert in my SIEM. Do you have an idea?

miguelfac_1-1689246265583.png

 

@miguelfac Thanks!

 

Add this additional line to the query.

| where Code == "1"

 

That makes the query return a result only if Code is "1", which is when your logs are below the threshold.

Then just select "Is greater than" 0 or "Is equal to" 1 for your analytics rule.

@JBUB_Accelerynt 

I made this simpler; that old thing was such a mess lol. This does basically the same thing with the same result: it returns a row if the logs in the past hour fall below 1% of the count for the prior 24-hour window. You can change the percentage from 1% to 5% by changing 0.01 to 0.05 to fit your needs. Have fun!

 

let averageCount = toscalar(
    CommonSecurityLog
    | where TimeGenerated >= ago(24h)
    | summarize count()
);
CommonSecurityLog
| where TimeGenerated >= ago(1h)
| summarize LogCount = count()
| extend isBelowThreshold = iff(LogCount < averageCount * 0.01, 1, 0)
| where isBelowThreshold == 1

 

I see, but in the new query where is the "YourVendor" filter placed?
Under CommonSecurityLog, can I add
| where DeviceVendor == "devicevendor"

@miguelfac 

Yep! Just make sure you add it to both places.

 

 

let averageCount = toscalar(
    CommonSecurityLog
    | where DeviceVendor == "YourVendor"
    | where TimeGenerated >= ago(24h)
    | summarize count()
);
CommonSecurityLog
| where DeviceVendor == "YourVendor"
| where TimeGenerated >= ago(1h)
| summarize LogCount = count()
| extend isBelowThreshold = iff(LogCount < averageCount * 0.01, 1, 0)
| where isBelowThreshold == 1
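One edge case worth a sketch: if the source has been silent for the whole 24 hours, the baseline is 0, and 0 < 0 * 0.01 is false, so the query above returns nothing. A possible variant, assuming you also want an alert in that case (dailyCount is just a renamed averageCount):

```kql
let dailyCount = toscalar(
    CommonSecurityLog
    | where DeviceVendor == "YourVendor"
    | where TimeGenerated >= ago(24h)
    | summarize count()
);
CommonSecurityLog
| where DeviceVendor == "YourVendor"
| where TimeGenerated >= ago(1h)
| summarize LogCount = count()
// Alert on total silence in the last hour as well as on a drop below 1% of the daily count.
| where LogCount == 0 or LogCount < dailyCount * 0.01
```

For either version, the rule's lookup period most likely needs to be at least one day, since the rule limits how far back the ago(24h) filter can actually see.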

 

 

Alright, thank you a lot for your input! I'll add the analytics rule, and as soon as I have news I'll tell you 🙂
Thanks!