Forum Discussion
Browtastic670
Oct 27, 2022 · Copper Contributor
How to Prevent Duplicate Incidents from Being Generated due to Long Data Look Back
Hey everyone,
We are facing an issue with our Sentinel analytics rules: when we create a rule whose query looks back over a longer period, say the last 14 days, the rule triggers every time it sees the same event within that 14-day window. On each query run it re-detects the event and creates the same incident again (with a different ID).
For example, say event X happens today. The query detects it, the rule generates an incident, and we analyse and finally close that incident. If the query runs every 2 hours, then on its next run, since the rule looks back over the past 14 days, it sees event X again and creates another incident with the same attributes, only with a different incident ID. Alert grouping does not help here, since it does not work on closed alerts. Given that we need the rule to look back 14 days, is there any way to prevent the same incidents from being created on each query run for the same events?
Thank you so much in advance for your kind help.
Step 1:
Let's do a simple query over the past 2hrs to find distinct computers that start with "S" in their name:

```kusto
Heartbeat
| where TimeGenerated between (now(-2h) .. now())
//| summarize min(TimeGenerated), max(TimeGenerated)
| where Computer startswith "S"
| distinct Computer
```
This returns two computers that match:
SQL12.na.contosohotels.com
SQL00.na.contosohotels.com

Step 2:
Now we do the same for the past 14 days, minus the most recent 2hrs:

```kusto
Heartbeat
| where TimeGenerated between (ago(14d) .. ago(2h))
//| summarize min(TimeGenerated), max(TimeGenerated)
| where Computer startswith "S"
| distinct Computer
```

This returns 3 computers:
SQL12.na.contosohotels.com
SQL00.na.contosohotels.com
SQL01.na.contosohotels.com
Step 3 (join it all together, in one query):

```kusto
//
// Look back in the past 2 hours for a list of distinct Computers -
// I only wanted to show named ones that start with an "S" to keep it simple
//
Heartbeat
| where TimeGenerated between (now(-2h) .. now())
//| summarize min(TimeGenerated), max(TimeGenerated)
| where Computer startswith "S"
| distinct Computer
//
// Now do the same for the past 14 days minus the last 2 hours -
// this is key so we don't process the same data!
//
| join kind=rightanti (
    Heartbeat
    | where TimeGenerated between (ago(14d) .. ago(2h))
    //| summarize min(TimeGenerated), max(TimeGenerated)
    | where Computer startswith "S"
    | distinct Computer
) on Computer
```
This returns the single computer that appears in one window but not the other (with `kind=rightanti`, the rows from the right-hand side of the JOIN that have no match on the left):
SQL01.na.contosohotels.com
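A note on the join direction, since it is easy to get backwards: `kind=rightanti` keeps rows from the right-hand side of the join (here the 14-day window) that have no match on the left, while `kind=leftanti` keeps rows from the left-hand side (the 2-hour window) that have no match on the right. For the de-duplication scenario in the question - alert only on entities that are new in the most recent run window - a `leftanti` version of the same sketch would be:

```kusto
// Computers seen in the last 2 hours that were NOT seen
// during the prior 14 days (i.e. genuinely new in this run window)
Heartbeat
| where TimeGenerated between (now(-2h) .. now())
| where Computer startswith "S"
| distinct Computer
| join kind=leftanti (
    Heartbeat
    | where TimeGenerated between (ago(14d) .. ago(2h))
    | where Computer startswith "S"
    | distinct Computer
) on Computer
```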
Click here to see this in action (it works at the time of writing, but as the hours change, the filter I used on "S" may mean it won't be a good demo):
Go to Log Analytics and run query
An alternative would be a similar query, but instead look up the email or IP found in the past two hours against the incidents/alerts in the SecurityIncident/SecurityAlert tables.
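A minimal sketch of that alternative, assuming your detection yields IP addresses (the `CommonSecurityLog`/`DestinationIP` source here is just an illustration) and that the matched entity appears as an IP entity in the `SecurityAlert` `Entities` payload - suppress any result whose entity already appears in an alert raised during the look-back period:

```kusto
// Entities detected in the current run window (illustrative source)
let recentHits =
    CommonSecurityLog
    | where TimeGenerated > ago(2h)
    | distinct IPAddress = DestinationIP;
recentHits
// Drop anything we already alerted on in the past 14 days
| join kind=leftanti (
    SecurityAlert
    | where TimeGenerated > ago(14d)
    | extend Entities = todynamic(Entities)
    | mv-expand Entities
    | extend IPAddress = tostring(Entities.Address)  // "Address" field of IP entities
    | where isnotempty(IPAddress)
    | distinct IPAddress
) on IPAddress
```

The advantage over comparing raw events is that this suppresses re-detections even when the underlying event keeps appearing in the source table, because the comparison is against what was actually alerted on.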
- SocInABox (Iron Contributor)
One suggestion would be to add those entities to a watchlist and just exclude watchlist entries in your query.
And you could add a timestamp column to the watchlist to filter out entries that are x days old.
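A sketch of that watchlist approach, assuming a watchlist named `AlertedEntities` with columns `Entity` and `AddedDate` (both names are hypothetical - use whatever your watchlist schema defines):

```kusto
// Entities recorded in the watchlist within the last 14 days
let alreadyAlerted =
    _GetWatchlist('AlertedEntities')
    | where todatetime(AddedDate) > ago(14d)  // ignore stale entries
    | project Entity = tostring(Entity);
// Current run window, minus anything already on the watchlist
Heartbeat
| where TimeGenerated > ago(2h)
| distinct Computer
| where Computer !in (alreadyAlerted)
```

You would still need something (a playbook, for example) to append newly alerted entities to the watchlist, but the query side stays simple.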
- Clive_Watson (Bronze Contributor)
This is a classic look-back query. What you will need to do is split the query: part 1 looks at the past 2hrs, or whatever the rule frequency is set to. Then, if that gets a result, compare it against the 14 days minus the look-back period, which will remove the duplicates.
- Browtastic670 (Copper Contributor)
Hey Clive!
Thank you so much for your response.
The method that you are proposing seems very interesting (and a bit confusing) to us. How exactly do we go about implementing it?
For reference, we have our original query below (table names are placeholders for our custom tables):

```kusto
ThreatIntel_Table
| summarize by indicator_s
| join kind=rightsemi (
    Web_Traffic_Table
    | summarize by User=Email_s, Action_s, DestinationIP_s, HTTPMethod_s,
                  HTTPHost_s, URL_s, Referer_s, UserAgent_s,
                  DownloadedFileNames_s, UploadedFileNames_s
) on $left.indicator_s == $right.HTTPHost_s
```
What we are trying to do here is correlate any matching threat intel indicators with the web browsing logs of our users.
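Applying the split-window idea to the query above might look roughly like this - a sketch, not tested against your schema, assuming the threat intel table is called `ThreatIntel_Table` and that `Web_Traffic_Table` has a `TimeGenerated` column. It keeps only recent matches whose `HTTPHost_s` did not already match during the earlier part of the look-back window:

```kusto
let indicators = ThreatIntel_Table | summarize by indicator_s;
// Matches from the most recent run window (2h, matching the rule frequency)
indicators
| join kind=rightsemi (
    Web_Traffic_Table
    | where TimeGenerated between (now(-2h) .. now())
    | summarize by User=Email_s, Action_s, DestinationIP_s, HTTPMethod_s,
                  HTTPHost_s, URL_s, Referer_s, UserAgent_s,
                  DownloadedFileNames_s, UploadedFileNames_s
) on $left.indicator_s == $right.HTTPHost_s
// Drop hosts that already produced a match in the earlier 14-day window
| join kind=leftanti (
    indicators
    | join kind=rightsemi (
        Web_Traffic_Table
        | where TimeGenerated between (ago(14d) .. ago(2h))
        | summarize by HTTPHost_s
    ) on $left.indicator_s == $right.HTTPHost_s
) on HTTPHost_s
```

The outer `leftanti` join is what removes the duplicates: a host that matched an indicator in both windows is excluded, so only first-time matches within the 14-day period produce an alert.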
Another solution that stems from SocInABox's comment is that we can add all the Threat Intel Feeds into a Watchlist instead of a table and that way we won't even have to worry about the time range.
But the issue of duplicate alerts will still remain 😕