SOLVED

How to Prevent Duplicate Incidents from Being Generated due to Long Data Look Back

Copper Contributor

Hey everyone,

We are facing an issue with regards to our rules on Sentinel and that is when we create a rule and, in its logic, we configure the query to lookup data from longer times, say the last 14 days, this rule is going to get triggered whenever when it sees the same event during that 14 days again and again whenever the query runs, and it is going to create the same incident (with different ID).
For example, the event X has happened today. The query detects it, and the rule generates an incident for it. We then analyse and finally close this incident. If our query runs for example every 2 hours, on the next run, since the rules lookup data from the past 14 days, it again sees the event X and it is going to create another incident with the same attributes for it only with a different incident ID. And the alert grouping does not work here since doesn't work on closed alerts. Since we need the rule to lookup the past 14 days, is there any way to prevent the creation of the same incidents on each query run for the same events?

Thank you so much in advance for your kind help.

5 Replies

@Browtastic670 

One suggestion would be to add those entities to a watchlist and just exclude watchlist entries in your query.

And you could add a timestamp column to the watchlist to filter out entries that are x days old.

 

 

This is a classic look back query. What you will need to do is split the query, part 1 is to look at the past 2hrs or whatever the rule frequency is set to. Then if that gets a result compare to 14days minus the look back period...which will remove the duplicates
Hey Clive!

Thank you so much for your response.

The method that you are proposing seems very interesting (and a bit confusing) to us. How exactly do we go about implementing it?
For reference we have our original query below:
TheatIntel Table
| summarize by indicator_s
| join kind=rightsemi (
Web_Traffic_Table
| summarize by User=Email_s,Action_s,DestinationIP_s,HTTPMethod_s,HTTPHost_s,URL_s,Referer_s,UserAgent_s,DownloadedFileNames_s,UploadedFileNames_s
) on $left.indicator_s==$right.HTTPHost_s

Here what we are trying to do is correlate any matching Threat Intel Feeds with the Web Browsing Logs of our users.

Another solution that stems from @SocInABox's comment is that we can add all the Threat Intel Feeds into a Watchlist instead of a table and that way we won't even have to worry about the time range.

But the issue of duplicate alerts will still remain 😕
best response confirmed by Browtastic670 (Copper Contributor)
Solution

@Browtastic670 

 

Step 1:
Let's do a simple query over the past 2hrs to find distinct computers that start with "S" in their name

 

 

Heartbeat
| where TimeGenerated between (now(-2h) .. now())
//| summarize min(TimeGenerated), max(TimeGenerated)
| where Computer startswith "S"
| distinct Computer

 


This returns two computers that match:
SQL12.na.contosohotels.com
SQL00.na.contosohotels.com


Step 2:
Now we do similar for the past 14days, minus the most recent 2hrs
 

 

    Heartbeat
    | where TimeGenerated between (ago(14d) .. ago(2h))
    //| summarize min(TimeGenerated), max(TimeGenerated)
    | where Computer startswith "S"
    | distinct Computer

 

 This returns 3 computers:

SQL12.na.contosohotels.com
SQL00.na.contosohotels.com
SQL01.na.contosohotels.com

Step 3 (join it all together, in one query) 

 

//
// Look back in the past 2 hours for a list of distinct Computers - 
// I only wanted to show named ones that start with an "S" to keep it simple
//
Heartbeat
| where TimeGenerated between (now(-2h) .. now())
//| summarize min(TimeGenerated), max(TimeGenerated)
| where Computer startswith "S"
| distinct Computer
//
// Now do the same for the past 14days minus the last 2 hours - this is key so we dont process the same data!!! 
//
| join kind=rightanti
 (
    Heartbeat
    | where TimeGenerated between (ago(14d) .. ago(2h))
    //| summarize min(TimeGenerated), max(TimeGenerated)
    | where Computer startswith "S"
    | distinct Computer
 ) on Computer​

 

This returns the single computer that is only found in the last 2hrs but not in the previous 14days, usig a JOIN and rightanti :

 

SQL01.na.contosohotels.com

 


Click here to see this in action (it works at the time of writing but as the hours change, the filter I used on "S" may mean it won't be a good demo). 

Go to Log Analytics and run query



An alternative would be a similar query but lookup the found email or IP in the past two hours in the Incident/Alert in the SecurityIncident/SecurityAlert tables 

1 best response

Accepted Solutions
best response confirmed by Browtastic670 (Copper Contributor)
Solution

@Browtastic670 

 

Step 1:
Let's do a simple query over the past 2hrs to find distinct computers that start with "S" in their name

 

 

Heartbeat
| where TimeGenerated between (now(-2h) .. now())
//| summarize min(TimeGenerated), max(TimeGenerated)
| where Computer startswith "S"
| distinct Computer

 


This returns two computers that match:
SQL12.na.contosohotels.com
SQL00.na.contosohotels.com


Step 2:
Now we do similar for the past 14days, minus the most recent 2hrs
 

 

    Heartbeat
    | where TimeGenerated between (ago(14d) .. ago(2h))
    //| summarize min(TimeGenerated), max(TimeGenerated)
    | where Computer startswith "S"
    | distinct Computer

 

 This returns 3 computers:

SQL12.na.contosohotels.com
SQL00.na.contosohotels.com
SQL01.na.contosohotels.com

Step 3 (join it all together, in one query) 

 

//
// Look back in the past 2 hours for a list of distinct Computers - 
// I only wanted to show named ones that start with an "S" to keep it simple
//
Heartbeat
| where TimeGenerated between (now(-2h) .. now())
//| summarize min(TimeGenerated), max(TimeGenerated)
| where Computer startswith "S"
| distinct Computer
//
// Now do the same for the past 14days minus the last 2 hours - this is key so we dont process the same data!!! 
//
| join kind=rightanti
 (
    Heartbeat
    | where TimeGenerated between (ago(14d) .. ago(2h))
    //| summarize min(TimeGenerated), max(TimeGenerated)
    | where Computer startswith "S"
    | distinct Computer
 ) on Computer​

 

This returns the single computer that is only found in the last 2hrs but not in the previous 14days, usig a JOIN and rightanti :

 

SQL01.na.contosohotels.com

 


Click here to see this in action (it works at the time of writing but as the hours change, the filter I used on "S" may mean it won't be a good demo). 

Go to Log Analytics and run query



An alternative would be a similar query but lookup the found email or IP in the past two hours in the Incident/Alert in the SecurityIncident/SecurityAlert tables 

View solution in original post