Writing alert rules using KQL is powerful but does not have to be complex. A good example is rules that, in a traditional SIEM, rely on Active Lists (or Reference Sets, depending on your SIEM). When presenting KQL rules, I am often asked how to implement Active Lists in Sentinel. While replicating Active Lists in Sentinel is a good topic for another blog post, I will focus here on avoiding Active Lists in the first place by using Sentinel query-based rules. In most cases, you just don't need them.
This post is part of a series of blog posts on writing rules in Azure Sentinel:
The "Failed AzureAD logons but success logon to AWS Console" rule
To avoid a theoretical discussion, let's focus on a specific example. I picked this rule, written by @petebrayne from our research team, because it is a powerful use case yet simple enough for demonstration purposes. The rule is also built into Azure Sentinel.
The rule looks for:
An IP address from which there were many failed logins to AAD, presumably a brute force attempt.
A successful login to AWS console from the same IP, implying a potential breach.
Traditional Active List implementation
In a traditional SIEM, this use case would usually require two rules and an Active List (for example, see this ArcSight implementation of a similar brute force rule, or the official ArcSight guide):
The first identifies failed AAD logins and updates the count of failed logins for an IP in an Active List.
The second identifies a successful AWS console login and checks whether the IP address appears in the Active List with a count above a threshold.
This approach works, but it is far from trivial and is hard to maintain. It requires two unrelated rules and a separate Active List object. The time frame for the use case is determined by the Active List record TTL rather than by any rule property. Make the scenario slightly more complex, for example by adding a third event condition, and it becomes unmanageable.
Azure Sentinel make_list()
Let's look at Pete's rule. The first part takes full advantage of the fact that query-based rules can look back to create the list on demand without requiring a second rule and an intermediate object:
let signin_threshold = 5;
let suspicious_signins =
SigninLogs
| where TimeGenerated >= ago(1d)
| where ResultType !in ("0", "50125", "50140")
| where IPAddress != "127.0.0.1"
| summarize count() by IPAddress
| where count_ > signin_threshold
| summarize make_list(IPAddress);
What is this section doing?
The initial "where" clauses select the relevant failure events, much as an ArcSight filter would. Note that the time frame for the failed events is conveniently included in these clauses using TimeGenerated >= ago(1d).
The first "summarize" statement counts the number of failed logins for each IP address, and the following "where" clause selects only those above a threshold. Note that the threshold can be dynamic, implementing behavioral analytics using the technique described here. A fixed threshold is often just as good and certainly easier for learning.
The second "summarize" creates the list. Using two summarize statements is an optimization that shaves execution time: the second "summarize" is more costly, but it runs only on the above-threshold IP addresses.
Pretty straightforward, isn't it? We just created a list!
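To try the list-building steps without real sign-in data, the same pipeline can be run against a small inline table. This is only a sketch under stated assumptions: SampleSignins is a hypothetical name, its rows are invented, the threshold is lowered to 2 so that the tiny sample produces a hit, and the TimeGenerated filter is left out because the sample has no timestamps.

```kusto
// Hypothetical sample data standing in for SigninLogs; every row is invented.
let signin_threshold = 2;   // lowered from 5 only to fit the small sample
let SampleSignins = datatable(IPAddress: string, ResultType: string)
[
    "203.0.113.7",  "50126",   // failed sign-in
    "203.0.113.7",  "50126",
    "203.0.113.7",  "50126",
    "198.51.100.9", "50126",   // a single failure, stays below the threshold
    "198.51.100.9", "0",       // "0" is a successful sign-in
    "127.0.0.1",    "50126"    // excluded by the IPAddress filter
];
SampleSignins
| where ResultType !in ("0", "50125", "50140")   // same exclusions as the rule
| where IPAddress != "127.0.0.1"
| summarize count() by IPAddress                 // failures per IP address
| where count_ > signin_threshold                // keep only the noisy IPs
| summarize make_list(IPAddress)                 // collapse them into one list
```

With this sample, only 203.0.113.7 has more than two failures, so the result should be a single row holding a one-element list.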
Wrapping up the rule
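The second part of the rule only has to identify successful AWS console logins (the "where" clauses on EventName and LoginResult) and then check whether the source IP address is in the list from step 1 (the "where SourceIpAddress in (suspicious_signins)" clause). The rest (the remaining "extend" and "project" statements) only enriches the information presented to the analyst.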
AWSCloudTrail
| where TimeGenerated > ago(1d)
| where EventName == "ConsoleLogin"
| extend LoginResult = tostring(parse_json(ResponseElements).ConsoleLogin)
| where LoginResult == "Success"
| where SourceIpAddress in (suspicious_signins)
| extend Reason = "Multiple failed AAD logins from IP address"
| extend MFAUsed = tostring(parse_json(AdditionalEventData).MFAUsed)
| extend User = iif(UserIdentityUserName == "", UserIdentityType, UserIdentityUserName)
| project TimeGenerated, Reason, LoginResult, EventTypeName, UserIdentityType, User, AWSRegion, SourceIpAddress, UserAgent, MFAUsed
| extend timestamp = TimeGenerated, AccountCustomEntity = User, IPCustomEntity = SourceIpAddress
Simple, isn't it? Since the Azure Sentinel rule does not depend on a state machine, it is easier to build, test, and maintain.
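The matching step can also be tried in isolation. The sketch below is illustrative only: SampleCloudTrail is a hypothetical table with invented rows, and suspicious_signins is hard-coded as a dynamic array that stands in for the list produced by the first part of the rule.

```kusto
// Stand-in for the list built by the first part of the rule (assumed value).
let suspicious_signins = dynamic(["203.0.113.7"]);
// Hypothetical sample data standing in for AWSCloudTrail; every row is invented.
let SampleCloudTrail = datatable(EventName: string, SourceIpAddress: string, ResponseElements: string)
[
    "ConsoleLogin", "203.0.113.7",  '{"ConsoleLogin":"Success"}',   // should match
    "ConsoleLogin", "203.0.113.7",  '{"ConsoleLogin":"Failure"}',   // not a success
    "ConsoleLogin", "198.51.100.9", '{"ConsoleLogin":"Success"}',   // IP not in the list
    "DescribeInstances", "203.0.113.7", '{}'                        // not a console login
];
SampleCloudTrail
| where EventName == "ConsoleLogin"
| extend LoginResult = tostring(parse_json(ResponseElements).ConsoleLogin)
| where LoginResult == "Success"
| where SourceIpAddress in (suspicious_signins)
| project SourceIpAddress, LoginResult
```

Only the first sample row should pass all three filters. Because both halves of the rule are plain queries, each can be tested this way on its own before they are combined.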
Next time we will discuss the other use of Active Lists: lookups.