Sep 13 2020 10:50 PM
Hi - I would like to detect anomalies across multiple fields that are not numeric (e.g. looking for unusual Azure AD sign-in events using source IP, app name, account name, client name). As far as I can tell from my reading, Sentinel/Kusto has time series analysis capabilities and can easily detect anomalies, but only on a single continuous numeric field.
What I'm looking for is a way to perform anomaly detection when the event data is categorical (IP addresses, account names) rather than numeric. Splunk has a really convenient "anomalydetection" function that takes a list of fields, computes the probability of each combination of fields in the source data, and filters to only the most unlikely events. This is exactly what I am after, but I can't figure out how to do it in Sentinel. Any pointers / guides?
Sep 17 2020 12:02 PM
Sep 17 2020 03:35 PM
Sep 19 2020 12:09 PM
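@mrboxx you can create a baseline from historical data and compare the last 1d of data against it by using a join. There are several ways to accomplish this. Below is one example, which flags sign-ins from a country or region that a known identity has not used during the baseline period: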
// Logic: create a baseline by using data from 15 days ago until 1 day ago.
// compare the last 1d of data with the baseline
let startdate=15d;
let enddate=1d;
let baseline = materialize ( SigninLogs
| where TimeGenerated between ( ago(startdate) .. ago(enddate))
| where OperationName == "Sign-in activity"
| extend countryOrRegion_ = tostring(LocationDetails.countryOrRegion)
//| summarize Country_=make_set(countryOrRegion_) by Identity, bin(TimeGenerated, 1d)
| summarize max(TimeGenerated) by Identity, countryOrRegion_, bin(TimeGenerated,1d)
);
let countries_by_identity = baseline
| summarize previous_countries=make_set(countryOrRegion_) by Identity;
// One row per identity seen in the baseline; used below as the right-hand side of in~.
let existing_users = baseline
| distinct Identity;
SigninLogs
| where TimeGenerated > ago(1d)
| where OperationName == "Sign-in activity"
| where Identity in~ (existing_users) // to remove the false positive where an identity is first seen.
| extend countryOrRegion_ = tostring(LocationDetails.countryOrRegion)
| summarize LastSigninActivity=max(TimeGenerated) by Identity, countryOrRegion_
| join kind=leftanti baseline on Identity, countryOrRegion_
| join kind=inner countries_by_identity on Identity
| project-away Identity1
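
If you want something closer to the "probability of each combination" idea from the original question, a rough sketch is to count how often each combination of fields occurs and keep only the rarest ones. This is only a simple joint-frequency measure, not a reimplementation of Splunk's command. The choice of fields (Identity, IPAddress, AppDisplayName, ClientAppUsed), the 14d lookback and the rarity_threshold value are all assumptions you would need to tune for your own data:

```kusto
// Sketch only: score each combination of categorical fields by its relative frequency,
// then keep the combinations that fall under a cut-off.
let lookback = 14d;               // assumed lookback window
let rarity_threshold = 0.001;     // hypothetical cut-off: under 0.1% of all events
let events = materialize (
    SigninLogs
    | where TimeGenerated > ago(lookback)
    | project TimeGenerated, Identity, IPAddress, AppDisplayName, ClientAppUsed
);
// Total number of events in the window, used as the denominator.
let total = toscalar(events | count);
events
| summarize EventCount = count(), LastSeen = max(TimeGenerated)
    by Identity, IPAddress, AppDisplayName, ClientAppUsed
| extend Probability = todouble(EventCount) / total
| where Probability < rarity_threshold
| order by Probability asc
```

Expect this to be noisy with high-cardinality fields such as IPAddress, since most combinations will be rare. Dropping a field or raising the threshold changes how many rows come back.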