Hi AlphaBetaGamma thanks for raising this query. Please see below explanations. I will also make an update to the blog later on to include them in the main body for the benefit of others:
let UpperThreshold = 3 ; //+ or -3 is the suggested number and it indicates a strong anomaly though you can modify it : Outlier - Wikipedia
let LowerThreshold = -3 ;
let TableIgnoreList = dynamic(['SecurityAlert', 'BehaviorAnalytics', 'SecurityBaseline', 'ProtectionStatus']); // select tables you want to EXCLUDE from the results
union withsource=TableName1 *
| make-series count() on TimeGenerated from ago(14d) to now() step 1d by TableName1 //create array based on ingested data from all tables in the workspace across 14 days on a daily basis
| extend (anomalies, score, baseline) = series_decompose_anomalies(count_, 1.5, 7, 'linefit', 1, 'ctukey', 0.01) //take the array of ingested data across the 14 days and extract anomalous points with scores based on predicted values using the linear regression concept. See https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/series-decompose-anomaliesfunction for a detailed explanation of each argument. For an explanation of 'ctukey' read: Outlier - Wikipedia.
in our case the seasonality argument (7) can be left at default i.e. -1 and the output would still be the same
| where anomalies[-1] == 1 or anomalies[-1] == -1 //the output of series_decompose_anomalies function is three things: A ternary (as opposed to binary) series containing (+1, -1, 0) marking up/down/no anomaly respectively, the Anomaly score and the predicted value or baseline.
| extend Score = score[-1] //this picks up the anomaly state from the most recent run. -1 indicates a position in the array. To see exactly how it works I suggest you run parts of the query and look at the output. i.e. up to line 9. In my case one of the runs produced the below output
TableName1 |
anomalies |
score |
Score |
AzureMetrics |
[0,0,0,0,0,0,0,0,0,0,0,0,0,-1] |
[0,0,0,0,0,0,0,0,0,0,0,0,0,-3.9998356305150646] |
-3.99984 |
Syslog |
[0,0,0,0,0,0,0,0,0,0,0,0,0,-1] |
[-0.00036792965707559,0,0,0,0,0,0,1.3549523289867058,0.09720032576946343,0,0,-0.7998683232928704,0,-4.7250625572418405] |
-4.72506 |
AzureActivity |
[0,0,0,0,0,-1,0,0,0,0,0,0,0,1] |
[0.255883471606481,0,0,0,0,-1.8238931029256342,0,-0.21675871267353206,0,0,0,0,1.3739583751967412,4.334262181116097] |
4.334262 |
Operation |
[0,1,0,0,0,0,0,0,-1,0,0,0,0,1] |
[-0.17908658774875946,2.248291860653117,0,0,0,0,0,0,-2.6172533848101915,0,0,0,0.12298717471902801,8.68893601041275] |
8.688936 |
UserAccessAnalytics |
[0,0,0,0,0,1,0,0,0,0,0,0,-1,-1] |
[-0.051628790842371826,0,0,0,0,5.2235777977951106,0,0,0,0,0,0.06069871355792385,-3.710993763386163,-10.209942232260403] |
-10.2099 |
Note: I filtered the columns above so as to be able to display the two score columns and anomalies in this limited space
the -1 is the position in the array counting from the extreme right. The result there will be either 1, 0 or -1
| where Score >= UpperThreshold or Score <= LowerThreshold | where TableName1 !in (TableIgnoreList) //compare with strong anomaly indicator values
| project TableName=TableName1, ExpectedCount=round(todouble(baseline[-1]),1), ActualCount=round(todouble(count_[-1]),1), AnomalyScore = round(todouble(score[-1]),1)