SOLVED

kql query for distinct values


Hi there,

I'm trying to query all computers that have 2 or more distinct DisplayName values.

I can get the distinct count:

SecurityAlert
| where ProductName in("Microsoft Defender Advanced Threat Protection")
| where ProviderName == "MDATP"
| mv-expand parsejson(Entities)
| extend Computer = tostring(Entities.HostName)
| summarize dcount(DisplayName) by Computer
| where dcount_DisplayName >= 2
| where Computer <> ""

 

But I want a table that lists out the Computer AND all of the unique DisplayNames for each Computer.

eg:
Host1 - DisplayName1
        DisplayName2
Host2 - DisplayName1
        DisplayName2

 

In Splunk this would simply be: 

| stats values(DisplayName) as DisplayName, dc(DisplayName) by host

 

Thanks for your thoughts.

8 Replies

@bobsyouruncle While you can write the code to display the information the way you want using some tricky IF commands, are you sure you want the output that way? If you need to do any sorting, the 2nd line would not sort with the 1st line, as it doesn't have the computer name in it.

If this data is being shown somewhere else, maybe that system could handle removing the host name as needed.

Hi Gary,
I'm not sure from your reply what you don't understand.
I just want to group all values from field 2 based on field 1.
As I've shown, this is a no-brainer in Splunk.
If you have some KQL examples of this, it would be much appreciated.
Thank you.

@bobsyouruncle So do you care if Host shows in rows 1 and 2? If that is not an issue, then after you get your host and your DisplayName, you can concatenate them (using the strcat function) and then perform another distinct on the concatenated string.

 

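SecurityAlert
| where ProductName in("Microsoft Defender Advanced Threat Protection")
| where ProviderName == "MDATP"
| mv-expand parsejson(Entities)
| extend Computer = tostring(Entities.HostName)
| where Computer <> ""
// Keep the set of names alongside the count; summarize drops every column it does not output
| summarize dcount(DisplayName), DisplayNames = make_set(DisplayName) by Computer
| where dcount_DisplayName >= 2
// Expand the set back to one row per Computer/DisplayName pair
| mv-expand DisplayName = DisplayNames to typeof(string)
| extend hostdisplay = strcat(Computer, " - ", DisplayName)
| distinct hostdisplay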

 

Hope this is what you are looking for.

best response confirmed by bobsyouruncle (Contributor)
Solution

@Gary Bushey 

You might also try?

 

SecurityAlert
| where ProductName in("Microsoft Defender Advanced Threat Protection")
| where ProviderName == "MDATP"
| mv-expand parsejson(Entities)
| extend Computer = tostring(Entities.HostName)
| where isnotempty(Computer)
| summarize dcount(DisplayName), make_set(DisplayName) by Computer
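For anyone coming from Splunk, make_set() plays roughly the role of values() and dcount() the role of dc(). A minimal self-contained sketch of the output shape follows; the datatable rows and the alert names in them are made up for illustration and stand in for the expanded SecurityAlert data:

```kql
// Hypothetical sample rows; not real alert data.
datatable(Computer:string, DisplayName:string)
[
    "Host1", "AlertA",
    "Host1", "AlertB",
    "Host1", "AlertA",
    "Host2", "AlertC"
]
// One row per Computer: the distinct count plus an array of the unique names.
// Without explicit names, the output columns are dcount_DisplayName and set_DisplayName.
| summarize dcount(DisplayName), make_set(DisplayName) by Computer
```

This should return one row per host, with the duplicate "AlertA" for Host1 collapsed into a single entry in the set. The order of elements inside the set is not guaranteed.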

@Clive Watson Much better looking code than mine. How would you do the part where the author only wants those computers with at least two distinct DisplayNames? Is it just a matter of assigning the dcount(DisplayName) to a variable and then checking that there are at least 2 after that?
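One way to handle the threshold is to name the aggregates in summarize and filter on them afterwards. A minimal sketch, assuming the same filters as the query above; minDistinct, DistinctNames and DisplayNames are hypothetical names chosen for illustration:

```kql
let minDistinct = 2; // hypothetical threshold; raise it to 3 or more as needed
SecurityAlert
| where ProductName in("Microsoft Defender Advanced Threat Protection")
| where ProviderName == "MDATP"
| mv-expand parsejson(Entities)
| extend Computer = tostring(Entities.HostName)
| where isnotempty(Computer)
// Name both aggregates so the filter below can refer to them
| summarize DistinctNames = dcount(DisplayName), DisplayNames = make_set(DisplayName) by Computer
| where DistinctNames >= minDistinct
| sort by DistinctNames desc
```

Note that dcount() returns an estimate, so if an exact count matters, array_length(DisplayNames) could be used in the filter instead.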

@Clive Watson, @Gary Bushey - THANK YOU! :)

This is incredibly helpful to me for detecting attackers who have used a variety of exploits on a single host.
I see this pattern all the time on WAF, IDS and endpoint, and it's almost always something interesting.
I just have to change the threshold of dcount(DisplayName) to whatever number I like (usually 3 or higher).
If you have more 'threat' detection type queries I'd LOVE to see them.

@bobsyouruncle 

 

You could maybe add some anomaly detection as well?

// https://docs.microsoft.com/en-us/azure/data-explorer/anomaly-detection#time-series-anomaly-detection
// Anomaly scores above 1.5 or below -1.5 indicate a mild anomaly rise or decline respectively. 
// Anomaly scores above 3.0 or below -3.0 indicate a strong anomaly.
SecurityAlert
| where ProductName in("Microsoft Defender Advanced Threat Protection")
| where ProviderName == "MDATP"
| make-series Trend = count() on TimeGenerated from startofday(ago(90d)) to startofday(ago(0d)) step 1d by DisplayName
| extend (anomalies, score, baseline) = series_decompose_anomalies(Trend, 1.5, -1, 'linefit', 1, 'ctukey', 0.6)
| extend expectedEventCounts=baseline, actualEventCount=Trend, Score = score[-1]
| where Score > 1.5 or Score < -1.5 

Just comment out the last line, or alter it to show whatever anomaly level you are happy with. This will probably need some tweaking for your use.
These types of queries display very nicely in an Azure Workbook (taken from my Workspace Usage report, in the Azure Sentinel Workbooks blade and GitHub). [Screenshot 2021-03-23 171117.jpg]
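To see which days were flagged, rather than only the most recent score, the series can be expanded back into one row per day. This is a rough sketch that has not been run against real data; it assumes the same table and filters as the query above, and the column types given to mv-expand are assumptions:

```kql
SecurityAlert
| where ProductName in("Microsoft Defender Advanced Threat Protection")
| where ProviderName == "MDATP"
| make-series Trend = count() on TimeGenerated from startofday(ago(90d)) to startofday(ago(0d)) step 1d by DisplayName
| extend (anomalies, score, baseline) = series_decompose_anomalies(Trend, 1.5, -1, 'linefit', 1, 'ctukey', 0.6)
// Expand the parallel arrays together so each row is one day for one DisplayName
| mv-expand TimeGenerated to typeof(datetime), Trend to typeof(long), anomalies to typeof(double), score to typeof(double), baseline to typeof(double)
// anomalies is +1 for a rise, -1 for a decline and 0 for no anomaly
| where anomalies != 0
| project TimeGenerated, DisplayName, Trend, baseline, score, anomalies
| sort by TimeGenerated desc
```

Filtering on the anomalies flag reuses the 1.5 threshold already passed to series_decompose_anomalies, so a separate score filter is only needed for a stricter cut-off.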

I like it very much, thanks @clive!
I wish we had a channel just for showing hundreds of kql -> viz/output
Very educational.