Forum Discussion

davinelm's avatar
davinelm
Copper Contributor
Aug 08, 2022

Quantity of bad emails vs legit email query

Hello, I'm trying to create a kql hunting query that will display the total number of bad emails (malware, spam, phishing) but am having some trouble combining the columns of various bad emails together in one and getting a total count. The screen grab below shows the first 6 columns that i want to collapse into.

 


my query in total is this so far:

let vip = pack_array("email address removed for privacy reasons");
EmailEvents
| where RecipientEmailAddress has_any (vip)
| project RecipientEmailAddress, ThreatTypes
| evaluate pivot(ThreatTypes)
 
  • Ok, I think I understand the ask now. When you get a chance try the below query. If you don't care about clean emails you can comment \ uncomment the lines noted to make it more performant. Let me know if this works for you.

    let vip = pack_array("email address removed for privacy reasons");
    EmailEvents
    | where RecipientEmailAddress in~ (vip)
    //| where isnotempty( ThreatTypes) // Uncomment this line if you don't care about clean email count
    | summarize
    Phish = countif(ThreatTypes has 'Phish')
    , Spam = countif(ThreatTypes has 'Spam')
    , Malware = countif(ThreatTypes has 'Malware')
    , Clean = countif(isempty(ThreatTypes)) // You can comment this line if you don't care about clean email count
    by RecipientEmailAddress
  • Hello davinelm,
    While this sounds like it should be a simple query its actually quite complex :). Technically you're representing multiple dimensions in a single table which is great from a reporting perspective, but a bit awkward from a data perspective. To accomplish this you need to union together two tables:
    - one that is a pivot of the e-mails per recipient and threat type
    - one that is a sum of the e-mails per threat type

    If you simply summarize the dataset you will get one row per column name which is a different schema from your format of counts per user and data type. Essentially we need to flip one of these tables on its side.

    The query below unions together your pivot query above with a summary per column. To flip it I pack the rows up as JSON objects, make them a property bag, then mv-expand them to turn it back into a table. Let me know if this accomplishes what you hope!

    let PerUserSummary = (
    EmailEvents
    | where RecipientEmailAddress has_any (vip)
    | project RecipientEmailAddress, ThreatTypes
    );
    union (
    PerUserSummary
    | evaluate pivot(ThreatTypes)
    ), (
    PerUserSummary
    | extend ThreatTypes = iff(isempty(ThreatTypes), '_Empty', ThreatTypes)
    | summarize count() by ThreatTypes
    | project ThreatTypes, count_
    | where isnotempty( ThreatTypes)
    | project packed = pack(ThreatTypes, count_)
    | summarize make_bag(packed)
    | evaluate bag_unpack(bag_packed)
    | extend RecipientEmailAddress = 'Total'
    )
    • davinelm's avatar
      davinelm
      Copper Contributor

      MichaelJMelone 

       

      Thank you for your reply! I tried out this query and it returned the same result from my first query minus one row. From messing around with it some more it is turning out to be a bit of a mind bender (at least for a kql noob as myself).


      so just to clarify if i have columns Red, Blue, Yellow and Red and Blue and Yellow, with 1 count in each column, how to combine them in a new column called "Colors" with a count of all the 1s from each of the individual columns? As of now it works to massage the data in excel and open it in powerbi but eventually i'd like to automate this for reporting. 

      Thanks again for your reply.

       

      • MichaelJMelone's avatar
        MichaelJMelone
        Icon for Microsoft rankMicrosoft
        Ok, I think I understand the ask now. When you get a chance try the below query. If you don't care about clean emails you can comment \ uncomment the lines noted to make it more performant. Let me know if this works for you.

        let vip = pack_array("email address removed for privacy reasons");
        EmailEvents
        | where RecipientEmailAddress in~ (vip)
        //| where isnotempty( ThreatTypes) // Uncomment this line if you don't care about clean email count
        | summarize
        Phish = countif(ThreatTypes has 'Phish')
        , Spam = countif(ThreatTypes has 'Spam')
        , Malware = countif(ThreatTypes has 'Malware')
        , Clean = countif(isempty(ThreatTypes)) // You can comment this line if you don't care about clean email count
        by RecipientEmailAddress

Resources