Blog Post

Microsoft Sentinel Blog
6 MIN READ

Using KQL functions to speed up analysis in Azure Sentinel

Pete Bryan's avatar
Pete Bryan
Icon for Microsoft rankMicrosoft
Jun 25, 2019

Security operations can often be a very repetitive role. As a security analyst, you will often find yourself conducting the same actions and tasks as you work through an investigation. Whilst it is a good objective to automate these tasks as much as possible, it is often not suitable or practical to fully automate them. In these cases, the most effective thing you can do is modularize those actions into tools that are easy to use when required. KQL functions in Azure Sentinel provide a way in which analysts can build up a collection of investigation tools to call upon quickly and simply.

 

So, what is a KQL function? It is simply a query that you create, save, and then assign an alias to call it by. An elementary example of how you might use a KQL function is a query that you run regularly, such as summarizing certain events over a given time period. Rather than typing out the full query each time, we can save it with an alias, allowing us to call the full query just by entering its name.

 

So how do you go about creating a KQL function? The first thing to do is to write a query. For this initial example, we are just going to write a straightforward query to get all the Security Events collected over the last week and return a list of event types and a count of how many times there were observed:

 

SecurityEvent
| where TimeGenerated >= ago(7d)
| summarize count() by Activity

 

To create a KQL function for this query, we simply enter it into the query box in the Logs blade of Azure Sentinel and click the Save button. This will open a new blade where we can choose to save this query as a function and assign it an alias by which we can call it.

 

Screenshot of the function save blade

 

Once saved, we can call the KQL function by simply typing the alias name into the query box and selecting Run.

 

Screenshot of the output of calling a function

 

You can keep track of the KQL functions you have saved by referring to the functions tab of the navigation pane within the Sentinel Logs blade:

 

Screenshot of the functions tab of the navigation pane

 

This example is not the best use of a KQL function, there are better ways to visualize that data, and we could choose to save it as a query rather than turn it into a function. But what if we had a more complex query, one that we didn’t want to have to memorize and type each time, and one that we would want to use in conjunction with other queries as part of our investigation?

 

A good example of this would be a parser. The most useful log sources are ones rich in data. However, these can be the most difficult to work with, as they often contain complex data. Office 365 Activity logs are a good example of this. They contain detail on an extensive range of different activities. As you can see in the example below, a lot of this data appears as JSON data under a single indexed field. 

 

Screenshot of JSON data within Office 365 logs

 

You can write a parser to parse these nested elements for every conceivable log, but that might not be practical or worthwhile, depending on the log source. However, an analyst may just want to write a parser for a subset of the data source that is particularly interesting from a security perspective. One such example I came across recently for Office logs related to the creation of mailbox forwarding rules.

 

Mailbox forwarding is a technique often used by attackers for data exfiltration. Once they have compromised an Office 365 account, they will log in and create rules to forward users mail to an external mailbox that they control, meaning that even if they lose their access to the Office 365 account, they will keep access to the victim’s email for as long as that rule is in place. Security analysts' problem is that mailbox rules are widely used for legitimate purposes, so finding those potentially malicious rules can be a time-consuming task. To make it as easy as possible, you ideally want the data in an easy to read and review format. So when I was conducting an investigation recently and wanted to look at mailbox forward activity, I created a parser to take that nested JSON data in the Office 365 Activity log and parse out the key data elements I wanted to see for analysis, namely what condition did the forwarding rule have, and where was it forwarding data:

OfficeActivity
    | where TimeGenerated >= ago(30d)
    | where Operation == 'New-InboxRule'
    | extend details = parse_json(Parameters)
    | where details contains 'ForwardTo' or details contains 'RedirectTo'
    | extend ForwardTo = iif(details[0].Name contains 'ForwardTo', details[0].Value,
        iif(details[1].Name contains 'ForwardTo', details[1].Value, 
            iif(details[2].Name contains 'ForwardTo', details[2].Value,  
                iif(details[3].Name contains 'ForwardTo', details[3].Value, 
                    iif(details[4].Name contains 'ForwardTo', details[4].Value,
                        'Check Parameters')))))
    | extend RedirectTo = iif(details[0].Name contains 'RedirectTo', details[0].Value,
        iif(details[1].Name contains 'RedirectTo', details[1].Value,
            iif(details[2].Name contains 'RedirectTo', details[2].Value,
                iif(details[3].Name contains 'RedirectTo', details[3].Value,
                    iif(details[4].Name contains 'RedirectTo', details[4].Value,
                        'Check Parameters')))))
    | extend RuleName = iif(details[3].Name contains 'Name', details[3].Value,
         iif(details[4].Name contains 'Name', details[4].Value,
            iif(details[5].Name contains 'Name', details[5].Value,
                'Check Parameters')))
    | extend RuleParameters = iif(details[2].Name != 'ForwardTo' and  details[2].Name != 'RedirectTo', 
        strcat(tostring(details[2].Name), '-', tostring(details[2].Value)),
        iif(details[3].Name != 'ForwardTo' and  details[3].Name != 'RedirectTo' and details[3].Name != 'Name',
            strcat(tostring(details[3].Name), '-', tostring(details[3].Value)), 
                iff(details[4].Name != 'ForwardTo' and details[4].Name != 'RedirectTo' and details[4].Name != 'Name' and details[4].Name != 'StopProcessingRules',
                strcat(tostring(details[4].Name), '-', tostring(details[4].Value)),
                'All Mail')))
    | project TimeGenerated, Operation, RuleName, RuleParameters, iif(details contains 'ForwardTo', ForwardTo, RedirectTo), ClientIP, UserId
    | project-rename Email_Forwarded_To = Column1, Creating_User = UserId

This parser produces the following output:

Screenshot of the output of the parser function in Azure Sentinel

Creating a parser like this requires that you spend some time identifying where in the JSON data, the elements you want to see could be. As such, I didn’t want to have to memorize it and type it out each time I wanted to look for this activity as part of an investigation. This made it an ideal query to save as a KQL function.

 

Screenshot of the function save blade

 

Now when conducting an investigation where I want to review all mailbox forwarding rules, I can simply call this parser by including mail_parse in my query.

 

In the below example, I am investigating a host that I suspect may have been compromised. I want to see if this host has been used to access my Office 365 tenant, and specifically whether any suspicious forwarding rules have been created during connections from this host. To do this, I write a query looking for all Office 365 activity where the Client IP is that of the host in question. I then call my parser by joining the query to the mail_parse function I created. This then selects the relevant data and parses out the fields I want to see,  allowing me to easily identify potential exfiltration of confidential data outside my organization. Whilst I don't need to in this example, I could also expand my investigation from here and look at further refining this search to look at rules created where the address being forward to is one outside of my control.

 

Screenshot of an Azure Sentinel query invoking the parser function

 

KQL functions are a quick and simple way to make repetitive actions simpler and quicker. They are one of the many ways that Azure Sentinel aims to make the job of a security analyst more efficient and effective.

 

More details on KQL functions can be found here:

https://docs.microsoft.com/en-us/azure/azure-monitor/log-query/functions

https://github.com/Azure/Azure-Sentinel/tree/master/Functions

Also, there are useful parsers on our GitHub site that you may want to consider converting to functions as well as some other prebuilt functions:

https://github.com/Azure/Azure-Sentinel/tree/master/Parsers

 

 

Updated Dec 29, 2020
Version 5.0
  • Hi Deepanshu_Marwah thanks for the question. No that's not possible right now but its great feedback, I will make sure it get to the relevant team.

  • Hi Pete Bryan 

     

    Can we use declared variables into functions and resuse those in multiple other functions. 

     

    For example : 

     

    Function1

     
    let NetworkParserUtilities = () {
    let x= y
    };
    NetworkParserUtilities

     

    Function2: 

    let NetworkParserCustomLogSource1 = () {
    customLogSource1 | <Parsing>
    };
    NetworkParserCustomLogSource1

    Function3

     

    let NetworkParserCustomLogSource2 = () {
    customLogSource1
    | <Parsing>
    };
    NetworkParserCustomLogSource2 

    Function 4: 

    let CombinedFunction = () {
    Function1,
    Function2,
    Function3
    };
    CombinedFunction 

    and finally running the query as :

     

     

    combined function 

    | where x = y // looking to call x through nested function.

     

     

    My Query: is that possible to call declared variables in one common function to another function? The requirement is to keep some analytical queries DRY (Do not repeat yourself)

     

     
  • FYI, the screenshots seemed to have died.  Also, was wondering if there might be any tips on whether there's a way to use functions to accomplish the same thing as Property Sets (e.g., from ExchangeOnlineManagement PS module's Get-EXO***** functions which have certain -PropertySets to pick between.

     

    For example, I can define the following as a function:

    EmailEvents
    | where SenderFromAddress =~ _SenderParameter
    | extend DMARC = parse_json(AuthenticationDetails).DMARC
    | extend SPF = parse_json(AuthenticationDetails).SPF
    | extend DKIM = parse_json(AuthenticationDetails).DKIM
    | extend CompAuth = parse_json(AuthenticationDetails).CompAuth
    | project Timestamp, EmailDirection,RecipientEmailAddress, Subject, SenderDisplayName, SenderFromAddress, SenderFromDomain, SenderMailFromDomain, SenderIPv4,
    DeliveryAction, SPF, DKIM, DMARC, CompAuth, BCL = BulkComplaintLevel, SCL = parse_json(ConfidenceLevel).Spam,PCL = parse_json(ConfidenceLevel).Phish,
    EmailAction, SpamDetectionMethod = parse_json(DetectionMethods).Spam, PhishDetectionMethod = parse_json(DetectionMethods).Phish,
    RecipientObjectId, AdditionalFields, InternetMessageId, NetworkMessageId, EmailClusterId

     ...but what I'd really like to do is just define this as a function, so that I could pump entire EmailEvents entities into the function, without having to control the filtering (i.e., where statements) in the function itself:

    | extend DMARC = parse_json(AuthenticationDetails).DMARC
    | extend SPF = parse_json(AuthenticationDetails).SPF
    | extend DKIM = parse_json(AuthenticationDetails).DKIM
    | extend CompAuth = parse_json(AuthenticationDetails).CompAuth
    | project Timestamp, EmailDirection,RecipientEmailAddress, Subject, SenderDisplayName, SenderFromAddress, SenderFromDomain, SenderMailFromDomain, SenderIPv4,
    DeliveryAction, SPF, DKIM, DMARC, CompAuth, BCL = BulkComplaintLevel, SCL = parse_json(ConfidenceLevel).Spam,PCL = parse_json(ConfidenceLevel).Phish,
    EmailAction, SpamDetectionMethod = parse_json(DetectionMethods).Spam, PhishDetectionMethod = parse_json(DetectionMethods).Phish,
    RecipientObjectId, AdditionalFields, InternetMessageId, NetworkMessageId, EmailClusterId

    I know my current example of what I've after is incorrect, but wondering if there's something like this possible?