KQL question

AzureActivity
| summarize LastActivity = max(TimeGenerated) by ResourceProvider, ResourceGroup
| join kind = innerunique (
    AzureActivity
    | summarize Operations = count() by ResourceGroup, ResourceProvider
) on ResourceGroup, ResourceProvider
| project ResourceProvider, ResourceGroup, Operations, LastActivity
| sort by Operations

 

The KQL above returns 4 columns.

I also need a fifth column that shows the percentage of operations per Resource Group and Resource Provider.

There have to be 5 columns in the result:

Resource Provider, Resource Group, Number of Operations (Activities), Last activity time, Percentage

 

Can someone help me with this?

 

 

16 Replies

@uditk14 

 

There may be a better solution, but this approach should work:

 

let TotalOperations = todouble(toscalar(AzureActivity | summarize count()));

AzureActivity
| summarize LastActivity = max(TimeGenerated), Operations = count() by ResourceProvider, ResourceGroup
| extend Percentage = round(todouble(Operations) / TotalOperations * 100, 1)
| project ResourceProvider, ResourceGroup, Operations, Percentage, LastActivity
| sort by Operations
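
For anyone who wants to try the percentage calculation without touching real data, here is a minimal sketch of the same approach. The SampleActivity table and its rows are hypothetical stand-ins for AzureActivity; everything else follows the query above.

```kql
// Hypothetical sample rows standing in for AzureActivity
let SampleActivity = datatable(TimeGenerated:datetime, ResourceProvider:string, ResourceGroup:string)
[
    datetime(2021-01-01 10:00), "Microsoft.Compute", "rg-app",
    datetime(2021-01-01 11:00), "Microsoft.Compute", "rg-app",
    datetime(2021-01-01 12:00), "Microsoft.Storage", "rg-data",
    datetime(2021-01-02 09:00), "Microsoft.Network", "rg-app"
];
// Total row count, converted to double so the division below is not integer division
let TotalOperations = todouble(toscalar(SampleActivity | summarize count()));
SampleActivity
| summarize LastActivity = max(TimeGenerated), Operations = count() by ResourceProvider, ResourceGroup
// Share of all operations for this provider/group pair, as a percentage with one decimal
| extend Percentage = round(todouble(Operations) / TotalOperations * 100, 1)
| project ResourceProvider, ResourceGroup, Operations, Percentage, LastActivity
| sort by Operations desc
```

With these four sample rows, the Microsoft.Compute / rg-app pair accounts for 2 of 4 operations (50%), and the other two pairs for 25% each.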

@hspinto - Thanks a lot

 

I have one more question, about the externaldata operator.

I have used the externaldata operator to fetch data from a CSV that has a few columns, namely IP ranges, country code, country name, continent name, etc.

The AzureActivity table has a caller IP value.

I need to print the location for each caller IP.

 

CSV file - https://datahub.io/core/geoip2-ipv4#premium-data-2

 

@hspinto Can you help me with the KQL

@uditk14,

 

Something like this would meet your needs. However, due to a restriction on user-defined functions, you cannot call a function with parameters that depend on row context.

 

 
let GeoData = externaldata (network:string,geoname_id:string,continent_code:string,continent_name:string,country_iso_code:string,country_name:string,is_anonymous_proxy:bool,is_satellite_provider:bool) [
@"https://datahub.io/core/geoip2-ipv4/r/geoip2-ipv4.csv"
] with(format="csv", ignoreFirstRecord=true);

let GetCountryName = (CallerIp:string) { toscalar(
    GeoData
    | extend AddressMask = split(network,'/')[1]
    | where ipv4_compare(CallerIp, tostring(split(network,'/')[0]), toint(tostring(split(network,'/')[1]))) == 0
    | project country_name )
};

//this works, because the parameter is hardcoded
//print GetCountryName('94.45.78.16')

// this will fail with an "Unresolved reference binding" error
AzureActivity
| extend CountryName = GetCountryName(CallerIpAddress)

 

@Deleted, do you have a solution for this one? 

@hspinto 


Would this work? It maps your IP from AzureActivity to the data from the CSV file.

externaldata (network:string,geoname_id:string,continent_code:string,continent_name:string,country_iso_code:string,country_name:string,is_anonymous_proxy:bool,is_satellite_provider:bool)
[@"https://datahub.io/core/geoip2-ipv4/r/geoip2-ipv4.csv"] with(format="csv", ignoreFirstRecord=true)
// select only the IP addr
| project geoNetworkip = tostring(split(network,"/").[0]), continent_name, continent_code
// join to the AzureActivity table
| join kind= inner 
    (
    AzureActivity
//    | project CallerIpAddress = "41.186.0.0"  // add a fake match to test 
      | project CallerIpAddress
    ) on $left.geoNetworkip == $right.CallerIpAddress
| project geoNetworkip, CallerIpAddress, continent_name, continent_code

Adapted from an old post of mine: https://cloudblogs.microsoft.com/industry-blog/en-gb/cross-industry/2019/08/13/azure-log-analytics-h...

@CliveWatson 

Thanks for the solution, but this won't help when an IP falls inside a range rather than matching its first address. For instance, 47.7.8.8.

The CSV contains ranges, not individual IPs that can be matched directly.

How can I find the location for all the IP addresses? Most of them fall in the middle of the ranges provided.

Hello @uditk14

I will stress this is just a sample. It's not very optimized and there is probably a better way to do this (I just can't think of one currently, so I need to take a break from it to help me think!)

// source idea: https://techcommunity.microsoft.com/t5/azure-sentinel/approximate-partial-and-combined-lookups-in-azure-sentinel/ba-p/1393795 
// get lookup data 
let geoData = 
    externaldata (network:string,geoname_id:string,continent_code:string,
                  continent_name:string,country_iso_code:string,
                  country_name:string,is_anonymous_proxy:bool,is_satellite_provider:bool)
    [@"https://datahub.io/core/geoip2-ipv4/r/geoip2-ipv4.csv"] with(format="csv", ignoreFirstRecord=true);
// now turn remote data to scalar 
let lookup = toscalar( geoData |  summarize list_CIDR=make_set(network) );
// link to Azure Activity and specifically CallerIpAddress  
AzureActivity
// get a small time range (this REALLY helps perf!!!!)
| where TimeGenerated > ago(2h)
| mv-apply list_CIDR=lookup to typeof(string) on
(
    // Match each IP from 'CallerIpAddress' with the remote 'network' column 
    where ipv4_is_match (CallerIpAddress, list_CIDR) //== false
)
// summarize to remove any duplicates
| summarize by CallerIpAddress, list_CIDR
| join kind=inner 
  (
  // join to remote data again, to add enrichments 
  geoData
  ) on $left.list_CIDR == $right.network
// build final display 
| summarize by CallerIpAddress, network, country_name, country_iso_code  
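
To see the range-matching step in isolation, here is a small self-contained sketch of the same mv-apply and ipv4_is_match pattern. The SampleRanges and SampleCallers tables, the documentation-only address ranges, and the "Country A"/"Country B" labels are all hypothetical; they only stand in for the CSV and for AzureActivity.

```kql
// Hypothetical lookup rows standing in for the geoip2-ipv4 CSV
let SampleRanges = datatable(network:string, country_name:string)
[
    "192.0.2.0/24", "Country A",
    "198.51.100.0/24", "Country B"
];
// Hypothetical caller IPs standing in for AzureActivity.CallerIpAddress
let SampleCallers = datatable(CallerIpAddress:string)
[
    "192.0.2.77",      // falls in the middle of the first range
    "198.51.100.200",  // falls in the middle of the second range
    "203.0.113.5"      // matches no range, so it drops out
];
// Collapse the CIDR column into a single dynamic array
let lookup = toscalar(SampleRanges | summarize make_set(network));
SampleCallers
| mv-apply list_CIDR = lookup to typeof(string) on
(
    // keep only the CIDR entries that contain this caller IP
    where ipv4_is_match(CallerIpAddress, list_CIDR)
)
// join back to the lookup rows to pick up the enrichment columns
| join kind=inner SampleRanges on $left.list_CIDR == $right.network
| project CallerIpAddress, network, country_name
```

Because every caller IP is tested against every CIDR entry, the cost grows with rows times ranges, which is why the time filter in the query above matters so much.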

 

I'm struggling a bit with geo IP since it takes a big performance hit.
E.g. using Clive's query above I get performance warnings even though I'm using it for just 1 hour of data, which is about 4k rows.
And the output is just 32 rows.
I'd love it if a geo lookup were built into KQL (as SPL has) or if there were a method that works over large volumes of data.

Are you bringing in TI feeds? https://docs.microsoft.com/en-us/azure/sentinel/whats-new#enriched-threat-intelligence-with-geolocat... These are now enriched with geolocation and whois.

Below this there is a REST API: https://docs.microsoft.com/en-us/azure/sentinel/geolocation-data-api

That's a GREAT point, thanks Clive!!!!

@SocInABox 

This Workbook, which I quickly created, will demo the REST API, provide the geo details and map them for you.

Source: KQLpublic/geoLocation.workbook at master · clivewatson/KQLpublic (github.com)

Super, thanks again.
I was hoping there was a way to do this with KQL, i.e. query the threatintelligence table to get the country.
Or is this currently just an API feature to pull back a single IP at a time?
Ideally I'd like to pull a day's worth of IPs from some log source, find the events that map to threatintelligence, and then show the related country.

Sorry, that API (from the docs) has a limit, so it's more for ad-hoc queries than your use case of a day's worth:

This API has a limit of 100 calls, per user, per hour.

No problem, the query below will work for now; I'll just use it with short time periods.
If you'd like to suggest a cleaner way to do this I'd be interested, but it seems to work OK.
It's based on your work, I think, and then I tweaked it at the end for Fortinet logs.

let geoData =
    materialize (
        externaldata(network:string, geoname_id:string, continent_code:string, continent_name:string,
                     country_iso_code:string, country_name:string, is_anonymous_proxy:string, is_satellite_provider:string)
        [@"https://raw.githubusercontent.com/datasets/geoip2-ipv4/master/data/geoip2-ipv4.csv"]
        with (ignoreFirstRecord=true, format="csv")
    );
// create an array of network CIDRs from the geoip list and assign it to "lookup":
let lookup = toscalar(geoData | summarize list_CIDR = make_set(network));
CommonSecurityLog
| where DeviceVendor == "Fortinet"
// filter out private networks
| where not(ipv4_is_private(SourceIP)) and not(ipv4_is_private(DestinationIP))
| summarize by SourceIP
| mv-apply list_CIDR = lookup to typeof(string) on
(
    // match IPs to geoData CIDRs
    where ipv4_is_match(SourceIP, list_CIDR) //== false
)
// append the geoData to the matched IPs
| join geoData on $left.list_CIDR == $right.network

@SocInABox 

 

I just realised the original query was written before we had ipv4_lookup(), so does this change improve things (it's less code at least)?

let IP_Data = external_data(network:string,geoname_id:long,continent_code:string,continent_name:string ,country_iso_code:string,country_name:string,is_anonymous_proxy:bool,is_satellite_provider:bool)
    ['https://raw.githubusercontent.com/datasets/geoip2-ipv4/master/data/geoip2-ipv4.csv'];
let IPs = 
    CommonSecurityLog
    |where DeviceVendor == "Fortinet"
    //filter out private networks
    |where not(ipv4_is_private(SourceIP)) and not(ipv4_is_private(DestinationIP))
    |summarize by SourceIP
;
IPs
| evaluate ipv4_lookup(IP_Data, SourceIP, network, return_unmatched = true)

That works, thanks! I'll just have to add a filter for loopbacks, bogons, etc.
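
One possible shape for that extra filter, sketched on made-up data: the IP_Data and IPs tables, the documentation-only ranges and the country labels below are hypothetical, and the two excluded ranges (loopback 127.0.0.0/8 and link-local 169.254.0.0/16) are only examples, not a complete bogon list.

```kql
// Hypothetical lookup rows standing in for the geoip2-ipv4 CSV
let IP_Data = datatable(network:string, country_name:string)
[
    "192.0.2.0/24", "Country A",
    "198.51.100.0/24", "Country B"
];
// Hypothetical source IPs standing in for CommonSecurityLog.SourceIP
let IPs = datatable(SourceIP:string)
[
    "192.0.2.77",      // inside the first range
    "198.51.100.200",  // inside the second range
    "127.0.0.1",       // loopback
    "169.254.10.10",   // link-local
    "10.1.2.3"         // private
];
IPs
// drop private addresses, as in the query above
| where not(ipv4_is_private(SourceIP))
// drop loopback and link-local addresses (example ranges only)
| where not(ipv4_is_in_range(SourceIP, "127.0.0.0/8"))
| where not(ipv4_is_in_range(SourceIP, "169.254.0.0/16"))
// enrich the remaining IPs with the matching network row
| evaluate ipv4_lookup(IP_Data, SourceIP, network)
```

Only the first two sample IPs survive the filters, and each picks up the country_name of the range that contains it. Filtering before the lookup also keeps the number of rows passed to ipv4_lookup small.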