SOLVED

How to measure egress for Storage Account and whether it has exceeded x GiB in y minutes?

Brass Contributor

Hello,

 

I am trying to find a KQL query that can scan any Storage Account and verify, through an alert metric, whether it has exceeded x GiB in y minutes.

 

I know it is possible to set up a 'metric' alert in Azure Monitor to verify whether Blob Storage egress has exceeded 500 GiB in one day, but how is that written as a KQL query?
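One thing worth pinning down before writing the query: ResponseBodySize in StorageBlobLogs is a raw byte count, so a GiB threshold has to be converted to bytes first. Note also that the 10000000000 literal used in the queries below is a decimal 10 GB (roughly 9.31 GiB), not 10 GiB. A quick sanity check of the arithmetic in KQL:

```kql
// 500 GiB (binary) vs 500 GB (decimal), expressed in bytes
print GiB500 = 500 * 1024 * 1024 * 1024,  // 536870912000
      GB500  = 500 * 1000 * 1000 * 1000   // 500000000000
```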

 

I have started looking at queries like this...

 

// Time range: the previous full day (in 6-hour bins); flag containers whose read size exceeds 10,000,000,000 bytes
StorageBlobLogs
| where TimeGenerated between ( startofday(ago(48h)) .. endofday(ago(1d)) )
| where OperationName == "GetBlob"
| extend ContainerName = split(parse_url(Uri).Path, "/")[1]
| summarize ReadSize = sum(ResponseBodySize) by tostring(ContainerName), bin(TimeGenerated, 6h)
| where ReadSize > 10000000000
| order by ReadSize desc   // order must come before render, which has to be the last operator
| render timechart

 

I just don't know how to amend this so it can 'scan' a time period (say, 6 hours), reading the Storage Account egress at the start and end of that period, and verify whether the egress has gone over a set number of bytes.

6 Replies
Do you mean something like this (the last two lines)? You can then check the size of diff_ and only show results over nnnnnn

StorageBlobLogs
| where TimeGenerated between ( startofday(ago(48h)) .. endofday(ago(1d)) )
| where OperationName == "GetBlob"
| extend ContainerName = split(parse_url(Uri).Path, "/")[1]
| summarize ReadSize = sum(ResponseBodySize) by tostring(ContainerName), bin(TimeGenerated, 6h)
| where ReadSize > 10000000000
| serialize
| extend diff_ = prev(ReadSize) - ReadSize
Hi Clive,

Sorry for the slight delay in response; I have had some family commitments.

I am ideally looking for a check on blob storage egress over a timeframe (say 24 hours, though it could be anything) that has exceeded 'x' GB in 'y' minutes DURING that timeframe, and ideally the users whose read/write actions caused it.

So, for example: check blob storage egress over 24 hours and see whether it has exceeded '4' GB in '60' minutes DURING that 24-hour timeframe, and which users performed the read/write actions that caused it.
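That requirement can be sketched with fixed 60-minute bins (untested; the 4-GiB byte count and column names are assumptions carried over from the queries above). Note that fixed bins only approximate "any 60-minute window": a burst straddling a bin boundary can be missed.

```kql
// Flag any 60-minute bin in the last 24 hours whose GetBlob egress
// exceeds 4 GiB, per storage account
StorageBlobLogs
| where TimeGenerated > ago(24h)
| where OperationName == "GetBlob"
| summarize EgressBytes = sum(ResponseBodySize) by AccountName, bin(TimeGenerated, 60m)
| where EgressBytes > 4 * 1024 * 1024 * 1024
| order by EgressBytes desc
```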

I understand why you have suggested the 'serialize' and 'prev' functions, but I am not sure that meets the criteria above.

I have slightly adjusted my 'summarize' command line to refer to the Account Name now, but I still require some assistance on the time-series element and possibly identifying users performing the actions I mentioned.

// Time range: the previous full day (in 6-hour bins); flag accounts whose read size exceeds 10,000,000,000 bytes
StorageBlobLogs
| where TimeGenerated between ( startofday(ago(48h)) .. endofday(ago(1d)) )
| where OperationName == "GetBlob"
| summarize ReadSize = sum(ResponseBodySize) by AccountName, bin(TimeGenerated, 6h)
| where ReadSize > 10000000000
| order by ReadSize desc
| render timechart
How about?

StorageBlobLogs
| where OperationName == "GetBlob"
| make-series ReadSize = sum(ResponseBodySize) default=0 on TimeGenerated from startofday(ago(2d)) to endofday(ago(1d)) step 6h by AccountName
| project series_stats(ReadSize), AccountName
| where series_stats_ReadSize_max > 10000000000
// you can use min or avg instead of max
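If you also want to see which bin breached the threshold, the series can be expanded back into rows with mv-expand. A sketch (untested), reusing the same query:

```kql
StorageBlobLogs
| where OperationName == "GetBlob"
| make-series ReadSize = sum(ResponseBodySize) default=0 on TimeGenerated from startofday(ago(2d)) to endofday(ago(1d)) step 6h by AccountName
// expand the parallel arrays into one row per 6-hour bin
| mv-expand TimeGenerated to typeof(datetime), ReadSize to typeof(long)
| where ReadSize > 10000000000   // keep only the bins that breached
```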
Hi Clive.

Oh, I like that with the series_stats, mate.

What I still cannot extract is the users performing the read/write 'actions'. The nearest property to a user I can see is CallerIpAddress, which returns the IP address of the requester, including the port number, assuming the requester is a user!

I have not tested this yet, but something like:

let users =
StorageBlobLogs
| where TimeGenerated between ( startofday(ago(48h)) .. endofday(ago(1d)) )
| where OperationName == "GetBlob"
// CallerIpAddress must be carried through summarize, or the distinct below has no column to read
| summarize ReadSize = sum(ResponseBodySize) by AccountName, CallerIpAddress, bin(TimeGenerated, 6h)
| where ReadSize > 10000000000
| distinct CallerIpAddress;
StorageBlobLogs
| where TimeGenerated between ( startofday(ago(48h)) .. endofday(ago(1d)) )
| where OperationName == "GetBlob" and CallerIpAddress in~ (users)

Jason
You might get lucky and find logon details in one of the AAD tables, such as SigninLogs, but if not, the IP may be your only clue.

Add this to the end of the query

...
| where OperationName == "GetBlob" and CallerIpAddress in~ (users)
| extend IPAddress = tostring(split(CallerIpAddress, ':')[0])   // strip the port number
| join
(
    SigninLogs // try other tables as well
    | where isnotempty(IPAddress)
)
on IPAddress
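One caveat with that join (a sketch, untested): an inner join matches each blob-log row against every sign-in row for the same IP, so any later sum(ResponseBodySize) gets inflated by repeated sign-ins. Deduplicating SigninLogs per IP and user first avoids that:

```kql
StorageBlobLogs
| where TimeGenerated between ( startofday(ago(2d)) .. endofday(ago(1d)) )
| where OperationName == "GetBlob"
| extend IPAddress = tostring(split(CallerIpAddress, ':')[0])
| join kind=inner
(
    SigninLogs
    | where isnotempty(IPAddress)
    // one row per IP/user pair, so the join does not multiply blob-log rows
    | distinct IPAddress, UserPrincipalName
)
on IPAddress
| summarize ReadSize = sum(ResponseBodySize) by AccountName, UserPrincipalName, bin(TimeGenerated, 6h)
```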
best response confirmed by JMSHW0420 (Brass Contributor)
Solution
Hi Clive,

The query below provides the solution I require.

StorageBlobLogs
| where TimeGenerated between ( startofday(ago(2d)) .. endofday(ago(1d)) )
| where OperationName == "GetBlob"
| extend IPAddress = tostring(split(CallerIpAddress, ':')[0])
| join
(
    SigninLogs
    | where isnotempty(IPAddress)
)
on IPAddress
| summarize ReadSize = sum(ResponseBodySize) by AccountName, UserPrincipalName, bin(TimeGenerated, 6h)
| where ReadSize > 10000

Thanks for your help.