Leveraging Blob Inventory Report for Gaining Container Level Insights
Published Feb 21 2023 10:21 PM 3,352 Views
Microsoft

The Blob Inventory feature provides an overview of your containers, blobs, snapshots, and blob versions within a storage account. You can make use of the inventory report to understand various attributes of blobs and containers such as your total data size, age etc. depending upon the available fields for the inventory reports.

 

When it comes to gaining insights on breakup of blobs based on the type, access tier etc., there are certain limitations when it comes to the current available metrics. However, you can make use of the inventory report and merging that with the power of running SQL queries in Synapse help you with further drilldown by parsing these reports to that can help you taking further decisions ahead.

By default, the Blob Inventory feature is disabled. You need to enable the feature and based on our requirement, we can either enable the feature for blob or container level. You can refer to the link for the detailed steps that you need to configure the inventory report and you can opt only for the fields that will help with your use case:

Azure Storage blob inventory | Microsoft Learn

 

Now, before we move further in this blog, there are 2 pre-requisites:

  1. Enable the Blob Inventory feature for blobs/container along with all the required fields.
  2. Create an Azure Synapse Workspace.

Please note that the Blob Inventory and Azure Synapse Workspace (used in this blog) are paid features.

 

Let’s take a look at sample use cases to gain insights on objects inside a container. You will need a blob inventory report having a prefix rule match of the target container. In case you want to use this for account level, that’s also feasible as well.

 

Case1: We need a count breakup based on Access Tier

Query

SELECT AccessTier AS Tier,

        COUNT(*) As TotalBlobCount

FROM OPENROWSET(

    bulk '<URL to your inventory CSV file>',

    format='csv', parser_version='2.0', header_row=true

) AS Source

GROUP BY AccessTier

 

 

Output:

Amrinder_Singh_0-1676555941579.png

 

Case 2: We need a count breakup based on Blob Type

Query

SELECT BlobType AS BlobType,

        COUNT(*) As TotalBlobCount

FROM OPENROWSET(

    bulk '<URL to your inventory CSV file>',

    format='csv', parser_version='2.0', header_row=true

) AS Source

GROUP BY BlobType

 

 

Output:

Amrinder_Singh_1-1676555941587.png


Case 3: We need to identify blobs over a particular size. In the below sample we took that as 2 MB for now

SELECT Name, [Content-Length]

FROM OPENROWSET(

    bulk '<URL to your inventory CSV file>',

    format='csv', parser_version='2.0', header_row=true

) AS Source

WHERE [Content-Length]>2000000

 

 

Output:

Amrinder_Singh_2-1676555941589.png

 

Case 4: We need to count the blobs that were created in the year 2023.

SELECT LEFT([Name], CHARINDEX('/', [Name]) - 1) AS Container,

COUNT(Name) As TotalBlobCount

FROM OPENROWSET(

bulk '<URL to your inventory CSV file>',

format='csv', parser_version='2.0', header_row=true

) AS Source

where "Creation-Time" like '%2023-%'
GROUP BY LEFT([Name], CHARINDEX('/', [Name]) - 1)

 

Output

Amrinder_Singh_3-1676555941591.png

 

Similarly, if you want breakup of counts on a particular field of the report, it can be done is similar fashion. In case, you need details on the account level then you can tweak your report configurations and leverage the same queries ahead.

 

Hope this helps!

Version history
Last update:
‎Feb 21 2023 10:21 PM
Updated by: