Azure Storage Lifecycle Management (LCM) PrefixMatch Issues
Published Sep 06 2023 06:15 AM
Microsoft

Azure Storage lifecycle management (LCM) provides a rule-based policy that lets you transition data between access tiers or expire it at the end of its lifecycle. These policies can be applied to a base blob and, optionally, to the blob's versions or snapshots. In this article, we'll troubleshoot LCM prefixMatch issues and identify common misconfigurations.

 

Understanding PrefixMatch

PrefixMatch is a filter in an LCM rule: it lists one or more prefixes, and the rule applies only to blobs whose path begins with one of them. Each prefix starts with the container name, and matching is case-sensitive and literal. Throughout this article, we'll explore various storage content scenarios and analyze the outcomes when LCM is applied with the defined prefix for each case.
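For orientation, the sketch below builds a lifecycle policy as a Python dictionary and prints it as JSON, showing where prefixMatch sits inside a rule's filters. The rule name, the delete action, and the 30-day threshold are illustrative assumptions, not values taken from this article.

```python
import json

# Hypothetical rule: the name, action, and day threshold are illustrative only.
policy = {
    "rules": [
        {
            "enabled": True,
            "name": "delete-subfolder-blobs",
            "type": "Lifecycle",
            "definition": {
                "actions": {
                    "baseBlob": {
                        "delete": {"daysAfterModificationGreaterThan": 30}
                    }
                },
                "filters": {
                    "blobTypes": ["blockBlob"],
                    # Each prefix starts with the container name and is
                    # compared case-sensitively against the blob path.
                    "prefixMatch": ["Folder1/Subfolder/"],
                },
            },
        }
    ]
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The prefix used here is the one from Prefix Scenario 1; swapping in the prefixes from the other scenarios changes only the prefixMatch entry.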

 

Disclaimer: Azure Blob Storage has no physical folders; virtual folders are created by using the '/' character in blob names. The term "folder" is used in this article only to describe the structure of the storage content clearly.

 

Prefix Scenario 1: Folder1/Subfolder/

Storage content:

Folder1

  • Subfolder
    • Blob3
    • Blob4
    • Image1
    • Subfolder
      • Blob5
  • Blob1
  • Blob2

Results: 

Folder1

  • Subfolder
    • Blob3 (deleted)
    • Blob4 (deleted)
    • Image1 (deleted)
    • Subfolder* (deleted)
      • Blob5 (deleted)
  • Blob1
  • Blob2

Every blob under "Folder1/Subfolder/" is slated for deletion, including Blob5 in the nested "Subfolder". Because the prefix ends with a forward slash, it matches only blob names that begin with exactly "Folder1/Subfolder/". Blob1 and Blob2, which sit directly under Folder1, are not affected, and neither is any sibling item whose name merely starts with "Subfolder".

 

*Deleting the nested directory 'Subfolder' may fail with a 409 DirectoryIsNotEmpty error during the first LCM run, because it can still contain blobs at that point. Once the directory is empty, a subsequent LCM run deletes it successfully.
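The effect of the trailing slash can be sketched with a plain string-prefix comparison. This is a simplified model of the matching described above, not the service's implementation; the helper name `matched_by_prefix` and the extra `Subfolder2` path are hypothetical additions for illustration.

```python
def matched_by_prefix(blob_paths, prefix):
    """Return the paths that begin with the prefix (case-sensitive)."""
    return [path for path in blob_paths if path.startswith(prefix)]


# Paths from Scenario 1, written as container/blob-name.
scenario_1 = [
    "Folder1/Subfolder/Blob3",
    "Folder1/Subfolder/Blob4",
    "Folder1/Subfolder/Image1",
    "Folder1/Subfolder/Subfolder/Blob5",
    "Folder1/Blob1",
    "Folder1/Blob2",
    # Hypothetical sibling, not in the article, to show the slash's effect.
    "Folder1/Subfolder2/Blob6",
]

with_slash = matched_by_prefix(scenario_1, "Folder1/Subfolder/")
without_slash = matched_by_prefix(scenario_1, "Folder1/Subfolder")

print(with_slash)     # only paths under Folder1/Subfolder/
print(without_slash)  # also picks up Folder1/Subfolder2/Blob6
```

Under this model, dropping the trailing slash widens the match to any name that starts with "Subfolder", which is a common source of unintended deletions.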

 

Prefix Scenario 2: Folder1/sub

Storage content:

Folder1

  • subfolder
    • Blob3
    • Blob4
  • Subfolder
    • Blob5
  • Blob1
  • Blob2
  • subblob

 

Results:

Folder1

  • subfolder* (deleted)
    • Blob3 (deleted)
    • Blob4 (deleted)
  • Subfolder
    • Blob5
  • Blob1
  • Blob2
  • subblob (deleted)

The prefix defined here matches any item in the Folder1 container whose name begins with "sub": the "subfolder" directory, the blobs inside it, and the blob "subblob". Because prefix matching is case-sensitive, the rule removes the "subfolder" with a lowercase "s" but not the "Subfolder" with an uppercase "S" or its contents.

 

*Deleting the directory 'subfolder' may fail with a 409 DirectoryIsNotEmpty error during the first LCM run, because it can still contain blobs at that point. Once the directory is empty, a subsequent LCM run deletes it successfully.
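A quick way to spot casing mismatches before applying a rule is to compare a case-sensitive match with a case-insensitive one. The sketch below does this for the Scenario 2 paths, again assuming a plain string-prefix model of the matching; the variable names are hypothetical.

```python
# Paths from Scenario 2, written as container/blob-name.
scenario_2 = [
    "Folder1/subfolder/Blob3",
    "Folder1/subfolder/Blob4",
    "Folder1/Subfolder/Blob5",
    "Folder1/Blob1",
    "Folder1/Blob2",
    "Folder1/subblob",
]
prefix = "Folder1/sub"

# What a case-sensitive prefix comparison selects.
case_sensitive = [p for p in scenario_2 if p.startswith(prefix)]

# What would be selected if casing were ignored.
case_insensitive = [
    p for p in scenario_2 if p.lower().startswith(prefix.lower())
]

# Paths that differ from the prefix only by case and are therefore skipped.
overlooked = sorted(set(case_insensitive) - set(case_sensitive))

print(case_sensitive)
print(overlooked)
```

Any path that lands in `overlooked` is one a reader might expect the rule to cover but that the case-sensitive comparison leaves untouched.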

 

Prefix Scenario 3: Folder1/Subfolder/B?ob

Storage content:

Folder1

  • Subfolder
    • B?ob
    • Blob1
    • Blob2
    • Blob3

Results:

Folder1

  • Subfolder
    • B?ob (deleted)
    • Blob1
    • Blob2
    • Blob3

The character '?' is not a wildcard in Azure Storage lifecycle management. Unlike in some wildcard systems, '?' does not stand for a single occurrence of any character. In Azure Storage, '?' is a valid character in a blob name, so a prefix that includes '?' matches only blobs whose names contain a literal '?' at that position. Lifecycle management policies do not support wildcards in prefixMatch.

 

Prefix Scenario 4: Folder1/Subfolder/Blob*

Storage content:

Folder1

  • Subfolder
    • Blob
    • Blob*
    • Blob1
    • Blob2

Results:

Folder1

  • Subfolder
    • Blob
    • Blob* (deleted)
    • Blob1
    • Blob2


The character '*' is not a wildcard in Azure Storage lifecycle management. Unlike in some wildcard systems, '*' does not mean "match anything after this point." In Azure Storage, '*' is a valid character in a blob name, so a prefix that includes '*' matches only blobs whose names contain a literal '*' at that position. Lifecycle management policies do not support wildcards in prefixMatch.
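To make the difference concrete, the sketch below contrasts a literal prefix comparison with glob-style matching from Python's standard `fnmatch` module, using the Scenario 4 paths. The literal comparison is an assumed model of the behavior described above; the glob result shows what a reader expecting wildcard semantics might wrongly anticipate.

```python
from fnmatch import fnmatchcase

# Paths from Scenario 4, written as container/blob-name.
scenario_4 = [
    "Folder1/Subfolder/Blob",
    "Folder1/Subfolder/Blob*",
    "Folder1/Subfolder/Blob1",
    "Folder1/Subfolder/Blob2",
]
prefix = "Folder1/Subfolder/Blob*"

# Literal prefix comparison: '*' is treated as an ordinary character.
literal = [p for p in scenario_4 if p.startswith(prefix)]

# Glob-style matching, shown only for contrast: '*' matches any characters.
glob_style = [p for p in scenario_4 if fnmatchcase(p, prefix)]

# Simple pre-flight check: flag wildcard-looking characters in a prefix.
suspicious = [ch for ch in "*?" if ch in prefix]

print(literal)
print(glob_style)
print(suspicious)
```

The same check applies to the '?' prefix in Prefix Scenario 3. If the intent is to cover Blob, Blob1, and Blob2, the prefix "Folder1/Subfolder/Blob" without the '*' already does so under prefix semantics.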

 

General troubleshooting and monitoring LCM

To gain deeper insight into LCM operations on a storage account, enable diagnostic logging to a Log Analytics workspace so that you can review the operations performed on the account. For a detailed setup guide, see the following documentation: Monitoring Azure Blob Storage - Azure Storage | Microsoft Learn

 

When examining the diagnostic logs, you can filter on a UserAgentHeader containing "ObjectLifeCycleScanner", the user agent associated with LCM operations. The following example query lists LCM delete operations that returned a status code above 202, that is, deletions that did not succeed:

 

 

 

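// LCM delete attempts that did not succeed (status code above 202)
StorageBlobLogs
| where OperationName == "DeleteBlob"
// ObjectLifeCycleScanner is the user agent associated with LCM
| where UserAgentHeader contains "ObjectLifeCycleScanner"
| where toint(StatusCode) > 202
| project TimeGenerated, OperationName, StatusCode, StatusText, Uri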

 

 

 

By employing this filtering approach, you can efficiently isolate and assess the LCM-related operations within your diagnostic logs. An example output of the above query:

 

[Screenshot: example query output showing a DeleteBlob operation on 'subfolder' that failed with status code 409]

 

This log entry provides confirmation that the deletion of 'subfolder' encountered a 409 error due to the folder not being empty and containing one or more blobs. This situation mirrors the scenario outlined in "Prefix Scenario 2" as discussed in the article above.

Last update: Sep 07 2023 01:39 PM