Azure PaaS Blog

Azure Storage Lifecycle Management (LCM) PrefixMatch Issues

hqaffesha (Microsoft)
Sep 06, 2023

Azure Storage lifecycle management (LCM) provides a rule-based policy that lets you transition data between access tiers or expire it at the end of its lifecycle. These policies can be applied to a base blob and, optionally, to the blob's versions or snapshots. This article covers troubleshooting LCM prefixMatch issues and identifying common misconfigurations.

 

Understanding PrefixMatch

PrefixMatch is the LCM filter that restricts a rule to blobs whose path begins with a given prefix. The prefix starts with the container name and can continue into the blob name. Matching is case-sensitive and literal: the prefix is compared character by character. The rest of this article walks through several storage content scenarios and shows the outcome when an LCM delete rule is applied with the prefix defined for each case.

 

Disclaimer: Azure Blob Storage has no physical folders; it has virtual folders, which are created by using the '/' character in the blob's name. This article uses the term "folder" only to make the structure of the storage content easier to describe.
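The matching behavior described above can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the service's implementation: it assumes the rule compares the prefix against the string "container name + '/' + blob name" with a plain, case-sensitive starts-with test. The helper name matches_prefix and the sample names are hypothetical.

```python
def matches_prefix(container: str, blob_name: str, prefix: str) -> bool:
    """Return True if 'container/blob_name' starts with prefix (case-sensitive)."""
    # "Folders" are only '/' characters inside the blob name, so the whole
    # path can be treated as one flat string.
    full_path = f"{container}/{blob_name}"
    return full_path.startswith(prefix)


# Hypothetical blob names used only for illustration.
print(matches_prefix("Folder1", "Subfolder/Blob3", "Folder1/Subfolder/"))  # True
print(matches_prefix("Folder1", "subfolder/Blob3", "Folder1/Subfolder/"))  # False: case differs
print(matches_prefix("Folder1", "Blob1", "Folder1/Subfolder/"))            # False: outside the folder
```

Treating the path as a flat string is what makes the trailing '/' and the letter case matter in the scenarios that follow.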

 

Prefix Scenario 1: Folder1/Subfolder/

Storage content:

Folder1

  • Subfolder
    • Blob3
    • Blob4
    • Image1
    • Subfolder
      • Blob5
  • Blob1
  • Blob2

Results (items marked "deleted" are removed by the rule):

Folder1

  • Subfolder
    • Blob3 (deleted)
    • Blob4 (deleted)
    • Image1 (deleted)
    • Subfolder* (deleted)
      • Blob5 (deleted)
  • Blob1
  • Blob2

Every blob under "Folder1/Subfolder/" is slated for deletion, including the blobs in the nested "Subfolder". The trailing forward slash limits the rule to names that begin with exactly "Subfolder/", that is, to the contents of that virtual folder; without it, the prefix would also match sibling names that merely begin with "Subfolder". Blob1 and Blob2, which sit directly under Folder1, are not affected.

 

*Deleting the directory 'Subfolder' could potentially encounter a 409 DirectoryIsNotEmpty error during the initial run of LCM. However, if it remains empty, it will be successfully deleted during the subsequent run of LCM.
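For reference, a rule like the one in this scenario could be expressed as a lifecycle policy document. The sketch below builds that JSON in Python. The rule name deleteSubfolder, the helper build_delete_rule, and the 30-day threshold are assumptions made for illustration; the article does not state which values were used.

```python
import json


def build_delete_rule(name: str, prefix: str, days: int) -> dict:
    """Build one lifecycle rule that deletes base blobs under a prefix."""
    return {
        "enabled": True,
        "name": name,
        "type": "Lifecycle",
        "definition": {
            "filters": {
                "blobTypes": ["blockBlob"],
                # The prefix starts with the container name.
                "prefixMatch": [prefix],
            },
            "actions": {
                "baseBlob": {
                    # Assumed threshold; pick a value that fits your retention needs.
                    "delete": {"daysAfterModificationGreaterThan": days}
                }
            },
        },
    }


policy = {"rules": [build_delete_rule("deleteSubfolder", "Folder1/Subfolder/", 30)]}
print(json.dumps(policy, indent=2))
```

The printed JSON can be compared against the policy shown in the portal's code view to confirm that the prefix, including its trailing slash and letter case, is what you intended.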

 

Prefix Scenario 2: Folder1/sub

Storage content:

Folder1

  • subfolder
    • Blob3
    • Blob4
  • Subfolder
    • Blob5
  • Blob1
  • Blob2
  • subblob

 

Results (items marked "deleted" are removed by the rule):

Folder1

  • subfolder* (deleted)
    • Blob3 (deleted)
    • Blob4 (deleted)
  • Subfolder
    • Blob5
  • Blob1
  • Blob2
  • subblob (deleted)

The prefix defined here matches any item in the Folder1 container whose name begins with "sub", so the rule removes the "subfolder" folder with its contents as well as the "subblob" blob. Because prefix matching is case-sensitive, it removes the "subfolder" with a lowercase "s" but not the "Subfolder" with an uppercase "S".

 

*Deleting the directory 'subfolder' could potentially encounter a 409 DirectoryIsNotEmpty error during the initial run of LCM. However, if it remains empty, it will be successfully deleted during the subsequent run of LCM.
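Before saving a rule, it can help to preview which names a candidate prefix would select. The following sketch compares the broad prefix from this scenario with a narrower one that ends in a slash. It assumes a simple case-sensitive starts-with comparison on "container/blob name"; the preview helper is hypothetical and the list mirrors the storage content above.

```python
# Full paths (container/blob name) for the Scenario 2 storage content.
BLOBS = [
    "Folder1/subfolder/Blob3",
    "Folder1/subfolder/Blob4",
    "Folder1/Subfolder/Blob5",
    "Folder1/Blob1",
    "Folder1/Blob2",
    "Folder1/subblob",
]


def preview(prefix: str, blobs: list[str]) -> list[str]:
    """List the paths that a case-sensitive prefix would select."""
    return [path for path in blobs if path.startswith(prefix)]


# Broad prefix: also catches the blob named "subblob".
print(preview("Folder1/sub", BLOBS))
# ['Folder1/subfolder/Blob3', 'Folder1/subfolder/Blob4', 'Folder1/subblob']

# Narrower prefix with a trailing slash: only the folder's contents.
print(preview("Folder1/subfolder/", BLOBS))
# ['Folder1/subfolder/Blob3', 'Folder1/subfolder/Blob4']
```

If only the contents of one folder should be targeted, the trailing slash is the safer choice.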

 

Prefix Scenario 3: Folder1/Subfolder/B?ob

Storage content:

Folder1

  • Subfolder
    • B?ob
    • Blob1
    • Blob2
    • Blob3

Results (items marked "deleted" are removed by the rule):

Folder1

  • Subfolder
    • B?ob (deleted)
    • Blob1
    • Blob2
    • Blob3

Note that '?' is not a wildcard character in Azure Storage lifecycle management. Unlike in some wildcard systems, '?' does not stand for any single character. In Azure Storage, '?' is a valid character in a blob name, so a rule that includes '?' matches only blobs whose names contain a literal '?' at that position. Lifecycle management policies do not support wildcards in prefixMatch.

 

Prefix Scenario 4: Folder1/Subfolder/Blob*

Storage content:

Folder1

  • Subfolder
    • Blob
    • Blob*
    • Blob1
    • Blob2

Results (items marked "deleted" are removed by the rule):

Folder1

  • Subfolder
    • Blob
    • Blob* (deleted)
    • Blob1
    • Blob2

Likewise, '*' is not a wildcard character in Azure Storage lifecycle management. Unlike in some wildcard systems, '*' does not mean "match anything after this point". In Azure Storage, '*' is a valid character in a blob name, so a rule that includes '*' matches only blobs whose names contain a literal '*' at that position. Lifecycle management policies do not support wildcards in prefixMatch.
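Since '?' and '*' are matched literally, a prefix that contains them is often a sign that a wildcard was intended. A small check like the one below can flag such prefixes before a policy is deployed. This is only a sketch; the helper name suspicious_prefixes is hypothetical, and the check is a local convenience, not something the service performs.

```python
WILDCARD_LIKE = ("*", "?")


def suspicious_prefixes(prefixes: list[str]) -> dict[str, list[str]]:
    """Map each prefix to the wildcard-like characters it contains, if any."""
    findings = {}
    for prefix in prefixes:
        hits = [ch for ch in WILDCARD_LIKE if ch in prefix]
        if hits:
            findings[prefix] = hits
    return findings


# Prefixes taken from Scenarios 3 and 4, plus one without special characters.
rules = [
    "Folder1/Subfolder/Blob*",
    "Folder1/Subfolder/B?ob",
    "Folder1/Subfolder/Blob",
]
for prefix, chars in suspicious_prefixes(rules).items():
    print(f"{prefix!r} contains {chars}: matched literally, not as a wildcard")
```

To target Blob, Blob1, and Blob2 together, the plain prefix "Folder1/Subfolder/Blob" is enough, because every prefix already behaves like "starts with".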

 

General troubleshooting and monitoring LCM

To gain more insight into LCM operations and monitor the targeted storage account, enable diagnostic logging to a Log Analytics workspace. This lets you review the operations performed on the storage account. For a detailed setup guide, refer to the following documentation: Monitoring Azure Blob Storage - Azure Storage | Microsoft Learn

 

When examining the diagnostic logs, filter on a UserAgentHeader containing "ObjectLifeCycleScanner", which is the user agent associated with LCM operations. Here's an example query that you can use as a reference to find blob deletions attempted by LCM that did not succeed (status code above 202):

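StorageBlobLogs
// Keep only blob delete operations
| where OperationName == "DeleteBlob"
// Keep only requests issued by lifecycle management
| where UserAgentHeader contains "ObjectLifeCycleScanner"
// Status codes above 202 indicate that the delete did not succeed (for example, 409)
| where toint(StatusCode) > 202
| project TimeGenerated, OperationName, StatusCode, StatusText, Uri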

This filtering approach isolates the LCM-related operations in your diagnostic logs. In the example output of the above query, the log entry confirms that the deletion of 'subfolder' encountered a 409 error because the folder was not empty and still contained one or more blobs. This mirrors the situation described in "Prefix Scenario 2" above.
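If the log rows are exported for offline review, the same filter can be applied in Python. The sketch below assumes each row is a dictionary with the column names used in the query and string values. The sample rows are hypothetical and are not real log output.

```python
def failed_lcm_deletes(records: list[dict]) -> list[dict]:
    """Mirror the query: DeleteBlob calls from LCM with a status code above 202."""
    return [
        r for r in records
        if r["OperationName"] == "DeleteBlob"
        # KQL 'contains' is case-insensitive, so compare in lowercase.
        and "objectlifecyclescanner" in r["UserAgentHeader"].lower()
        and int(r["StatusCode"]) > 202
    ]


# Hypothetical rows for illustration only.
SAMPLE = [
    {"OperationName": "DeleteBlob", "UserAgentHeader": "ObjectLifeCycleScanner",
     "StatusCode": "409", "StatusText": "DirectoryIsNotEmpty",
     "Uri": "https://example.invalid/Folder1/subfolder"},
    {"OperationName": "DeleteBlob", "UserAgentHeader": "ObjectLifeCycleScanner",
     "StatusCode": "202", "StatusText": "Success",
     "Uri": "https://example.invalid/Folder1/subfolder/Blob3"},
    {"OperationName": "GetBlob", "UserAgentHeader": "some-other-client",
     "StatusCode": "200", "StatusText": "Success",
     "Uri": "https://example.invalid/Folder1/Blob1"},
]

for row in failed_lcm_deletes(SAMPLE):
    print(row["StatusCode"], row["StatusText"], row["Uri"])
```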

Updated Sep 07, 2023
Version 2.0
  • Mike_K75

    This is a very helpful article - Thank You.

    It would be awesome if you could add an example or briefly explain whether the functionality is the same or different if our storage is ADLS G2.

    As folders DO exist as separate entities in ADLS does this mean LCM filters work differently?

  • Tom_Luxton

    Tested and confirmed that ‘Fold’ prefix will delete blobs in ‘Folder1’ etc.

  • Azure Lifecycle Management (LCM) cannot directly delete containers. LCM primarily focuses on managing the lifecycle of objects within storage containers only. So if you use Prefix "Fold" nothing will be deleted.

  • Tom_Luxton

    What is the scenario for a prefix match of "Fold"? Will it find the "Folder" container, or does it need to match the entire container name?

    Prefix Scenario Fold

    Storage content:

    Folder1

    • Subfolder
      • Blob
      • Blob*
      • Blob1
      • Blob2

    Results:

    all or nothing?