If you are new to Azure Storage Actions, check out our GA announcement blog for an introduction. This post is for cloud architects, data engineers, and IT admins who want to automate and optimize data governance at scale.
The Challenge: Modern Data Management at Scale
As organizations generate more data than ever, managing that data efficiently and securely is a growing challenge. Manual scripts, periodic audits, and ad-hoc cleanups can’t keep up with the scale, complexity, and compliance demands of today’s cloud workloads. Teams need automation that’s reliable, scalable, and easy to maintain.
Azure Storage Actions delivers on this need by enabling policy-driven automation for your storage accounts. With Storage Actions, you can:
- Automate compliance (e.g., legal holds, retention)
- Optimize storage costs (e.g., auto-tiering, expiry)
- Reduce operational overhead (no more custom cleanup scripts)
- Improve data discoverability (tagging, labeling)
Real-World Scenarios: Unlocking the Power of Storage Actions
Let’s explore three practical scenarios where Storage Actions can transform how customers manage their data. For each, we’ll look at the business problem, the traditional approach, and how Storage Actions makes it easier, including the exact conditions and operations you can use.
Scenario 1: Content Lifecycle for Brand Teams
Business Problem:
Brand and marketing teams manage large volumes of creative assets (videos, design files, campaign materials) that evolve through multiple stages and often carry licensing restrictions. These assets need to be retained, frozen, or archived based on their lifecycle and usage rights. Traditionally, teams rely on scripts or manual workflows to manage this, which can be error-prone, slow, and difficult to scale.
How Storage Actions Helps:
Azure Storage Actions enables brand teams to automate content lifecycle management using blob metadata, blob index tags, or both. With a single task definition using an IF and ELSE structure, teams can apply different operations to blobs based on their stage, licensing status, and age without writing or maintaining scripts.
Example in Practice:
Let’s say a brand team manages thousands of creative assets (videos, design files, campaign materials), each tagged with blob metadata that reflects its lifecycle stage and licensing status. For instance:
- Assets that are ready for public use are tagged with asset-stage = final
- Licensed or restricted-use content is tagged with usage-rights = restricted
Over time, these assets accumulate in your storage account, and you need a way to:
- Ensure that licensed content is protected from accidental deletion or modification
- Archive older final assets to reduce storage costs
- Apply these rules automatically, without relying on scripts or manual reviews
With Azure Storage Actions, the team can define a single task that evaluates each blob and applies the appropriate operation using a simple IF and ELSE structure:
IF:
- Metadata.Value["asset-stage"] equals "final"
- AND Metadata.Value["usage-rights"] equals "restricted"
- AND creationTime earlier than 60 days ago
THEN:
- SetBlobLegalHold: This locks the blob to prevent deletion or modification, ensuring compliance with licensing agreements.
- SetBlobTier to Archive: This moves the blob to the Archive tier, significantly reducing storage costs for older content that is rarely accessed.
ELSE
- SetBlobTier to Cool: If the blob does not meet the above criteria, whether because it is a draft, unlicensed, or recently created, it is moved to the Cool tier.
Once this Storage Action is created and assigned to a storage account, it is scheduled to run automatically every week. During each scheduled run, the task evaluates every blob in the target container or account. For each blob, it checks whether the asset is marked as final, has usage-rights set to restricted, and is older than 60 days. If all these conditions are met, the blob is locked with a legal hold to prevent accidental deletion and then archived to optimize storage costs. If the blob does not meet all of these criteria, it is moved to the Cool tier, ensuring it remains accessible but stored more economically. This weekly automation ensures that every asset is managed appropriately based on its metadata, without requiring manual intervention or custom scripts.
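To make the IF and ELSE branching concrete, here is a minimal local sketch of the same decision in Python. It is not the Storage Actions API or its condition syntax: the function name plan_operations, the operation strings, and the sample dates are illustrative assumptions, and the real evaluation happens inside the service.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local model of the task logic; none of these names come from an Azure SDK.
AGE_THRESHOLD = timedelta(days=60)


def plan_operations(metadata, creation_time, now):
    """Return the operations the task would apply to a single blob."""
    is_final = metadata.get("asset-stage") == "final"
    is_restricted = metadata.get("usage-rights") == "restricted"
    is_old = now - creation_time > AGE_THRESHOLD
    if is_final and is_restricted and is_old:
        # IF branch: protect the licensed asset, then archive it.
        return ["SetBlobLegalHold", "SetBlobTier:Archive"]
    # ELSE branch: drafts, unlicensed assets, and recent assets go to Cool.
    return ["SetBlobTier:Cool"]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)  # fixed date so the example is repeatable
old_licensed = plan_operations(
    {"asset-stage": "final", "usage-rights": "restricted"},
    now - timedelta(days=90),
    now,
)
recent_draft = plan_operations({"asset-stage": "draft"}, now - timedelta(days=5), now)
print(old_licensed)
print(recent_draft)
```

Note that a blob missing either metadata key falls through to the ELSE branch, which mirrors how the task treats anything that does not satisfy every condition.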
Scenario 2: Audit-Proof Model Training
Business Problem:
In machine learning workflows, ensuring the integrity and reproducibility of training data is critical, especially when models influence regulated decisions in sectors like automotive, finance, healthcare, or legal compliance. Months or even years after a model is deployed, auditors or regulators may request proof that the training data has not been altered since the model was built.
Traditionally, teams try to preserve training datasets by duplicating them into backup storage, applying naming conventions, and manually restricting access. These methods are error-prone, hard to enforce at scale, and lack auditability.
How Storage Actions Helps:
Storage Actions enables teams to automate the preservation of validated training datasets using blob tags and immutability policies. Once a dataset is marked as clean and ready for training, Storage Actions can automatically:
- Lock the dataset using a time-based immutability policy
- Apply a tag to indicate it is a snapshot version
This ensures that the dataset cannot be modified or deleted for the duration of the lock, and it is easily discoverable for future audits.
Example in Practice:
Let’s say an ML data pipeline tags a dataset with stage=clean after it passes validation and is ready for training. Storage Actions detects this tag and springs into action.
- It enforces a 1-year immutability policy, which means the dataset is locked and cannot be modified or deleted for the next 12 months.
- It also applies a tag snapshot=true, making it easy to locate and reference in future audits or investigations.
The following conditions and operations define the task logic:
IF:
- Tags.Value[stage] equals 'clean'
THEN:
- SetBlobImmutabilityPolicy for 1 year: This applies a write once, read many (WORM) immutability policy to the blob to prevent deletion or modification, ensuring compliance.
- SetBlobTags with snapshot=true: This adds a blob index tag with name “snapshot” and value “true”.
Whenever this task runs on its scheduled interval, such as daily or weekly, it checks each blob for the tag stage = 'clean' and automatically applies the configured operations to every match. In this case, Storage Actions applies a SetBlobImmutabilityPolicy on the blob for one year and adds a snapshot=true tag for easy identification.
This means that without any manual intervention:
- The blob is made immutable for 12 months, preventing any modifications or deletions during that period.
- A snapshot=true tag is applied, making it easy to locate and audit later.
- No scripts, manual tagging, or access restrictions are needed to enforce data integrity.
This ensures that validated training datasets are preserved in a tamper-proof state, satisfying audit and compliance requirements. It also reduces operational overhead by automating what would otherwise be a complex and error-prone manual process.
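As a rough illustration of the task logic above, the following Python sketch models what the task decides for one blob. It runs locally and does not call Azure: plan_snapshot and the tuple format are hypothetical, and modeling "1 year" as 365 days is an assumption made for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 1-year lock is modeled as 365 days for this illustration.
RETENTION = timedelta(days=365)


def plan_snapshot(tags, now):
    """Return (operation, argument) pairs for one blob, or [] if the blob is skipped."""
    if tags.get("stage") != "clean":
        return []  # IF condition not met: the task leaves the blob alone
    expires_on = now + RETENTION
    return [
        ("SetBlobImmutabilityPolicy", expires_on.isoformat()),
        ("SetBlobTags", {"snapshot": "true"}),
    ]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)  # fixed date so the example is repeatable
clean_plan = plan_snapshot({"stage": "clean"}, now)
raw_plan = plan_snapshot({"stage": "raw"}, now)
print(clean_plan)
print(raw_plan)
```

Because this task has no ELSE branch, blobs that are not tagged stage=clean produce an empty plan and are left untouched.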
Scenario 3: Embedding Management in AI Workflows
Business Problem:
Modern AI systems, especially those using Retrieval-Augmented Generation (RAG), rely heavily on vector embeddings to represent and retrieve relevant context from large document stores. These embeddings are often generated in real time, chunked into small files, and stored in vector databases or blob storage. As usage scales, these systems generate millions of small embedding files, many of which become obsolete quickly due to frequent updates, re-indexing, or model version changes.
This silent accumulation of stale embeddings leads to:
- Increased storage costs
- Slower retrieval performance
- Operational complexity in deciding when each set of embeddings can be safely removed
Traditionally, teams write scripts to purge old embeddings based on timestamps, run scheduled jobs, and manually monitor usage. This approach is brittle and does not scale well.
How Storage Actions Helps:
Storage Actions enables customers to automate the management of embeddings using blob tags and metadata. When blobs are identified with tags and metadata such as embeddings=true and version=latest, customers can define conditions that automatically delete stale embeddings without writing custom scripts.
Example in Practice:
In production RAG systems, embeddings are frequently regenerated to reflect updated content, new model versions, or refined chunking strategies. For example, a customer support chatbot may re-index its knowledge base daily to ensure responses are grounded in the latest documentation.
To avoid bloating storage with outdated vector embeddings, Storage Actions can automate cleanup with task conditions and operations such as:
IF:
- Tags.Value[embeddings] equals 'true'
- AND NOT Tags.Value[version] equals 'latest'
- AND creationTime earlier than 12 days ago
THEN:
- DeleteBlob: This deletes all blobs which match the IF condition criteria.
Whenever this Storage Action runs on its scheduled interval, such as daily, it scans for blobs that have the tag embeddings = 'true', are not tagged as the latest version, and are more than 12 days old. For each match, it automatically initiates the configured operation. In this case, Storage Actions performs a DeleteBlob operation on the blob.
This means that without any manual intervention:
- The stale embeddings are deleted.
- No scripts or scheduled jobs are needed to track and purge them.
This ensures that embeddings from superseded model versions are removed once they are more than 12 days old, keeping the vector store lean and performant. It also reduces storage costs by eliminating obsolete data and helps maintain retrieval accuracy by ensuring outdated embeddings do not interfere with current queries.
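The cleanup condition can be sketched in Python as a simple filter. This is a local model only, not the Storage Actions condition language: is_stale, the sample blob names, and the tag values other than those in the condition above are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=12)


def is_stale(tags, creation_time, now):
    """Mirror the three IF conditions: embedding blob, not the latest version, older than 12 days."""
    return (
        tags.get("embeddings") == "true"
        and tags.get("version") != "latest"
        and now - creation_time > MAX_AGE
    )


now = datetime(2025, 1, 1, tzinfo=timezone.utc)  # fixed date so the example is repeatable

# Hypothetical inventory: (blob name, index tags, creation time).
blobs = [
    ("chunk-001.json", {"embeddings": "true", "version": "v1"}, now - timedelta(days=30)),
    ("chunk-002.json", {"embeddings": "true", "version": "latest"}, now - timedelta(days=30)),
    ("chunk-003.json", {"embeddings": "true", "version": "v1"}, now - timedelta(days=3)),
    ("report.pdf", {}, now - timedelta(days=400)),
]

to_delete = [name for name, tags, created in blobs if is_stale(tags, created, now)]
print(to_delete)
```

Only chunk-001.json is selected. The latest-version blob, the recent blob from an older version, and the untagged file are all kept, which is worth checking before enabling a delete operation on real data.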
Applying Storage Actions to Storage Accounts
To apply any of these scenarios, customers create an assignment during storage task resource creation. In the assignment creation flow, they select the appropriate role and configure filters and trigger details.
For example, a compliance cleanup scenario might run across the entire storage account with a recurring schedule every seven days to remove non-compliant blobs. A cost optimization scenario could target a specific container using a blob prefix and run as a one-time task to archive older blobs. A bulk tag update scenario would typically apply to all blobs without filtering and use a recurring schedule to keep tags consistent. After setting start and end dates, specifying the export container, and enabling the task, clicking Add queues the action to run on the account.
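One way to reason about these assignment choices before opening the portal is to write them down as plain data and sanity-check them. The Python sketch below does that. AssignmentSketch and all of its field names are hypothetical and do not correspond to an Azure SDK type or the ARM schema. The container name, prefix, and dates are placeholders.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AssignmentSketch:
    """Hypothetical container for the settings named above; not an Azure SDK type."""
    blob_prefix: Optional[str]    # None means the whole storage account
    recurring: bool               # False means a one-time run
    interval_days: Optional[int]  # only meaningful for recurring runs
    start: date
    end: Optional[date]
    export_container: str
    enabled: bool = True

    def problems(self):
        """Return a list of human-readable issues; an empty list means the sketch is consistent."""
        issues = []
        if self.recurring and not self.interval_days:
            issues.append("recurring run needs interval_days")
        if not self.recurring and self.interval_days:
            issues.append("one-time run should not set interval_days")
        if self.end is not None and self.end < self.start:
            issues.append("end date precedes start date")
        if not self.export_container:
            issues.append("export container is required")
        return issues


# Whole account, every seven days (the compliance cleanup example).
compliance_cleanup = AssignmentSketch(None, True, 7, date(2025, 1, 1), date(2025, 12, 31), "task-reports")
# One container prefix, single run (the cost optimization example).
cost_optimization = AssignmentSketch("campaigns/2023/", False, None, date(2025, 1, 1), None, "task-reports")
# Deliberately inconsistent settings to show the checks.
broken = AssignmentSketch(None, True, None, date(2025, 6, 1), date(2025, 1, 1), "")

print(compliance_cleanup.problems())
print(cost_optimization.problems())
print(broken.problems())
```

The first two sketches report no problems, while the third lists the missing interval, the reversed dates, and the missing export container.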
Learn More
If you are interested in exploring Storage Actions further, there are several resources to help you get started and deepen your understanding:
- Documentation on Getting Started: https://learn.microsoft.com/en-us/azure/storage-actions/storage-tasks/storage-task-quickstart-portal
- Create a Storage Action from the Azure Portal: https://portal.azure.com/#create/Microsoft.StorageTask
- Azure Storage Actions pricing: https://azure.microsoft.com/en-us/pricing/details/storage-actions/#pricing
- Azure Blog about the GA announcement: https://azure.microsoft.com/en-us/blog/unlock-seamless-data-management-with-azure-storage-actions-now-generally-available/
- Azure Skilling Video with a walkthrough of Storage Actions: https://www.youtube.com/watch?v=CNdMFhdiNo8
Have questions, feedback, or a scenario to share?
Drop a comment below or reach out to us at storageactions@microsoft.com. We would love to hear how you are using Storage Actions and what scenarios you would like to see next!