Managing data integrity in cloud storage is critical, especially when working with large datasets across multiple environments. In this post, I will share how I built an Azure Storage Manager using .NET to verify and manage file integrity with Azure Blob and File Storage. The solution uses Azure's Managed Identity feature to authenticate securely without any hardcoded credentials, and it uses MD5 hashes to check that every file stored in the cloud matches its local copy.
Overview of the Application
The Azure Storage Manager is a console application that handles two main tasks: verifying data integrity and ensuring secure authentication. Here's a summary of its features:
- Managed Identity Authentication: Eliminates the need to manage keys or secrets by using Azure's identity framework.
- MD5 Hash Verification: Calculates MD5 hashes for local files and compares them against Azure metadata to ensure data consistency.
- CSV Report Generation: Outputs a comprehensive CSV report that shows file integrity status, with Match or Mismatch results for every file processed.
- Standalone Deployment: The application can be built into a single executable file, making it easy to run without any additional dependencies.
Setting Up Managed Identity for Authentication
A major design principle for this application was to avoid storing credentials directly in the codebase. Instead, I enabled Managed Identity on the Azure Virtual Machine running this application. Here's how to do that:
- Enable Managed Identity: Go to your VM in the Azure Portal, navigate to Identity, and enable the System-assigned managed identity.
- Assign Permissions: Assign the Storage Blob Data Contributor role to this identity for your Azure Storage Account under Access Control (IAM). This will allow the application to perform read and write operations securely.
- Use Azure SDK: In the .NET application, the Azure SDK's DefaultAzureCredential allows the Managed Identity to be used for authentication seamlessly.
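Once the identity and role are in place, the only configuration the application needs is the storage account name, from which the service endpoints are derived. The following is a minimal sketch, not the repository's code: `IsValidStorageAccountName` and `BuildServiceUri` are hypothetical helper names, `mystorageacct` is a placeholder, and the `core.windows.net` suffix assumes the public Azure cloud.

```csharp
using System;
using System.Linq;

// Hypothetical helper: storage account names are 3-24 characters,
// lowercase letters and digits only.
static bool IsValidStorageAccountName(string name) =>
    name.Length >= 3 && name.Length <= 24 &&
    name.All(c => (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'));

// Hypothetical helper: builds the endpoint for a service such as "blob" or "file".
// Assumes the public Azure cloud; sovereign clouds use a different suffix.
static Uri BuildServiceUri(string storageAccountName, string service)
{
    if (!IsValidStorageAccountName(storageAccountName))
        throw new ArgumentException("Invalid storage account name.", nameof(storageAccountName));
    return new Uri($"https://{storageAccountName}.{service}.core.windows.net");
}

// Usage: "mystorageacct" is a placeholder account name.
Console.WriteLine(BuildServiceUri("mystorageacct", "blob"));
Console.WriteLine(BuildServiceUri("mystorageacct", "file"));
```

Validating the name up front turns a typo into an immediate, readable error instead of a DNS or authentication failure later in the run.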
Building the .NET Console Application
Here are the key parts of the Azure Storage Manager application:
- Project Structure: The application is a .NET 9 console app, consisting of service classes (BlobStorageService, FileShareService) and utilities for tasks like MD5 hash calculation and CSV export.
- Authentication Code: Using DefaultAzureCredential, we authenticate to Azure Storage securely:
    var blobServiceClient = new BlobServiceClient(new Uri($"https://{storageAccountName}.blob.core.windows.net"), new DefaultAzureCredential());
This allows the application to request tokens from Azure's Managed Identity endpoint without having to manage sensitive information manually.
- File Integrity Verification: The application generates MD5 hashes for local files, compares these with the hashes stored in Azure, and updates the metadata if it detects a mismatch. Here's a snippet showing how we update metadata when necessary:
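    // Update the blob's "md5" metadata when it is missing or differs from the local hash.
    if (blobHash == null || blobHash != localHash)
    {
        // Note: SetMetadataAsync replaces all existing metadata on the blob.
        await blobClient.SetMetadataAsync(new Dictionary<string, string> { { "md5", localHash } });
        blobProperties = await blobClient.GetPropertiesAsync(); // Confirm the update
    }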
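The local side of this check can be sketched with nothing but the .NET base class library. This is an illustration under stated assumptions rather than the repository's actual code: `ComputeMd5Hex` and `CompareHashes` are hypothetical names, and the sketch assumes the hash is stored as a lowercase hex string.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: streams a file through MD5 and returns the hash as
// lowercase hex, so large files are not loaded into memory at once.
static string ComputeMd5Hex(string path)
{
    using var md5 = MD5.Create();
    using var stream = File.OpenRead(path);
    byte[] hash = md5.ComputeHash(stream);
    return Convert.ToHexString(hash).ToLowerInvariant();
}

// Hypothetical helper: a missing remote hash counts as a mismatch.
// The comparison ignores case because hex casing can differ between tools.
static string CompareHashes(string local, string? remote) =>
    string.Equals(local, remote, StringComparison.OrdinalIgnoreCase) ? "Match" : "Mismatch";

// Usage: hash a temporary file and compare against a missing remote hash.
string sampleFile = Path.GetTempFileName();
File.WriteAllText(sampleFile, "hello");
string sampleHash = ComputeMd5Hex(sampleFile);
Console.WriteLine($"{sampleHash} -> {CompareHashes(sampleHash, null)}");
File.Delete(sampleFile);
```

MD5 is suitable here for detecting accidental corruption during copies; it is not a defense against deliberate tampering.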
Generating the CSV Report
Once the integrity check is complete, the application generates a CSV report (BlobStorageReport.csv) with details for each file, including:
- File Name
- Local MD5 Hash
- Remote MD5 Hash (from Azure Storage)
- Status (Match or Mismatch)
This provides a comprehensive overview of which files have been verified successfully and helps quickly identify any discrepancies.
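A report with these four columns could be produced along the following lines. This is a hedged sketch, not the application's exporter: `EscapeCsv` and `BuildReportLines` are hypothetical names, the exact header text is an assumption based on the column list above, and the sample rows are made-up placeholder values.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: quotes a field when it contains a comma, quote, or line break.
static string EscapeCsv(string value)
{
    if (value.IndexOfAny(new[] { ',', '"', '\n', '\r' }) < 0) return value;
    return "\"" + value.Replace("\"", "\"\"") + "\"";
}

// Hypothetical helper: builds the header plus one line per file,
// deriving Status from a case-insensitive comparison of the two hashes.
static List<string> BuildReportLines(
    IEnumerable<(string FileName, string LocalHash, string RemoteHash)> rows)
{
    var lines = new List<string> { "File Name,Local MD5 Hash,Remote MD5 Hash,Status" };
    foreach (var row in rows)
    {
        string status = string.Equals(row.LocalHash, row.RemoteHash,
            StringComparison.OrdinalIgnoreCase) ? "Match" : "Mismatch";
        var fields = new[] { row.FileName, row.LocalHash, row.RemoteHash, status };
        lines.Add(string.Join(",", fields.Select(f => EscapeCsv(f))));
    }
    return lines;
}

// Usage with placeholder rows; writes the report to the current directory.
var sampleRows = new[]
{
    ("report, final.pdf", "abc123", "abc123"),
    ("data.bin", "abc123", "def456"),
};
File.WriteAllLines("BlobStorageReport.csv", BuildReportLines(sampleRows));
```

Escaping matters because file names can legitimately contain commas, which would otherwise shift the columns when the report is opened in a spreadsheet.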
Publishing the Application as a Standalone Executable
For deployment convenience, the application can be published as a single executable file. This application was built using .NET 9, providing the latest features and performance improvements. The following command builds the app into a single file with all necessary dependencies included:
    dotnet publish -c Release -r win-x64 --self-contained true /p:PublishSingleFile=true /p:IncludeAllContentForSelfExtract=true
Because the build is self-contained and targets win-x64, the app runs on any 64-bit Windows machine without needing a pre-installed .NET runtime.
Challenges and Lessons Learned
- Metadata Verification: Initially, I found that metadata updates were not reflected in the first run. I modified the code to immediately re-query the blob properties after writing the metadata, ensuring the changes were correctly applied.
- Data Integrity Accuracy: During testing, I also learned that removing the step that deleted local files made verification easier: the local copies stayed available for comparison, no data was lost, and troubleshooting became simpler.
- Managed Identity: Setting up Managed Identity and ensuring the appropriate roles were assigned to the VM was a crucial part of making the authentication seamless.
Conclusion
This Azure Storage Manager application demonstrates the power of combining Managed Identity for secure authentication with data integrity checks to ensure your files in Azure are always consistent with their on-premises versions. It removes the hassle of managing credentials and makes data integrity verification simple and efficient.
If you want to explore the full code or adapt it to your own use case, head over to my GitHub repository (link here). Feel free to leave comments or ask questions. I'd love to hear your thoughts on how this could be improved or extended further.
Call to Action
- Try It Out: Download the code, set up your own Azure VM, and see how it works for you. The repository is on GitHub at WernerRall147/AzureStorageManager (a tool that helps generate MD5 hashes to compare copied files).
- Leave Feedback: Let me know if you have ideas for improvement or other scenarios where you could see this approach being helpful.