Azure Files
How to Save 70% on File Data Costs
In the final entry in our series on lowering file storage costs, DarrenKomprise shares how Komprise can help lower on-premises and Azure-based file storage costs. Komprise and Azure offer you a means to optimize unstructured data costs now and in the future!

Hybrid File Tiering Addresses Top CIO Priorities of Risk Control and Cost Optimization
This article describes how you can leverage Komprise Intelligent Tiering for Azure with any on-premises file storage platform and Azure Blob Storage to reduce your costs by 70% and shrink your ransomware attack surface.

Note: This article has been co-authored by Komprise and Microsoft.

Unstructured data plays a big role in today's IT budgets and risk factors

Unstructured data, which is any data that does not fit neatly into a database or tabular format, has been growing exponentially and is now projected by analysts to make up over 80% of business information. Unstructured data is commonly referred to as file data, which is the terminology used for the rest of this article.

File data has caught some IT leaders by surprise because it now consumes a significant portion of IT budgets, with no sign of slowing down. File data is expensive to manage and retain because it is typically stored and protected by replication to an identical storage platform, which can be very expensive at scale. This article reviews how you can easily identify hot and cold data and transparently tier cold files to Azure with Komprise to cut costs and shrink ransomware exposure.

Why file data is factoring into CIO priorities

CIOs are prioritizing cost optimization, risk management, and revenue improvement as key priorities for their data; 56% chose cost optimization as their top priority according to the 2024 Komprise State of Unstructured Data Management survey. File data is often retained for decades, its growth rate is in the double digits, and it can easily reach petabytes. Keeping a primary copy, a backup copy, and a DR copy means three or more copies of a large volume of file data, which becomes prohibitively expensive. On the other hand, file data has largely been untapped in terms of value, but businesses are now realizing its importance for training and fine-tuning AI models. Smart solutions are required to balance these competing requirements.

Why file data is vulnerable to ransomware attacks

File data is arguably the most difficult data to protect against ransomware attacks because it is open to many different users, groups, and applications. This increases risk because a single user's or group's mistake can lead to a ransomware infection. If an infected file is shared and accessed again, the infection can quickly spread across the network undetected, and the risk grows the longer the ransomware lurks. For these reasons, you cannot ignore file data when creating a ransomware defense strategy.

How to leverage Azure to cut the cost and inherent risk of file data retention

You can cut costs and shrink the ransomware attack surface of file data using Azure even when you still require on-premises access to your files. The key is reducing the amount of file data that is actively accessed and thus exposed to ransomware attacks. Since 80% of file data is typically cold and has not been accessed in months (see Demand for cold data storage heats up | TechTarget), transparently offloading these files to immutable storage through hybrid tiering cuts both costs and risks. Hybrid tiering offloads entire files from the data storage, snapshot, backup, and DR footprints, while your users continue to see and access the tiered files without any change to your application processes or user behavior.
Unlike storage tiering, which is typically offered by the storage vendor and moves blocks of files, still controlled by the storage filesystem, into Azure, hybrid tiering operates at the file level and transparently offloads the entire file to Azure while leaving behind a link that looks and behaves like the file itself.

Hybrid tiering offloads cold files to Azure to cut costs and shrink the ransomware attack surface:

Cut 70%+ costs: By offloading cold files rather than blocks, hybrid tiering can shrink the amount of data you are storing and backing up by 80%, which cuts costs proportionately. As shown in the example below, you can cut 70% of file storage and backup costs by using hybrid tiering.

Assumptions:
- Amount of data on NAS: 1,024 TB
- Cold data: 80%
- Annual data growth rate: 30%
- On-prem NAS cost/GB/month: $0.07
- Backup cost/GB/month: $0.04
- Azure Blob Cool cost/GB/month: $0.01
- Komprise Intelligent Tiering for Azure/GB/month: $0.008

Cost comparison for 1 PB per year (on-prem NAS only vs. on-prem NAS + Azure Intelligent Tiering):
- Data on on-premises NAS: 1,024 TB vs. 205 TB
- Snapshots: 30% vs. 30%
- Cost of on-prem NAS, primary site: $1,064,960 vs. $212,992
- Cost of on-prem NAS, DR site: $1,064,960 vs. $212,992
- Backup cost: $460,800 vs. $42,598
- Data on Azure Blob Cool: 0 vs. 819 TB
- Cost of Azure Blob Cool: $0 vs. $201,327
- Cost of Komprise: n/a vs. $100,000
- Total cost for 1 PB per year: $2,590,720 vs. $769,909
- Savings per PB per year: $1,820,811 (70%)

Shrink the ransomware attack surface by 80%: Offloading cold files to immutable Azure Blob removes them from the active attack surface, eliminating 80% of the storage, DR, and backup costs while also providing a potential recovery path if the cold files get infected. Because Komprise tiers to immutable Azure Blob with versioning, even if someone tried to infect a cold file, it would be saved as a new version, enabling recovery from an older version. Learn more about Azure Immutable Blob storage here.

In addition to cost savings and improved ransomware defense, the benefits of hybrid cloud tiering using Komprise and Azure are:

- Leverage Existing Storage Investment: You can continue to use your existing NAS storage and use Komprise to tier cold files to Azure. Users and applications continue to see and access the files as if they were still on-premises.
- Leverage Azure Data Services: Komprise maintains file-object duality with its patented Transparent Move Technology (TMT), which means the tiered files can be viewed and accessed in Azure as objects, allowing you to use Azure Data Services natively. This enables you to leverage the full power of Azure with your enterprise file data.
- Works Across Heterogeneous Vendor Storage: Komprise works across all your file and object storage to analyze and transparently tier data to Azure file and object tiers.
- Ongoing Lifecycle Management in Azure: Komprise continues to manage the data lifecycle in Azure, so as data gets colder, it can move from Azure Blob Cool to Cold to Archive tiers based on policies you control.

Azure and Komprise customers are already using hybrid tiering to improve their ransomware posture while reducing costs. A great example is Katten.

Global law firm saves $900,000 per year and achieves resilient ransomware defense with Komprise and Azure

Katten Muchin Rosenman LLP (Katten) is a full-service law firm delivering legal services across more than a dozen practice areas and sectors, including Aviation, Construction, Energy, Education, Entertainment, Healthcare and Real Estate.
Like many other large law firms, Katten has been seeing an average of 20% annual growth in file data, resulting in the need to add on-premises storage capacity every 12-18 months. With a focus on managing data storage costs in an environment where data grows exponentially but cannot be deleted, Katten needed a solution that could provide deep data insights and the ability to move file data, as it ages, to immutable object storage in the cloud for greater cost savings and ransomware protection. Katten implemented hybrid tiering using Komprise Intelligent Tiering to Azure and leveraged immutable Blob storage to not only save $900,000 annually but also improve its ransomware defense posture. Read how Katten does hybrid tiering to Azure using Komprise.

Summary: Hybrid tiering helps CIOs optimize file costs and cut ransomware risks

Cost optimization and risk management are top CIO priorities, and file data is a major contributor to both costs and ransomware risks. Organizations are leveraging Komprise to tier cold files to Azure while continuing to use their on-premises NAS. This provides a low-risk approach with no disruption to users and apps while cutting costs by 70% and shrinking the ransomware attack surface by 80%.

Next steps

To learn more and get a customized assessment of your savings, visit the Azure Marketplace listing or contact azure@komprise.com.

General Availability: Azure Active Directory Kerberos with Azure Files for hybrid identities
We are excited to announce General Availability of Azure Files integration with Azure Active Directory (Azure AD) Kerberos for hybrid identities. With this release, identities in Azure AD can mount and access Azure file shares without the need for line-of-sight to an Active Directory domain controller.

Azure Files provisioned v2 billing model for flexibility, cost savings, and predictability
We are excited to announce the general availability of the Azure Files provisioned v2 billing model for the HDD (standard) media tier. Provisioned v2 offers a provisioned billing model, meaning that you pay for what you provision, which enables you to flexibly provision storage, IOPS, and throughput. This allows you to migrate your general-purpose workloads to Azure at the best price and performance, without sacrificing price predictability. With provisioned v2, you have granular control to scale your file share alongside your workload needs, whether you are connecting from a remote client, operating in hybrid mode with Azure File Sync, or running an application in Azure. The provisioned v2 model enables you to dynamically scale your application's performance up or down as needed, without downtime. Provisioned v2 file shares can span from 32 GiB to 256 TiB in size, with up to 50,000 IOPS and 5 GiB/sec throughput, providing the flexibility to handle both small and large workloads.

If you're an existing user of Azure Files, you may be familiar with the current "pay-as-you-go" model for the HDD (standard) media tier. While this model is conceptually simple, since you pay for the storage and transactions used, usage-based pricing can be very challenging to work with because it is difficult or impossible to accurately predict the usage on a file share. Without knowing how much usage you will drive, especially in terms of transactions, you can't make accurate predictions about your Azure Files bill ahead of time, making planning and budgeting difficult. The provisioned v2 model solves all these problems, and more.

Increased scale and performance

In addition to the usability improvements of a provisioned model, we have significantly increased the limits over the current pay-as-you-go model:

- Maximum share size: 100 TiB (102,400 GiB) for pay-as-you-go vs. 256 TiB (262,144 GiB) for provisioned v2
- Maximum share IOPS: 40,000 IOPS (recently increased from 20,000 IOPS) vs. 50,000 IOPS
- Maximum share throughput: variable based on region, split between ingress/egress, vs. 5 GiB/sec symmetric throughput

The larger limits offered on the HDD media tier in the provisioned v2 model mean that as your storage requirements grow, your file share can keep pace without the need to resort to unnatural workarounds such as sharding, allowing you to keep your data in logical file shares that make sense for your organization.

Per share monitoring

Since provisioning decisions are made at the file share level, the provisioned v2 model brings the granularity of monitoring down to the file share level. This is a significant improvement over pay-as-you-go file shares, which can only be monitored at the storage account level. To help you monitor the usage of storage, IOPS, and throughput against the provisioned limits of the file share, we've added the following new metrics:

- Transactions by Max IOPS, which provides the maximum IOPS used over the indicated time granularity.
- Bandwidth by Max MiB/sec, which provides the maximum throughput in MiB/sec used over the indicated time granularity.
- File Share Provisioned IOPS, which tracks the provisioned IOPS of the share on an hourly basis.
- File Share Provisioned Bandwidth MiB/s, which tracks the provisioned throughput of the share on an hourly basis.
- Burst Credits for IOPS, which helps you track your IOPS usage against bursting.

To use the metrics, navigate to the specific file share in the Azure portal and select "Monitoring > Metrics".
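Outside the portal, the same share-level metrics can also be queried with Azure CLI. The sketch below is illustrative only: the az monitor metrics commands and their parameters are standard, but the metric identifier and the FileShare dimension filter are assumptions based on the display names above, and the subscription, resource group, account, and share names are placeholders.

```bash
#!/usr/bin/env bash
# Hedged sketch: query share-level metrics for an Azure file share.
# The metric name and FileShare dimension are assumptions based on the display
# names in the article; confirm them with `az monitor metrics list-definitions`.

RESOURCE_ID="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>/fileServices/default"

# List metric definitions first to confirm the exact metric names available.
az monitor metrics list-definitions --resource "$RESOURCE_ID" --output table

# Peak IOPS used per hour over the last day, filtered to one share (assumed dimension name).
az monitor metrics list \
  --resource "$RESOURCE_ID" \
  --metric "Transactions by Max IOPS" \
  --aggregation Maximum \
  --interval PT1H \
  --offset 1d \
  --filter "FileShare eq '<share-name>'" \
  --output table
```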
In the portal, select the metric you want, in this case "Transactions by Max IOPS", and ensure that the chart is filtered to the specific file share you want to examine.

How to get access to the provisioned v2 billing model?

The provisioned v2 model is generally available now, at the time of writing, in a limited set of regions. When you create a storage account in a region that has been enabled for provisioned v2, you can create a provisioned v2 account by selecting "Standard" for Performance and "Provisioned v2" for File share billing (a hedged CLI sketch of a full deployment appears after the pricing example setup below). See how to create a file share for more information.

When creating a share in a provisioned v2 storage account, you can specify the capacity and use the recommended performance. The recommendations we provide for IOPS and throughput are based on common usage patterns. If you know your workload's performance needs, you can manually set the IOPS and throughput to further tune your share. As you use your share, you may find that your usage pattern changes or that your usage is more or less active than your initial provisioning. You can always increase your storage, IOPS, and throughput provisioning to right-size for growth, and you can decrease any provisioned quantity after 24 hours have elapsed since your last increase. Storage, IOPS, and throughput changes take effect within a few minutes of a provisioning change.

In addition to your baseline provisioned IOPS, we provide credit-based IOPS bursting that enables you to burst up to 3X the amount of provisioned IOPS for up to 1 hour, or as long as credits remain. To learn more about credit-based IOPS bursting, see provisioned v2 bursting.

Pricing example

To see the new provisioned v2 model in action, let's compare the costs of the pay-as-you-go model versus the provisioned v2 model for the following Azure File Sync deployment:

Storage: 50 used TiB

For the pay-as-you-go model, we need usage expressed as the total number of "transaction buckets" for the month:
- Write: 3,214
- List: 7,706
- Read: 7,242
- Other: 90

For the provisioned v2 model, we need usage expressed as the maximum IOPS and throughput (in MiB/sec) hit over an average time period to guide our provisioning decision:
- Maximum IOPS: 2,100 IOPS
- Maximum throughput: 85 MiB/sec

To deploy a file share using the pay-as-you-go model, you need to pick an access tier to store the data in: transaction optimized, hot, or cool. The correct access tier depends on the activity level of your data: a very active share should pick transaction optimized, while a comparatively inactive share should pick cool. Based on the activity level of this share as described above, cool is the best choice. When you deploy the share, you need to provision more than you use today to ensure the share can support your application as your data continues to grow. Ultimately, how much to provision is up to you, but a good rule of thumb is to start with 2X what you use today. There's no need to keep your share at a consistent provisioned-to-used ratio.
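Pulling the deployment steps above together for this example (50 used TiB provisioned at roughly 2X, with headroom over the observed 2,100 IOPS and 85 MiB/sec), a minimal Azure CLI sketch might look like the following. Treat it as an assumption-laden illustration: the StandardV2_LRS SKU and the --provisioned-iops / --provisioned-bandwidth-mibps parameter names reflect our understanding of the provisioned v2 model and should be verified against current documentation, and all resource names are placeholders.

```bash
#!/usr/bin/env bash
# Hedged sketch: deploy an HDD provisioned v2 account and file share.
# SKU and parameter names are assumptions; confirm with `az storage account create --help`
# and `az storage share-rm create --help` before use.

RG="my-rg"
ACCOUNT="mypv2account"
LOCATION="<region-with-provisioned-v2>"

# 1. Create a storage account that uses the provisioned v2 (HDD) billing model.
az storage account create \
  --resource-group "$RG" \
  --name "$ACCOUNT" \
  --location "$LOCATION" \
  --kind FileStorage \
  --sku StandardV2_LRS

# 2. Create the share: ~100 TiB provisioned (2X the 50 used TiB), with IOPS/throughput headroom.
az storage share-rm create \
  --resource-group "$RG" \
  --storage-account "$ACCOUNT" \
  --name "myshare" \
  --quota 102400 \
  --provisioned-iops 4200 \
  --provisioned-bandwidth-mibps 170

# 3. Later, scale provisioning up (anytime) or down (24 hours after the last increase).
az storage share-rm update \
  --resource-group "$RG" \
  --storage-account "$ACCOUNT" \
  --name "myshare" \
  --provisioned-iops 3000 \
  --provisioned-bandwidth-mibps 125
```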
Now we have all the necessary inputs to compare costs:

HDD pay-as-you-go (cool access tier) cost components:
- Used storage: 51,200 GiB * $0.015 / GiB = $768.00
- Write transactions: 3,214 * $0.1300 / bucket = $417.82
- List transactions: 7,706 * $0.0650 / bucket = $500.89
- Read transactions: 7,242 * $0.0130 / bucket = $94.15
- Other transactions: 90 * $0.0052 / bucket = $0.47
- Total cost: $1,781.33 / month
- Effective price per used GiB: $0.0348

HDD provisioned v2 cost components:
- Provisioned storage: 51,200 used GiB * 2 * $0.0073 / GiB = $747.52
- Provisioned IOPS: 2,100 IOPS * 2 * $0.0402 / IOPS = $168.84
- Provisioned throughput: 85 MiB/sec * 2 * $0.0599 / MiB/sec = $10.18
- Total cost: $926.54 / month
- Effective price per used GiB: $0.0181

In this example, the pay-as-you-go file share costs $0.0348 / used GiB while the provisioned v2 file share costs $0.0181 / used GiB, a ~2X cost improvement for provisioned v2 over pay-as-you-go. Shares with different levels of activity will have different results; your mileage may vary. Typically, when deploying a file share for the first time, you would not know what the transaction usage will be, making cost projections for the pay-as-you-go model quite difficult. But it is still straightforward to compute the provisioned v2 costs. If you don't know specifically what your IOPS and throughput utilization will be, you can use the built-in recommendations as a starting point.

Resources

Here are some additional resources on how to get started:
- Azure Files pricing page
- Understanding the Azure Files provisioned v2 model | Microsoft Docs
- How to create an Azure file share | Microsoft Docs (follow the steps for creating a provisioned v2 storage account/file share)

Announcing the Public Preview of Metadata Caching for Azure Premium SMB File Shares
Azure Files is excited to announce the public preview of Metadata Caching for the premium SMB file share tier. Metadata Caching is an enhancement aimed at reducing metadata latency for file workloads running on Windows and Linux clients. In addition to lowering metadata latency, workloads will observe a more consistent latency experience, which makes metadata-intensive workloads more predictable and deterministic. Reduced metadata latency also translates to more data IOPS (reads/writes) and throughput. Once Metadata Caching is enabled, there is no additional cost or operational management overhead for using this feature.

The following metadata APIs benefit from Metadata Caching:
- Create: creating a new file; up to 30% faster
- Open: opening a file; up to 55% faster
- Close: closing a file; up to 45% faster
- Delete: deleting a file; up to 25% faster

Workloads that perform a high volume of metadata operations (creating/opening/closing/deleting) against a premium SMB file share will see the biggest benefit, compared to workloads that are primarily data IO (e.g., databases). Examples of metadata-heavy workloads include:
- Web/app services: frequently accessed files for CMS/LMS services such as Moodle or WordPress.
- Indexing/batch jobs: large-scale processing using Azure Kubernetes or Azure Batch.
- Virtual desktop infrastructure: Azure Virtual Desktop or Citrix users with home directories, or VDI applications with general-purpose file system needs.
- Business applications: custom line-of-business or legacy applications with "lift and shift" needs.
- CI/CD - DevOps pipelines: building, testing, and deployment workloads such as Jenkins open-source automation.

Expected performance improvements with Metadata Caching:
- 2-3x improvement in metadata latency consistency
- Metadata latency improvements beyond 30%
- Increased IOPS and bandwidth, up to 60%

How to get started

To begin onboarding to the public preview, please sign up at https://aka.ms/PremiumFilesMetadataCachingPreview and additional details will be provided.

Regions supported:
- Australia East
- Brazil Southeast
- France South
- Germany West Central
- Switzerland North
- UAE North
- UAE Central
- US West Central

Updates will be provided as additional regions are supported; please sign up above to help influence region prioritization.

Who should participate?

Whether you have a new workload looking to leverage file shares or an existing one looking for improvements, any workload or usage pattern that involves metadata is encouraged to onboard, specifically metadata-heavy workloads that consist primarily of Create/Open/Close or Delete requests. To determine whether your workload contains metadata, you can use Azure Monitor to split transactions by the API dimension, as described in the following article.

Thanks,
Azure Files Team

Accelerate metadata heavy workloads with Metadata Caching preview for Azure Premium Files SMB & REST
Azure Files previously announced the limited preview of Metadata Caching, highlighting metadata latency improvements (up to 55%) for workloads running on Azure Premium Files using SMB and REST. Now, we are excited to announce the unlimited public preview, lighting up this capability on both new and existing shares in a broader set of regions. You can now automatically onboard your subscriptions to leverage this functionality using feature registration (AFEC) in supported regions.

Feature overview

Metadata Caching is an enhancement aimed at reducing metadata latency by up to 55% for file workloads running on Windows and Linux environments. In addition to lower metadata latency, workloads will observe a 2-3x improvement in latency consistency, making metadata-intensive workloads more predictable and deterministic. Workloads that perform a high volume of metadata operations (e.g., AI/ML) will see a bigger benefit than workloads with high data IO (e.g., databases). Reduced metadata latency also translates to up to a 3x increase in metadata scale and up to a 60% increase in data IOPS (reads/writes) and throughput.

Examples of metadata-heavy workloads include:
- Web/app services: frequently accessed files for CMS/LMS services such as Moodle or WordPress.
- Indexing/batch jobs: large-scale processing using Azure Kubernetes or Azure Batch.
- Virtual desktop infrastructure: Azure Virtual Desktop or Citrix users with home directories, or VDI application management needs.
- Business applications: custom line-of-business or legacy applications with "lift and shift" needs.
- CI/CD DevOps pipelines: building, testing, and deployment workloads such as Jenkins open-source automation.

Building DevOps solutions using Metadata Caching

Moodle deployment + Azure Premium Files with Metadata Caching: Moodle consists of server hosting (cloud platforms), a database (MySQL, PostgreSQL), file storage (Azure Premium Files), and a PHP-based web server. It is used for course management (uploading materials, assignments, quizzes), user interaction (students accessing resources, submitting work, and discussions), and performance monitoring (tracking progress, reporting). Metadata Cache benefit: a faster and more consistent user experience.

GitHub Actions + Azure Premium Files with Metadata Caching: GitHub Actions is an automation tool integrated with GitHub that allows developers to build, test, and deploy code directly from their repositories. It uses workflows, defined in YAML files, to automate tasks such as running tests, building software, or deploying applications. These workflows can be triggered by events like code pushes, pull requests, or scheduled times. Metadata Cache benefit: shorter build and deployment times when using Azure Premium Files with Metadata Caching for build artifacts.

How to get started

To get started, register your subscription with the Metadata Cache feature using the Azure portal or PowerShell (a hedged CLI sketch follows below). For regional availability, please visit the following link. Note: as we extend region support for the Metadata Cache feature, premium file storage accounts in those regions will be automatically onboarded for all subscriptions registered with the Metadata Caching feature.

Who should participate?

Whether you have a new workload looking to leverage file shares or an existing one looking for improvements, any workload or usage pattern that involves metadata is encouraged to onboard, specifically metadata-heavy workloads that consist primarily of Create/Open/Close or Delete requests.
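The article points to the Azure portal or PowerShell for feature registration; an equivalent Azure CLI flow is sketched below. The az feature commands are standard, but the feature name shown is a placeholder, and the actual AFEC feature name for Metadata Caching should be taken from the official onboarding guidance.

```bash
#!/usr/bin/env bash
# Hedged sketch: register a subscription for the Metadata Caching preview via AFEC.
# "<MetadataCachingFeatureName>" is a placeholder; use the feature name given in the
# official onboarding documentation.

az feature register \
  --namespace Microsoft.Storage \
  --name "<MetadataCachingFeatureName>"

# Check registration state (it can take a while to move to "Registered").
az feature show \
  --namespace Microsoft.Storage \
  --name "<MetadataCachingFeatureName>" \
  --query "properties.state"

# Propagate the registration to the resource provider once registered.
az provider register --namespace Microsoft.Storage
```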
To determine whether your workload contains metadata, you can use Azure Monitor to split transactions by the API dimension, as described in the following article; a hedged CLI sketch of that query follows below.
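For reference, a minimal Azure CLI sketch of splitting file share transactions by the ApiName dimension is shown below. The subscription, resource group, and account names are placeholders; the Transactions metric and ApiName dimension are standard Azure Storage metrics.

```bash
#!/usr/bin/env bash
# Sketch: break down Azure Files transactions by operation (ApiName) to gauge
# how metadata-heavy a workload is. Placeholders: <sub-id>, <rg>, <account>.

RESOURCE_ID="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>/fileServices/default"

# Total transactions over the last day, split per API (Create, Open, Close, Delete, Read, Write, ...).
az monitor metrics list \
  --resource "$RESOURCE_ID" \
  --metric "Transactions" \
  --aggregation Total \
  --interval PT1H \
  --offset 1d \
  --dimension ApiName \
  --output table
```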
Thanks,
Azure Files Team

For questions, please email azfilespreview@microsoft.com

Azure File share NFS Snapshots is now Public Preview!!

Azure Files is a fully managed cloud file share service that enables organizations to share data across on-premises and cloud. The service is truly cross-platform: it supports mounting file shares from any client that implements the SMB or NFS protocols, and it also exposes REST APIs for programmability. A key part of the file share service offering is its integrated backup for point-in-time recovery, which enables recovery of data from earlier points in time in case data is deleted or corrupted. Such capability is best offered by snapshots.

We are excited to announce the public preview of snapshot support for NFS shares. Customers using NFS shares can now perform share-level snapshot management operations via REST API, PowerShell, and CLI. Using snapshots, users can roll back entire filesystems or pull specific files that were accidentally deleted or corrupted. It is therefore recommended to create a snapshot schedule that best suits your RPO (recovery point objective) requirements. Snapshot schedule frequency can be hourly, daily, weekly, or monthly; this flexibility helps IT infrastructure teams serve a wide spectrum of RPO requirements suiting business needs.

Although there are many scenarios where snapshots can benefit users, this article highlights two widely sought-after scenarios:
- Scenario #1: Recover files in case of accidental deletions, corruption, or user errors.
- Scenario #2: Start up a read-only replica of your application or database in a few minutes to serve reporting or analytics scenarios.

Scenario #1

Recovery of data after accidental deletion or corruption is the most common scenario admins face in their day-to-day operations. Solutions like backup (creating full and incremental copies of data) help recover data in such scenarios, but snapshot technology offers more frequent recovery points (lower RPO) than backups. Snapshots are also space efficient, since they capture only incremental changes. Creating snapshots of an NFS file share is straightforward and can be accomplished via the portal, REST, PowerShell, or CLI. To access file share snapshots from an NFS client and perform single-file restores after accidental deletion or corruption:
1. Mount the file share.
2. cd into the ".snapshot" directory under the share root to view the snapshots that have already been created. The ".snapshot" directory is hidden by default, but users can access and read from it like a normal directory.
3. Each snapshot listed under the .snapshot directory is a recovery point in itself; cd into the specific snapshot to view the files to be recovered.
4. Copy the required files and directories from the snapshot to the desired location using the cp command to complete the restore.

Scenario #2

If you have an application or a database deployed on an NFS file share, you can create crash-consistent or application-consistent snapshots of the share. Crash consistency is offered by default, but application-consistent snapshots are not a built-in capability: they require admins to run a few additional steps to quiesce and unquiesce the application during the snapshot creation process. For example, for a MySQL database you can write a script that executes a 3-step process - quiesce (MySQL), snapshot (file share), unquiesce (MySQL) - to create an application-consistent snapshot of the database hosted on the file share, as sketched below.
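The following is a minimal, hedged sketch of that quiesce/snapshot/unquiesce sequence for MySQL. The storage account and share names are placeholders, the snapshot is taken here with the az storage share snapshot CLI command as one possible option (an authenticated az session is assumed), and the locking approach should be adapted to your MySQL configuration and tested before production use.

```bash
#!/usr/bin/env bash
# Hedged sketch: application-consistent snapshot of a MySQL database hosted on an
# NFS Azure file share. Placeholders: <account>, <share>; assumes `az login` has
# already been done and the MySQL root password is in MYSQL_ROOT_PASSWORD.
set -euo pipefail

ACCOUNT="<account>"
SHARE="<share>"

# The read lock only protects the snapshot while this client session stays open,
# so the snapshot is taken from inside the session via the mysql client's
# SYSTEM command (verify this behaves as expected in your environment).
mysql --user=root --password="${MYSQL_ROOT_PASSWORD}" <<EOF
-- 1. Quiesce: flush tables and block writes for the duration of this session.
FLUSH TABLES WITH READ LOCK;
-- 2. Snapshot the file share while the lock is held.
SYSTEM az storage share snapshot --account-name ${ACCOUNT} --name ${SHARE}
-- 3. Unquiesce: release the lock so writes can resume.
UNLOCK TABLES;
EOF
```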
Quiesce and unquiesce commands vary depending on the application or database hosted on the file share. Such application-consistent snapshots can be mounted directly on the desired NFS client and used as read-only replicas for reporting and data analytics use cases. The mounted snapshots give applications or databases read-only, static copies of the production database for analytics or reporting. They can also be copied to another location, after which applications can be allowed to perform changes and writes.

To improve copy performance, especially for large datasets with many files, mount the NFS snapshot using the nconnect mount option (available on recent Linux distributions) and use fpsync to copy data out of the snapshot to the desired location; a brief example follows the references below. Sample scripts are updated here.

For more information refer to the documentation:
- Mount an NFS Azure file share on Linux | Microsoft Learn
- Snapshot Share (FileREST API) - Azure Files | Microsoft Learn
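As a concrete illustration of the copy-performance tip above, the sketch below mounts the share with nconnect and copies a snapshot's contents out with fpsync. The storage account, share, snapshot directory name, destination path, and job count are placeholders; check the mount options against the Azure Files NFS documentation for your distribution.

```bash
#!/usr/bin/env bash
# Hedged sketch: copy data out of an NFS share snapshot using nconnect + fpsync.
# Placeholders: <account>, <share>, <snapshot-name>, /mnt/destination.
set -euo pipefail

MOUNT_POINT="/mnt/<share>"
sudo mkdir -p "$MOUNT_POINT"

# Mount the NFS 4.1 share with multiple TCP connections (nconnect) for higher throughput.
sudo mount -t nfs \
  -o vers=4,minorversion=1,sec=sys,nconnect=4 \
  "<account>.file.core.windows.net:/<account>/<share>" "$MOUNT_POINT"

# Copy the snapshot contents in parallel; -n sets the number of concurrent sync jobs.
fpsync -n 8 "$MOUNT_POINT/.snapshot/<snapshot-name>/" "/mnt/destination/"
```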
Soft delete for NFS Azure file shares is now Generally Available.

Soft delete protects your Azure file shares from accidental deletion. This feature was already available for SMB file shares; today, we are announcing the general availability of soft delete for NFS Azure file shares, with the same functionality. Soft delete is like a recycle bin for your file shares. When an NFS file share is deleted, it transitions to a soft-deleted state in the form of a soft-deleted snapshot. As part of the retention policy, you configure how long the soft-deleted data remains recoverable before it is permanently erased; by default, this is set to 7 days.

Today, soft-deleted NFS shares are not counted towards the storage account limit. We are providing a 30-day grace period to update your automation scripts to account for soft-deleted capacity when soft delete is enabled for NFS shares. By September 1st, 2024, we will roll out the change to start counting soft-deleted capacity towards the account limit.

What will change from September 1st, 2024? Soft-deleted capacity will start counting towards storage account limits. This means the number of file shares you can create on a given storage account is governed by the storage account limits when soft delete is enabled. Refer here for more information on supported capacity limits.

Soft-deleted shares are listed under deleted shares in the file share blade. To mount them or view their contents, you must undelete them. Upon undelete, the share is recovered to its previous state, including all metadata and snapshots (Previous Versions). To successfully perform an undelete, ensure you do not have an active file share with the same name as the deleted one.

The soft delete feature is enabled by default on a storage account, and the setting applies to both NFS and SMB file shares. If you have an existing NFS file share in a soft-delete-enabled account, you are enrolled for billing automatically. Soft-delete-enabled shares are billed on the used capacity, at the snapshot rate, while in the soft-deleted state. Billing stops once the data is permanently deleted after the retention period expires. Please refer here for pricing and billing details.
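For readers who manage these settings from the command line, a hedged Azure CLI sketch of checking the soft delete policy and restoring a soft-deleted share is shown below. Resource names are placeholders, and the parameter names (in particular --include-deleted and --deleted-version) should be verified against the current az storage share-rm reference.

```bash
#!/usr/bin/env bash
# Hedged sketch: inspect soft delete settings and undelete an NFS file share.
# Placeholders: <rg>, <account>, <share>, <deleted-version>. Verify parameters
# with `az storage account file-service-properties --help` and `az storage share-rm --help`.

RG="<rg>"
ACCOUNT="<account>"

# 1. View (or adjust) the soft delete retention policy for the file service.
az storage account file-service-properties show \
  --resource-group "$RG" --account-name "$ACCOUNT"

az storage account file-service-properties update \
  --resource-group "$RG" --account-name "$ACCOUNT" \
  --enable-delete-retention true --delete-retention-days 7

# 2. List shares, including soft-deleted ones, to find the deleted version identifier.
az storage share-rm list \
  --resource-group "$RG" --storage-account "$ACCOUNT" --include-deleted

# 3. Undelete (restore) the share using its deleted version.
az storage share-rm restore \
  --resource-group "$RG" --storage-account "$ACCOUNT" \
  --name "<share>" --deleted-version "<deleted-version>"
```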