Azure Cost Optimisation
Published Sep 25 2022 09:52 PM 11.4K Views
Microsoft

Overview

At Microsoft, I have delivered about 50 Cost Optimisation Assessments for customers as part of the Well-Architected Framework (WAF), and I wanted to share some of the common cost savings that I recommend to my customers, based on real-world experience.

 

I have broken this down into the following components: storage, compute (IaaS), licensing, monitoring and PaaS.

 

As per https://azure.microsoft.com/en-us/solutions/cost-optimization/:

 

MarcKean_2-1665095417712.png

 

I cover these seven, plus more, below.

 

Storage

General-purpose v2 storage accounts support the latest Azure Storage features and incorporate all of the functionality of general-purpose v1 and Blob storage accounts. General-purpose v2 accounts are recommended for most storage scenarios.

  • General-purpose v2 accounts deliver the lowest per-gigabyte capacity prices for Azure Storage, as well as industry-competitive transaction prices.
  • General-purpose v2 accounts support default account access tiers of hot or cool and blob level tiering between hot, cool, or archive.
  • General-purpose v2 accounts also allow you to use lifecycle management to optimize your storage cost.

Best practice would be to upgrade to a general-purpose v2 storage account. There is no downtime or risk of data loss associated with upgrading to a general-purpose v2 storage ....

 

Azure Blob Storage lifecycle management

Azure Blob Storage lifecycle management offers a rich, rule-based policy for GPv2 and blob storage accounts. Use the policy to transition your data to the appropriate access tiers or expire at the end of the data's lifecycle.

 

The lifecycle management policy lets you:

  • Transition blobs from cool to hot immediately if accessed to optimize for performance
  • Transition blobs, blob versions, and blob snapshots to a cooler storage tier (hot to cool, hot to archive, or cool to archive) if not accessed or modified for a period of time to optimize for cost
  • Delete blobs, blob versions, and blob snapshots at the end of their lifecycles
  • Define rules to be run once per day at the storage account level
  • Apply rules to containers or a subset of blobs (using name prefixes or blob index tags as filters)
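
To make the rule structure concrete, here is a minimal sketch of a lifecycle policy built as a Python dictionary and printed as JSON. The rule name, the `logs/app1` prefix and the 30/90/365-day thresholds are assumptions chosen for illustration, not recommendations; the JSON shape follows the policy schema described in the lifecycle management documentation linked below.

```python
import json


def build_lifecycle_policy(prefix, cool_after_days, archive_after_days, delete_after_days):
    """Return a lifecycle management policy document as a plain dict."""
    if not (cool_after_days < archive_after_days < delete_after_days):
        raise ValueError("expected cool < archive < delete thresholds")
    return {
        "rules": [
            {
                "enabled": True,
                "name": "tier-and-expire",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    # Limit the rule to block blobs under one name prefix.
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        # Move to cooler tiers as the blob ages, then delete it.
                        "baseBlob": {
                            "tierToCool": {"daysAfterModificationGreaterThan": cool_after_days},
                            "tierToArchive": {"daysAfterModificationGreaterThan": archive_after_days},
                            "delete": {"daysAfterModificationGreaterThan": delete_after_days},
                        },
                        # Clean up snapshots on the same schedule as the base blob.
                        "snapshot": {
                            "delete": {"daysAfterCreationGreaterThan": delete_after_days},
                        },
                    },
                },
            }
        ]
    }


# Illustrative values only: cool after 30 days, archive after 90, delete after 365.
policy = build_lifecycle_policy("logs/app1", 30, 90, 365)
print(json.dumps(policy, indent=2))
```

The printed JSON is intended as a starting point that you would review and adjust before applying it to a storage account.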

 

More details: https://docs.microsoft.com/en-us/azure/storage/blobs/storage-lifecycle-management-concepts

 

Orphaned Managed Disks

I constantly see customers with many managed disks that are unattached and orphaned. The recommendation here is to delete these if you know you can. Otherwise, from a VM in the same Azure region as the disks (to save on egress costs), use Azure Storage Explorer to download the managed disks as VHD files, copy them to an Azure Storage account, and set the blobs to the Archive access tier.

 

Archive storage is estimated to cost less than 10% of managed disk storage. Note that VHDs can be brought back and imported again as managed disks at any time if they are needed.
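
As a rough sketch of the triage above, assume you have an inventory of disks as simple records. The `name` and `sizeGb` fields and the two per-GB prices are hypothetical placeholders; `managedBy` mirrors the disk property that holds the ID of the VM a disk is attached to. Real managed disk pricing is per disk SKU rather than per GB, so treat the result as an order-of-magnitude estimate only.

```python
def find_unattached_disks(disks):
    """Return the disks that no VM references (empty or missing managedBy)."""
    return [d for d in disks if not d.get("managedBy")]


def monthly_saving(disks, disk_price_per_gb, archive_price_per_gb):
    """Estimate the monthly saving from keeping the given disks as Archive blobs."""
    total_gb = sum(d["sizeGb"] for d in disks)
    return total_gb * (disk_price_per_gb - archive_price_per_gb)


# Hypothetical inventory; the record shape is an assumption for this sketch.
inventory = [
    {"name": "vm1-os", "sizeGb": 128, "managedBy": "/subscriptions/.../virtualMachines/vm1"},
    {"name": "old-data-01", "sizeGb": 512, "managedBy": None},
    {"name": "old-data-02", "sizeGb": 512},
]

orphans = find_unattached_disks(inventory)
# Placeholder per-GB monthly prices; confirm real ones in the pricing calculator.
saving = monthly_saving(orphans, disk_price_per_gb=0.10, archive_price_per_gb=0.002)
print([d["name"] for d in orphans], round(saving, 2))
```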

 

Pricing can be confirmed by using the Azure Pricing Calculator

 

Geo-Redundant Storage Accounts

While GRS offers an extra layer of protection for data, it comes at a cost: data transfer charges to, and duplicate storage in, a secondary region.

 

If you want to add or remove geo-replication or read access to the secondary region, you can use the Azure portal, PowerShell, or Azure CLI to update the replication setting in some scenarios. https://docs.microsoft.com/en-us/azure/storage/common/redundancy-migration?tabs=portal#switch-betwee...

 

In Cost analysis, you can see the cost of GRS replication by filtering on:

  • Service Name: storage
  • Service tier: storage – bandwidth

 

Recovery Services Vault

I had a customer where on one of their GRS vaults, the cost of data replication to the secondary region was about $10K, whereas the storage for the backup of the compute workloads themselves was less than $1K.

 

As per https://azure.microsoft.com/en-us/pricing/details/backup/ GRS backup storage is roughly 2.5 times more expensive than LRS backup storage.

 

There is also Archive tier support for Azure Backup. Azure Backup supports backup of long-term retention points in the archive tier, in addition to snapshots and the Standard tier. See https://docs.microsoft.com/en-us/azure/backup/archive-tier-support

 

Supported workloads for the Archive tier:

  • Azure virtual machines
    • Only monthly and yearly recovery points. Daily and weekly recovery points aren't supported.
    • Age >= 3 months in Vault-Standard Tier
    • Retention left >= 6 months
    • No active daily and weekly dependencies
  • SQL Server in Azure virtual machines
    • Only full recovery points. Logs and differentials aren't supported.
    • Age >= 45 days in Vault-Standard Tier
    • Retention left >= 6 months
    • No dependencies 
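
The criteria above can be sketched as a small eligibility check. This is an illustration of the listed rules only, not how Azure Backup evaluates them: the `RecoveryPoint` fields and workload labels are hypothetical, and months are approximated as 30 days.

```python
from dataclasses import dataclass

# Assumption: months are approximated as 30 days for this sketch.
DAYS_PER_MONTH = 30


@dataclass
class RecoveryPoint:
    workload: str             # "vm" or "sql" (hypothetical labels)
    kind: str                 # vm: daily/weekly/monthly/yearly; sql: full/log/differential
    age_days: int             # time spent in the Vault-Standard tier
    retention_left_days: int  # remaining retention
    has_dependencies: bool    # active dependencies on this recovery point


def eligible_for_archive(rp: RecoveryPoint) -> bool:
    """Apply the archive-tier criteria listed above to one recovery point."""
    if rp.has_dependencies:
        return False
    if rp.retention_left_days < 6 * DAYS_PER_MONTH:
        return False
    if rp.workload == "vm":
        return rp.kind in {"monthly", "yearly"} and rp.age_days >= 3 * DAYS_PER_MONTH
    if rp.workload == "sql":
        return rp.kind == "full" and rp.age_days >= 45
    return False


print(eligible_for_archive(RecoveryPoint("vm", "monthly", 120, 365, False)))
print(eligible_for_archive(RecoveryPoint("sql", "log", 60, 365, False)))
```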

 

Recovery Services Vault & Backup Vaults

There are two types of vaults in Azure:

 

Backup vault is a storage entity in Azure that houses backup data for certain newer workloads that Azure Backup supports. https://docs.microsoft.com/en-us/azure/backup/backup-vault-overview

  • Storage redundancy options: geo-redundant storage (GRS), zone-redundant storage (ZRS) and locally redundant storage (LRS)

Recovery Services vault is a storage entity in Azure that houses data. The data is typically copies of data, or configuration information for virtual machines (VMs), workloads, servers, or workstations https://docs.microsoft.com/en-us/azure/backup/backup-azure-recovery-services-vault-overview

  • Cross Region Restore: Cross Region Restore (CRR) allows you to restore Azure VMs in a secondary region, which is an Azure paired region. By enabling this feature at the vault level, you can restore the replicated data in the secondary region any time, when you choose. This enables you to restore the secondary region data for audit-compliance, and during outage scenarios, without waiting for Azure to declare a disaster (unlike the GRS settings of the vault)

 

MarcKean_3-1665095462766.png

 

How to change from GRS to LRS after configuring backup

I see this with many customers: they have GRS turned on for their vaults, intentionally or not. Most of the time they don't even know it, and it is costing a lot to replicate data to another region even though their BC & DR requirements don't stipulate it. Before deciding to move from GRS to locally redundant storage (LRS), review the trade-offs between lower cost and higher data durability that fit your scenario. If you must move from GRS to LRS, you have two choices, which depend on your business requirements for retaining the backup data:

 

Details: https://docs.microsoft.com/en-us/azure/backup/backup-create-rs-vault#how-to-change-from-grs-to-lrs-a...

 

Compute (IaaS)

Azure Spot Virtual Machines for virtual machine scale sets

Using Azure Spot Virtual Machines on scale sets allows you to take advantage of our unused capacity at significant cost savings. Whenever Azure needs the capacity back, the Azure infrastructure will evict Azure Spot Virtual Machine instances. Azure Spot Virtual Machine instances are therefore a good fit for workloads that can handle interruptions, such as batch processing jobs, dev/test environments and large compute workloads.

 

Create a scale set that uses Azure Spot Virtual Machines - Azure Virtual Machine Scale Sets | Micros...

 

Instance size flexibility

With a reserved virtual machine instance that's optimized for instance size flexibility, the reservation you buy can apply to the virtual machine (VM) sizes in the same instance size flexibility group. For example, if you buy a reservation for a VM size that's listed in the DSv2 Series, like Standard_DS3_v2, the reservation discount can apply to the other sizes that are listed in that same instance size flexibility group:

  • Standard_DS1_v2
  • Standard_DS2_v2
  • Standard_DS3_v2
  • Standard_DS4_v2

But that reservation discount doesn't apply to VM sizes that are listed in different instance size flexibility groups, like SKUs in DSv2 Series High Memory: Standard_DS11_v2, Standard_DS12_v2, and so on. More details here: https://docs.microsoft.com/en-us/azure/virtual-machines/reserved-vm-instance-size-flexibility
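
The grouping rule can be sketched as a simple lookup. The table below contains only the sizes named in this post, so it is deliberately incomplete; the real groups contain more sizes, and the linked page also lists per-size ratios that this sketch ignores.

```python
# Partial table built from the sizes named in this post only (an assumption;
# consult the linked documentation for the full groups).
FLEX_GROUPS = {
    "DSv2 Series": {
        "Standard_DS1_v2", "Standard_DS2_v2", "Standard_DS3_v2", "Standard_DS4_v2",
    },
    "DSv2 Series High Memory": {"Standard_DS11_v2", "Standard_DS12_v2"},
}


def flexibility_group(vm_size):
    """Return the name of the group containing vm_size, or None if unknown."""
    for group, sizes in FLEX_GROUPS.items():
        if vm_size in sizes:
            return group
    return None


def reservation_can_apply(reserved_size, vm_size):
    """True when both sizes sit in the same known instance size flexibility group."""
    group = flexibility_group(reserved_size)
    return group is not None and group == flexibility_group(vm_size)


print(reservation_can_apply("Standard_DS3_v2", "Standard_DS1_v2"))
print(reservation_can_apply("Standard_DS3_v2", "Standard_DS11_v2"))
```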

 

Right-size or shutdown underutilized virtual machines

Running VMs in the cloud that are not sized correctly, or not running at their full potential, only ends up costing money for non-utilized compute, which is essentially waste. Likewise, for VMs that are running but not being used, the best option is to shut them down (deallocate them) when not in use so that you stop being charged for compute.

VMs running on older sizes

As a rule of thumb in Azure, the newest SKUs for any resource, including virtual machines, are generally cheaper. For example, moving from a v3 VM to a v5 VM can save you money, sometimes with an increase in CPU/RAM.

 

Check the virtual machine pricing page for a comparison of VM prices based on current sizes of VMs - https://azureprice.net/

 

Scale sets, Spot Priority Mix

Spot Priority Mix is a feature for Virtual Machine Scale Sets that lets Azure customers mix Standard and Spot Virtual Machines (VMs) in the same scale set, for high availability and cost savings. At the time of writing, this is in preview. The capability is available with flexible orchestration mode and can help customers achieve significant cost savings, given the deep discount rates that Spot VMs usually provide.

 

The flexible orchestration mode of Virtual Machine Scale Sets gives Azure customers the ability to deploy highly available, large-scale cloud infrastructure quickly, reliably and easily, with identical or multiple virtual machine types.

 

Customers can also set up policies that define the percentage allocation of Standard VMs versus Spot VMs. The number of Standard VMs that need to be running at any given time, in addition to the percentage of Spot VMs, can also be defined.
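
To illustrate how such a policy divides capacity, here is a minimal sketch. The parameter names are hypothetical, the percentage is assumed to be a whole number, and rounding the Standard share up is my assumption; check the documentation below for the platform's actual behaviour.

```python
def split_capacity(total, base_standard_count, standard_pct_above_base):
    """Split a scale set's capacity into (standard, spot) instance counts.

    base_standard_count: Standard VMs that should always be running.
    standard_pct_above_base: whole-number percentage of the remaining
    instances that should be Standard; the rest are Spot.
    """
    if total < 0 or base_standard_count < 0:
        raise ValueError("counts must be non-negative")
    if not 0 <= standard_pct_above_base <= 100:
        raise ValueError("percentage must be between 0 and 100")
    base = min(total, base_standard_count)
    above_base = total - base
    # Integer ceiling division: rounding up is an assumption of this sketch.
    extra_standard = (above_base * standard_pct_above_base + 99) // 100
    standard = base + extra_standard
    return standard, total - standard


# 10 instances, 2 always Standard, half of the remainder Standard.
print(split_capacity(10, 2, 50))
```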

 

Learn more:  https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/spot-priority-mix

 

Azure savings plans for compute

Azure savings plans save you money when you have consistent usage of Azure compute resources. The savings can significantly reduce your resource costs by up to 66% from pay-as-you-go prices. Discount rates per meter vary by commitment term (1-year or 3-year), not commitment amount.

 

Savings plans are available for Virtual Machines, Azure Dedicated Hosts, Azure Container Instances, Azure Functions Premium plans and Azure App Service plans. The reach is broader than that list suggests: Virtual Machine Scale Sets (VMSS), AKS node pools (which use VMSS) and Azure Virtual Desktop session hosts all run on Virtual Machines as well.

 

You can trade in Azure reservations for savings plans, but you can’t exchange a savings plan for an Azure reservation. Savings plans can’t be cancelled or refunded.

 

You can scope a savings plan to a shared billing scope, management group, subscription or individual resource group.

 

More details: https://learn.microsoft.com/en-us/azure/cost-management-billing/savings-plan/savings-plan-compute-ov...

A video from the Azure Enablement Show: https://learn.microsoft.com/en-us/shows/azure-enablement/introduction-to-azure-savings-plan-for-comp... 

 

License costs

MarcKean_4-1665095491106.png

 

Significantly reduce costs - up to 72 percent compared to pay-as-you-go prices with one-year or three-year terms on Windows and Linux virtual machines (VMs).

 

When you combine the cost savings gained from Azure RIs with the added value of the Azure Hybrid Benefit, you can save up to 80 percent.

 

Use Hybrid Benefit

Save over the standard pay-as-you-go rate by bringing your Windows Server and SQL Server on-premises licenses to Azure: https://azure.microsoft.com/en-us/pricing/hybrid-benefit/ and https://docs.microsoft.com/en-us/azure/azure-sql/azure-hybrid-benefit

 

Azure Hybrid Benefit for AKS and Azure Stack HCI

At Ignite 2022, it was announced that we are expanding Azure Hybrid Benefit to further reduce costs for on-premises and edge locations. Customers with Windows Server Software Assurance (SA) can use Azure Hybrid Benefit for Azure Kubernetes Service (AKS) and Azure Stack HCI.

 

  • Azure Stack HCI is a hyperconverged infrastructure (HCI) cluster solution that hosts virtualized Windows and Linux workloads and their storage in a hybrid environment

  • Azure Stack HCI is intended as a virtualization host

  • Azure Stack HCI is delivered as an Azure service

  • AKS hybrid deployment options simplify managing, deploying and maintaining a Kubernetes cluster on-premises

  • Run AKS on Windows Server and Azure Stack HCI at no additional cost in datacenter and edge locations. With this, you can deploy and manage containerized Linux and Windows applications from cloud to edge with a consistent, managed Kubernetes service. This applies to Windows Server Datacenter and Standard Software Assurance and Cloud Solution Provider (CSP) customers.

More details: https://azure.microsoft.com/en-us/updates/generally-available-azure-hybrid-benefit-for-aks-and-azure...

 

Flexible Virtualization Benefit

This allows customers with Software Assurance or subscription licenses to run their own licensed software, including Windows Server, on other cloud providers’ infrastructure—dedicated or multitenant (except for listed providers). Additionally, customers can also license Windows Server on a VM basis.

 

Details here: https://cloudblogs.microsoft.com/windowsserver/2022/10/12/maximize-your-windows-server-investments-w...

 

Windows Container Base Image Redistribution Rights Change

In the past, customers could distribute Windows Container base images within their own organisation, but they were not allowed to redistribute them outside it. Starting October 2022, customers will be able to redistribute Windows Container base images beyond their organisation in accordance with the updated End-User License Agreement.

 

Dev/Test benefits

Paying higher rates for non-prod workloads? Azure Dev/Test subscription offers provide special lower Dev/Test rates on Windows Virtual Machines, Cloud Services, SQL Database, SQL Managed Instance, HDInsight, App Service (Basic, Standard, Premium v2, Premium v3) and Logic Apps, as per https://azure.microsoft.com/en-us/offers/ms-azr-0148p/

 

The Enterprise Dev/Test offer is restricted to dev/test usage only, and only by active Visual Studio subscribers.

 

Note: don’t use Azure Hybrid Benefit (formerly HUB) for non-production workloads. Azure Dev/Test subscriptions are intended for those workloads and realise the same cost savings.

 

Reservations

Scope reservations

You can scope a reservation to a subscription or resource groups. Setting the scope for a reservation selects where the reservation savings apply. When you scope the reservation to a resource group, reservation discounts apply only to the resource group—not the entire subscription.

 

Reservation scoping options

You have three options to scope a reservation, depending on your needs:

 

  • Single resource group scope - Applies the reservation discount to the matching resources in the selected resource group only.
  • Single subscription scope - Applies the reservation discount to the matching resources in the selected subscription.
  • Shared scope - Applies the reservation discount to matching resources in eligible subscriptions that are in the billing context. If a subscription is moved to a different billing context, the benefit no longer applies to that subscription and continues to apply to the other subscriptions in the billing context.
    • For Enterprise Agreement customers, the billing context is the enrollment. The reservation shared scope would include multiple Active Directory tenants in an enrollment.
    • For Microsoft Customer Agreement customers, the billing scope is the billing profile.
    • For individual subscriptions with pay-as-you-go rates, the billing scope is all eligible subscriptions created by the account administrator.
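
The three scopes can be sketched as a matching function. This is a simplification for illustration: the dictionary keys are hypothetical, and the sketch ignores everything else a real match depends on, such as resource type, SKU and region.

```python
def discount_applies(reservation, resource):
    """Return True when the resource falls inside the reservation's scope."""
    scope = reservation["scope"]
    if scope == "resource_group":
        # Only matching resources in the selected resource group.
        return (resource["subscription"] == reservation["subscription"]
                and resource["resource_group"] == reservation["resource_group"])
    if scope == "subscription":
        # Any matching resource in the selected subscription.
        return resource["subscription"] == reservation["subscription"]
    if scope == "shared":
        # Any eligible subscription within the same billing context.
        return resource["billing_context"] == reservation["billing_context"]
    raise ValueError(f"unknown scope: {scope}")


# Hypothetical identifiers, for illustration only.
vm = {"subscription": "sub-a", "resource_group": "rg-web", "billing_context": "enrollment-1"}
rg_scoped = {"scope": "resource_group", "subscription": "sub-a", "resource_group": "rg-data"}
shared = {"scope": "shared", "billing_context": "enrollment-1"}

print(discount_applies(rg_scoped, vm))
print(discount_applies(shared, vm))
```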

 

More details: https://docs.microsoft.com/en-us/azure/cost-management-billing/reservations/prepare-buy-reservation

 

You can purchase reservations for many Azure resources to save money. The full list is in the reservations documentation; a few are covered below:

 

Virtual machine reserved instances

Purchase one-year or three-year term Azure Reserved VM Instances directly in the Azure portal, and pay with a single, upfront payment or on a monthly basis. The monthly payment option is available at no extra cost. 

 

More details: https://learn.microsoft.com/en-au/azure/virtual-machines/prepay-reserved-vm-instances 

 

Managed Disks reservations

Azure Disk Storage reservations are available only for select Azure premium SSD SKUs (P30 (1 TiB) premium managed disks and above). The SKU of a premium SSD determines the disk's size and performance. A disk reservation is made per disk SKU. As a result, the reservation consumption is based on the unit of the disk SKUs instead of the provided size. Make sure you track the usage in disk SKUs instead of provisioned or used disk capacity. https://docs.microsoft.com/azure/virtual-machines/disks-reserved-capacity

 

Azure Storage reservations

When you purchase an Azure Storage reservation, you must choose the region, access tier, and redundancy option for the reservation. Your reservation is valid only for data stored in that region, access tier, and redundancy level. Reservations are available today for 100 TiB or 1 PiB blocks, with higher discounts for 1 PiB blocks. See the articles "Understand how reservation discounts are applied to Azure storage services" and "Optimize costs for Blob storage with reserved capacity".

 

Log Analytics Commitment Tiers

As of June 2, 2021, Capacity Reservations are called Commitment Tiers.

 

In addition to the Pay-As-You-Go model, Log Analytics has Commitment Tiers, which can save you as much as 30 percent compared to the Pay-As-You-Go price. With the commitment tier pricing, you can commit to buy data ingestion starting at 100 GB/day at a lower price than Pay-As-You-Go pricing. Any usage above the commitment level (overage) is billed at that same price per GB as provided by the current commitment tier. The commitment tiers have a 31-day commitment period. During the commitment period, you can change to a higher commitment tier (which restarts the 31-day commitment period), but you can't move back to Pay-As-You-Go or to a lower commitment tier until after you finish the commitment period. Billing for the commitment tiers is done on a daily basis.

Learn more about Log Analytics Pay-As-You-Go and Commitment Tier pricing: https://docs.microsoft.com/en-us/azure/azure-monitor/logs/manage-cost-storage

 

Exchanges & refunds

Customers are often tripped up by not understanding this policy. As documented, you have options for both exchanges and refunds of reserved instances, and the policy is generous and flexible.

 

At the time of writing, note:

 

MarcKean_5-1665095525582.png

 

  • You can also refund reservations, but the sum total of all cancelled reservation commitments in your billing scope (such as EA, Microsoft Customer Agreement, and Microsoft Partner Agreement) can't exceed USD 50,000 in a 12-month rolling window.
  • An exchange is processed as a refund and a repurchase
  • For a limited time, you may trade in your Azure reserved instances for compute for a savings plan
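
As a rough illustration of the rolling-window limit, the sketch below checks whether a new refund would stay within the cap. The function name and the record format are hypothetical, the 12-month window is approximated as 365 days, and the sketch only models refunds that you list explicitly.

```python
from datetime import date, timedelta

REFUND_CAP_USD = 50_000            # limit quoted above
WINDOW = timedelta(days=365)       # assumption: 12 months approximated as 365 days


def can_refund(past_refunds, amount, today):
    """Check whether refunding `amount` today keeps the rolling total within the cap.

    past_refunds: iterable of (refund_date, amount_usd) tuples.
    """
    window_start = today - WINDOW
    used = sum(a for d, a in past_refunds if window_start < d <= today)
    return used + amount <= REFUND_CAP_USD


# Hypothetical refund history.
history = [(date(2021, 6, 1), 25_000), (date(2022, 1, 10), 30_000)]
print(can_refund(history, 20_000, today=date(2022, 10, 1)))
print(can_refund(history, 20_001, today=date(2022, 10, 1)))
```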

Also note: Exchanges will be unavailable for Azure reserved instances for compute services purchased on or after January 1, 2024.

Monitoring

As per our Cloud Adoption Framework (CAF) (https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/enterprise-scale/management-an...), the recommendation is to use a single Azure Monitor Logs workspace to manage platforms centrally, except where Azure role-based access control (Azure RBAC), data sovereignty requirements or data retention policies mandate separate workspaces. Centralized logging is critical to the visibility required by operations management teams. Logging centralization drives reports about change management, service health, configuration, and most other aspects of IT operations. Converging on a centralized workspace model reduces administrative effort and the chances for gaps in observability.

 

Commitment Tiers

In addition to the Pay-As-You-Go model, Log Analytics has Commitment Tiers, which can save you as much as 30 percent compared to the Pay-As-You-Go price. With commitment tier pricing, you can commit to buy data ingestion for a workspace, starting at 100 GB/day, at a lower price than Pay-As-You-Go pricing. Any usage above the commitment level (overage) is billed at that same price per GB as provided by the current commitment tier. The commitment tiers have a 31-day commitment period from the time a commitment tier is selected.
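
The pricing rule above can be worked through with a small sketch. All prices here are made-up placeholders (2.50 per GB Pay-As-You-Go, 200 per day for a 100 GB/day tier), so substitute the real figures for your region from the pricing page.

```python
def payg_daily_cost(usage_gb, payg_price_per_gb):
    """Daily cost on Pay-As-You-Go: every ingested GB at the list price."""
    return usage_gb * payg_price_per_gb


def commitment_daily_cost(usage_gb, commit_gb, tier_daily_price):
    """Daily cost on a commitment tier.

    The tier price is paid in full even when usage is below the commitment;
    overage is billed at the tier's effective per-GB price.
    """
    effective_per_gb = tier_daily_price / commit_gb
    overage_gb = max(0.0, usage_gb - commit_gb)
    return tier_daily_price + overage_gb * effective_per_gb


def break_even_gb(tier_daily_price, payg_price_per_gb):
    """Daily ingestion above which the commitment tier becomes cheaper."""
    return tier_daily_price / payg_price_per_gb


# Placeholder prices, for illustration only.
PAYG = 2.50
TIER_GB, TIER_PRICE = 100, 200.0

for usage in (50, 80, 120):
    print(usage, payg_daily_cost(usage, PAYG), commitment_daily_cost(usage, TIER_GB, TIER_PRICE))
print("break-even GB/day:", break_even_gb(TIER_PRICE, PAYG))
```

The point of the break-even figure is that a commitment tier can pay off before you reach the committed volume, because the whole tier is priced below the Pay-As-You-Go rate.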

 

More details: https://learn.microsoft.com/en-us/azure/azure-monitor/logs/cost-logs#commitment-tiers 

 

Azure Monitor basic logs and data archive

With Azure Monitor basic logs and data archive, users can ingest logs at a fifth of current ingestion costs and archive them for up to seven years.

 

By default, all tables in a workspace are Analytics tables, which are available to all features of Azure Monitor and any other services that use the workspace. You can configure certain tables as Basic Logs to reduce the cost of storing high-volume verbose logs you use for debugging, troubleshooting, and auditing, but not for analytics and alerts. Tables configured for Basic Logs have a lower ingestion cost in exchange for reduced features. Query limitations are listed here.

 

Basic Logs tables retain data for eight days. When you change an existing table's plan to Basic Logs, Azure archives data that's more than eight days old but still within the table's original retention period.

 

PaaS

App Service Plans

A common pattern I see is that customers don't make use of autoscale and don't consolidate apps onto App Service plans efficiently. Many have App Service plan sprawl, with one App Service plan per app. You can potentially save money by putting multiple apps into one App Service plan. You can continue to add apps to an existing plan as long as the plan has enough resources to handle the load.

 

Availability Zone support for public multi-tenant App Service

Microsoft Azure App Service can be deployed into Availability Zones (AZ) to help you achieve resiliency and reliability for your business-critical workloads. This architecture is also known as zone redundancy.

 

An app lives in an App Service plan (ASP), and the App Service plan exists in a single scale unit. When an App Service is configured to be zone redundant, the platform automatically spreads the VM instances in the App Service plan across all three zones in the selected region. If a capacity larger than three is specified and the number of instances is divisible by three, the instances will be spread evenly. Otherwise, instance counts beyond 3*N will get spread across the remaining one or two zones. For App Services that aren't configured to be zone redundant, the VM instances are placed in a single zone in the selected region. https://docs.microsoft.com/en-us/azure/app-service/how-to-zone-redundancy

 

There are prerequisites for availability zones, as per https://docs.microsoft.com/en-us/azure/app-service/how-to-zone-redundancy#requirements. Some are listed below; the full list is at that link.

  • Requires either Premium v2 or Premium v3 App Service plans
  • Minimum instance count of three
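
The spreading rule described above can be sketched in a few lines. Which zones receive the extra instances is an assumption of this sketch (here, the first ones); the platform makes that choice itself.

```python
def spread_across_zones(instance_count, zones=3):
    """Return the number of instances placed in each zone.

    Instances are spread evenly; any remainder is placed one per zone
    across the first one or two zones (an assumption of this sketch).
    """
    if instance_count < zones:
        raise ValueError(f"zone redundancy needs at least {zones} instances")
    base, extra = divmod(instance_count, zones)
    return [base + (1 if i < extra else 0) for i in range(zones)]


for n in (3, 6, 7, 8):
    print(n, spread_across_zones(n))
```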

 

App Service Environments

Some details below on app service environments:

 

MarcKean_6-1665096409111.pngMarcKean_7-1665096422912.png

 

The figures below are estimates only, in AUD, from August 2022:

 

MarcKean_0-1665521856156.png

 

 

Azure Advisor

A free tool that you can always refer to in your Azure environment is Azure Advisor. Azure Advisor follows the Well-Architected Framework (WAF) pillars, including cost, and provides many recommendations to help you get the most value from your Azure subscriptions.

 

More details here: https://learn.microsoft.com/en-us/azure/advisor/advisor-reference-cost-recommendations 

 

Last update: Oct 24 2022 02:45 PM