Cost Optimization Practices for Azure VMs – VM services
Published May 10 2023 10:00 AM
Microsoft

Azure Virtual Machines are an excellent solution for hosting both new and legacy applications. However, as your services and workloads become more complex and demand increases, your costs may also rise. Azure provides a range of pricing models, services, and tools that can help you optimize the allocation of your cloud budget and get the most value for your money.

 

Let’s explore Azure’s various cost-optimization options to see how they can significantly reduce your Azure compute costs.

The major Azure cost optimization options can be grouped into three categories: VM services, pricing models and programs, and cost analysis tools. 

 

Let’s have a quick overview of these three categories:

 

VM services – Several VM services give you options to save, depending on the nature of your workloads. These include dynamically autoscaling VMs according to demand and utilizing spare Azure capacity at a discount of up to 90% versus pay-as-you-go rates.

 

Pricing models and programs – Azure also offers various pricing models and programs that you can take advantage of, depending on your needs and how you plan to spend your Azure budget. For example, committing to purchase compute capacity for a certain time period can lower your average costs per VM by up to 72%.

 

Cost analysis tools – This category covers the tools available for you to calculate, track, and monitor your Azure spend. This insight into your spending allows you to make better decisions about where your compute budget is going and how to allocate it in a way that best suits your needs.

 

When it comes to VMs, the various VM services are probably the first place to start when looking to save costs. While this blog will focus mostly on VM services, stay tuned for blogs about pricing models & programs and cost analysis tools!

 

Spot Virtual Machines

 

Spot Virtual Machines provide compute capacity at drastically reduced costs by leveraging compute capacity that isn’t currently being used. While it’s possible to have your workloads evicted, this compute capacity is discounted by up to 90% compared to pay-as-you-go rates. This makes Spot Virtual Machines ideal for workloads that are interruptible and not time-sensitive, like machine learning model training, financial modeling, or CI/CD.
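
To get a feel for what a discount of this size means, here is a minimal Python sketch of the arithmetic. The hourly rate, discount, and the 730-hour month are illustrative assumptions, not actual Azure prices, and the function names are made up for this example.

```python
def monthly_cost(hourly_rate: float, vm_count: int, hours: float = 730.0) -> float:
    """Cost of running vm_count VMs at hourly_rate for the given hours."""
    return hourly_rate * vm_count * hours


def spot_savings(payg_rate: float, spot_discount: float,
                 vm_count: int, hours: float = 730.0) -> float:
    """Estimated savings of Spot versus pay-as-you-go.

    spot_discount is a fraction between 0 and 1 (0.9 means 90% off).
    All inputs are hypothetical; real Spot prices vary with demand.
    """
    if not 0.0 <= spot_discount <= 1.0:
        raise ValueError("spot_discount must be between 0 and 1")
    payg = monthly_cost(payg_rate, vm_count, hours)
    spot = monthly_cost(payg_rate * (1.0 - spot_discount), vm_count, hours)
    return payg - spot


if __name__ == "__main__":
    # Hypothetical: 10 VMs at $0.10/hour with a 90% Spot discount.
    saved = spot_savings(payg_rate=0.10, spot_discount=0.9, vm_count=10)
    print(f"Estimated monthly savings: ${saved:.2f}")
```

Because the discount is "up to" 90%, it is worth rerunning an estimate like this with more conservative discounts before committing to a plan.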

 

Incorporating Spot VMs can undoubtedly play a key role in your cost savings strategy. Azure provides significant pricing incentives to utilize any current spare capacity.  The opportunity to leverage Spot VMs should be evaluated for every appropriate workload to maximize cost savings.  Let’s learn more about how Spot Virtual Machines work and if they are right for you.

 

Deployment Scenarios

Spot VMs can be ideal in a variety of cases. Let’s look at some examples:

 

  • CI/CD – CI/CD is one of the easiest places to get started with Spot Virtual Machines. The temporary nature of many development and test environments makes them well suited for Spot VMs. A difference of a few minutes to a couple of hours when testing an application is often not business-critical. Thus, deploying CI/CD workloads and build environments with Spot VMs can drastically lower the cost of operating your CI/CD pipeline. Customer story
  • Financial modeling – Creating financial models is also compute-intensive, but often transient in nature. Researchers often struggle to test all the hypotheses they want with non-flexible infrastructure. With Spot VMs, they can add extra compute resources during periods of high demand without having to commit to purchasing a higher amount of dedicated VM resources, creating more and better models faster. Customer story
  • Media rendering – media rendering jobs like video encoding and 3D modeling can require lots of computing resources but may not necessarily demand resources consistently throughout the day.  These workloads are also often computationally similar, not dependent on each other, and not requiring immediate responses.  These attributes make it another ideal case for Spot VMs. For rendering infrastructure often at capacity, Spot VMs are also a great way to add extra compute resources during periods of high demand without having to commit to purchasing a higher amount of dedicated VM resources to meet capacity, lowering overall TCO of running a render farm. Customer story

 

Generally speaking, if a workload is stateless, scalable, or flexible in time, location, and hardware, then it may be a good fit for Spot VMs. While Spot VMs can offer significant cost savings, they are not suitable for all workloads. Workloads that require high availability or consistent performance, or that consist of long-running tasks, may not be a good fit for Spot VMs.
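
The rule of thumb above can be written down as a small checklist. The Python sketch below is only an illustration of that reasoning: the `Workload` fields and the three result labels are made up for this example and are not part of any Azure tool.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Traits that suggest a good fit for Spot VMs.
    stateless: bool = False
    scalable: bool = False
    flexible: bool = False  # flexible in time, location, or hardware
    # Traits that suggest a poor fit for Spot VMs.
    needs_high_availability: bool = False
    needs_consistent_performance: bool = False
    long_running: bool = False


def spot_fit(w: Workload) -> str:
    """Return a rough label: 'poor fit', 'candidate', or 'unclear'."""
    blockers = (w.needs_high_availability,
                w.needs_consistent_performance,
                w.long_running)
    if any(blockers):
        return "poor fit"
    if w.stateless or w.scalable or w.flexible:
        return "candidate"
    return "unclear"


if __name__ == "__main__":
    ci_build = Workload(stateless=True, flexible=True)
    print(spot_fit(ci_build))
```

A checklist like this is a starting point for discussion, not a substitute for testing how a workload actually behaves under eviction.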

 

Features & Considerations

Now that you have learned more about Spot VMs and may be considering using them for your workloads, let’s talk a bit more about how Spot VMs work and the controls available to you to optimize cost savings even further.

 

Spot VMs are priced according to demand. With this flexible pricing model, Spot VMs also give you the ability to set a price limit for the Spot VMs that you use. If demand is high enough that the price for a Spot VM exceeds what you’re willing to pay, you can use this limit to opt out of running your workloads at that time and wait for demand to decrease. If you anticipate that the region where you want to use Spot VMs will have high utilization at certain times of the day or month, you may want to choose another region, or plan to set higher price limits for workloads that run during those higher-demand times. If the time when the workload runs isn’t important, you may opt to set the price limit low, so that your workloads only run during periods when Spot capacity is cheapest, minimizing your Spot VM costs.
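
The effect of a price limit can be sketched with a few lines of Python. The hourly prices below are invented for illustration, and `runnable_hours` is a hypothetical helper, not an Azure API; it simply shows which hours a workload would be allowed to run under a given limit.

```python
from typing import Dict, List


def within_limit(current_price: float, max_price: float) -> bool:
    """True when the current Spot price does not exceed the price limit."""
    return current_price <= max_price


def runnable_hours(hourly_prices: Dict[int, float], max_price: float) -> List[int]:
    """Hours of the day (0-23) whose price is at or below the limit."""
    return sorted(hour for hour, price in hourly_prices.items()
                  if within_limit(price, max_price))


if __name__ == "__main__":
    # Hypothetical Spot prices (USD per hour) at a few hours of the day.
    prices = {0: 0.02, 9: 0.05, 14: 0.06, 22: 0.03}
    print("Low limit: ", runnable_hours(prices, max_price=0.03))
    print("High limit:", runnable_hours(prices, max_price=0.06))
```

A low limit confines the workload to the cheapest hours, while a higher limit trades some savings for more predictable run times.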

 

When using Spot VMs with price limits, we must also look at the eviction types and policies, which determine what happens to your Spot VMs when their capacity is reclaimed for a pay-as-you-go customer. To maximize cost savings, it’s best to consider the Delete eviction policy first. With this policy, VMs can be redeployed faster, meaning less downtime waiting for Spot capacity, and you don’t pay for disk storage after an eviction. However, if your workload is region or size specific and requires some level of persistent data in the event of an eviction, then the Deallocate policy will be a better option.
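
The guidance above boils down to a simple decision. Here is a minimal Python sketch of it; the function and parameter names are hypothetical, and the logic only mirrors the rule of thumb in this post rather than any official decision procedure.

```python
from enum import Enum


class EvictionPolicy(Enum):
    DELETE = "Delete"
    DEALLOCATE = "Deallocate"


def choose_eviction_policy(region_or_size_specific: bool,
                           needs_persistent_data: bool) -> EvictionPolicy:
    """Prefer Delete for cost savings; fall back to Deallocate only when the
    workload is tied to a region or size and must keep data across evictions."""
    if region_or_size_specific and needs_persistent_data:
        return EvictionPolicy.DEALLOCATE
    return EvictionPolicy.DELETE


if __name__ == "__main__":
    # A stateless batch job that can run anywhere.
    print(choose_eviction_policy(False, False).value)
    # A job pinned to a specific VM size that keeps state on its disks.
    print(choose_eviction_policy(True, True).value)
```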

 

These are only a small slice of the considerations for making the best use of Spot VMs. Learn more about best practices for building apps with Spot VMs here.

 

So how can we actually deploy and manage Spot VMs at scale? Using Virtual Machine Scale Sets is likely your best option. Virtual Machine Scale Sets, in addition to Spot VMs, offer a plethora of cost savings features and options for your VM deployments and easily allow you to deploy your Spot VMs in conjunction with standard VMs.  In our next section, we’ll look at some of these features in Virtual Machine Scale Sets and how we can use them to deploy Spot VMs at scale.

 

Virtual Machine Scale Sets

 

Virtual Machine Scale Sets enable you to manage and deploy groups of VMs at scale with a variety of load balancing, resource autoscaling, and resiliency features.  While a variety of these features can indirectly save costs like making deployments simpler to manage or easier to achieve high availability, some of these features contribute directly to reducing costs, namely autoscaling and Spot Mix.  Let’s dive deeper into how these two features can optimize costs.

 

Autoscaling

Autoscaling is a critical feature of Virtual Machine Scale Sets that gives you the ability to dynamically increase or decrease the number of virtual machines running within the scale set. This allows you to scale out your infrastructure to meet demand when it is required, and scale it in when compute demand drops, reducing the likelihood that you’ll be paying for extra VMs when you don’t need them.

 

VMs can be autoscaled according to rules that you define yourself from a variety of metrics. These rules can be based on host-based metrics available from your VM, like CPU usage or memory demand, or on application-level metrics, like session counts and page load performance. This flexibility lets you scale your workload in or out to very specific requirements, and it is with this specificity that you can control your infrastructure scaling to meet your compute demand without extra overhead.
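
To make the idea of a metric-based rule concrete, here is a minimal Python sketch of a CPU-driven scaling decision. The thresholds, step size, and instance limits are illustrative assumptions, and the function is a simplified model of the behavior, not the actual autoscale engine.

```python
def next_instance_count(current: int, avg_cpu_percent: float,
                        scale_out_above: float = 70.0,
                        scale_in_below: float = 30.0,
                        step: int = 1,
                        min_instances: int = 2,
                        max_instances: int = 10) -> int:
    """Return the instance count after applying one scaling evaluation.

    Scale out when average CPU is above scale_out_above, scale in when it is
    below scale_in_below, and always stay within [min_instances, max_instances].
    """
    if avg_cpu_percent > scale_out_above:
        target = current + step
    elif avg_cpu_percent < scale_in_below:
        target = current - step
    else:
        target = current
    return max(min_instances, min(max_instances, target))


if __name__ == "__main__":
    count = 4
    for cpu in (85.0, 90.0, 50.0, 20.0):  # hypothetical CPU samples
        count = next_instance_count(count, cpu)
        print(f"avg CPU {cpu:.0f}% -> {count} instances")
```

Keeping a gap between the scale-out and scale-in thresholds helps avoid adding and removing VMs repeatedly when the metric hovers around a single value.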

You can also scale in or out according to a schedule, for cases in which you can anticipate cyclical changes in VM demand at certain times of the day, month, or year. For example, you can automatically scale out your workload at the beginning of the workday when application usage increases, and then scale in the number of VM instances overnight, when application usage drops, to minimize resource costs. It’s also possible to scale out on specific days when events such as a holiday sale or marketing launch occur. Additionally, for more complex workloads, Virtual Machine Scale Sets also provide the option to leverage machine learning to predictively autoscale workloads according to historical CPU usage patterns.
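
The workday example can be sketched as a small Python function. The capacities and the 8:00 to 18:00 weekday window are assumptions chosen for illustration, and the function only models the idea of a schedule; it is not how schedules are configured in Azure.

```python
from datetime import datetime


def scheduled_capacity(now: datetime,
                       workday_capacity: int = 10,
                       off_hours_capacity: int = 2,
                       start_hour: int = 8,
                       end_hour: int = 18) -> int:
    """Return the target VM count for the given local time.

    Weekdays between start_hour (inclusive) and end_hour (exclusive) get
    workday_capacity; nights and weekends get off_hours_capacity.
    """
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    in_working_hours = start_hour <= now.hour < end_hour
    if is_weekday and in_working_hours:
        return workday_capacity
    return off_hours_capacity


if __name__ == "__main__":
    for moment in (datetime(2023, 5, 10, 9, 0),    # Wednesday morning
                   datetime(2023, 5, 10, 23, 0),   # Wednesday night
                   datetime(2023, 5, 13, 9, 0)):   # Saturday morning
        print(moment.isoformat(), "->", scheduled_capacity(moment))
```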

 

These autoscaling policies make it easy to adapt your infrastructure usage to many variables, and leveraging autoscale rules that best fit your application demand will be critical to reducing cost.

 

Spot Mix

With Spot Mix in Virtual Machine Scale Sets, you can configure your scale-in or scale-out policy to specify a ratio of standard to Spot VMs to maintain as the number of VMs increases or decreases. For example, if you specify a ratio of 50%, then for every 10 new VMs the scale-out policy adds to the scale set, 5 of the machines will be standard VMs, while the other 5 will be Spot. To maximize cost savings, you may want a low ratio of standard to Spot VMs, meaning more Spot VMs will be deployed instead of standard VMs as the scale set grows. This can work well for workloads that don’t need much guaranteed capacity at larger scales. However, for workloads that need greater resiliency at scale, you may want to increase the ratio to ensure adequate baseline standard capacity.
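
The 50% example can be reproduced with a short Python sketch. The parameter names (`standard_percent`, `base_standard`) are hypothetical, and rounding the standard share up is an assumption made here so that fractional results err on the side of guaranteed capacity; the service may round differently.

```python
from typing import Tuple


def spot_mix(total_vms: int, standard_percent: int,
             base_standard: int = 0) -> Tuple[int, int]:
    """Split total_vms into (standard, spot) counts.

    base_standard VMs are always standard; the remainder is split so that
    standard_percent of it is standard (rounded up) and the rest is Spot.
    """
    if not 0 <= standard_percent <= 100:
        raise ValueError("standard_percent must be between 0 and 100")
    base = min(base_standard, total_vms)
    above_base = total_vms - base
    # Integer ceiling of above_base * standard_percent / 100.
    standard_above = (above_base * standard_percent + 99) // 100
    standard = base + standard_above
    return standard, total_vms - standard


if __name__ == "__main__":
    print(spot_mix(10, 50))                    # the 50% example from the text
    print(spot_mix(10, 20, base_standard=2))   # mostly Spot, small standard base
```

Lowering `standard_percent` shifts more of the scale set onto Spot capacity, while a non-zero `base_standard` keeps a floor of standard VMs for resiliency.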

 

You can learn more about choosing which VM families and sizes might be right for you with the VM selector and the Spot Advisor, which we will cover in more depth in a later blog in this VM cost optimization series.

 

Wrapping up

 

We’ve learned how Spot VMs and Virtual Machine Scale Sets, especially when combined, equip you with features and options to control how your VMs behave, and how you can use those controls to maximize your cost savings.

Next time, we’ll go in depth on the various pricing models and programs available in Azure that can optimize your costs even further, allowing you to do more with less with Azure VMs. Stay tuned for more blogs!

Last update: May 15 2023 03:23 PM