Azure Compute Blog

Streamline Cloud Spend with Azure Reserved VM Instances

kyleikeda
Oct 29, 2025

Cloud costs can feel like a runaway train, especially if you’re running GPU-heavy workloads for AI. Every hour of compute adds up, and before long, actual spend starts drifting away from your forecast.

This is where Azure Reserved Virtual Machine Instances can help. When customers commit to a specific virtual machine size in a region for a set term, they receive a discounted price.

In this blog, we’ll follow a fictitious company, Contoso, that is looking to optimize its VM spend while scaling its AI workloads. Contoso was training generative models and running inference on NC64as T4 v3 VMs in East US. These VMs are built for GPU acceleration, making them ideal for AI workloads such as deep learning inference, model fine-tuning, and batch processing at scale.

Performance was non-negotiable, but cost predictability was slipping away.

So how did they turn things around? By purchasing Azure Reserved VM Instances and leveraging Azure’s built-in tools.

What are Azure Reserved VM Instances?

Azure Reserved VM Instances (RIs) are an Azure commitment offer that provides a discount when you commit to Azure usage for 1 or 3 years. You pick a VM instance, region, and term, and Azure automatically applies the discount to any matching VM you run. No manual assignment, no runtime changes, just lower compute costs.

Why does this matter? Because if your workloads are predictable and stable, RIs can save you up to 72% compared to pay-as-you-go pricing for Windows and Linux virtual machines. For Contoso, that meant turning unpredictable GPU costs into a predictable, CFO-approved budget.
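
To make that concrete, here’s a quick back-of-the-envelope comparison in Python. The hourly rates and VM count are hypothetical placeholders, not published Azure pricing; plug in real numbers from the Azure pricing calculator for your region.

    # Hypothetical numbers for illustration only -- check the Azure pricing
    # calculator for real NC64as T4 v3 rates in your region.
    HOURS_PER_MONTH = 730

    payg_rate = 4.35       # assumed pay-as-you-go $/hour (placeholder)
    reserved_rate = 1.74   # assumed effective $/hour with a 3-year RI (placeholder)
    vm_count = 10

    payg_monthly = payg_rate * HOURS_PER_MONTH * vm_count
    reserved_monthly = reserved_rate * HOURS_PER_MONTH * vm_count
    savings_pct = 100 * (1 - reserved_monthly / payg_monthly)

    print(f"Pay-as-you-go: ${payg_monthly:,.0f}/month")
    print(f"Reserved:      ${reserved_monthly:,.0f}/month")
    print(f"Savings:       {savings_pct:.0f}%")

The exact percentage depends on the VM series, region, term, and payment option, which is exactly why the next step matters.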


The challenge: “How do we start?”

Contoso had questions:

  • What if we resize VMs during model experiments?
  • What if usage shifts across subscriptions as teams grow?
  • How do we avoid overbuying and wasting money?

That’s where Azure’s ecosystem comes in:

Step 1: Let Azure Advisor do the math

Instead of guessing, Contoso opened Azure Advisor. Within seconds, they saw:

  • Which VM family (NCasT4_v3) had the most consistent usage, based on historical data.
  • How many RIs to purchase.
  • Whether a 1-year or 3-year term made sense.

Azure Advisor even flagged idle VMs they could shut down before buying, so they wouldn’t purchase reservations for resources that weren’t actually being used.
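
Advisor does this analysis for you, but the idea behind it is worth seeing: look at how many matching VMs were actually running each hour over a lookback window, and reserve only the steady baseline. Here’s a rough Python sketch of that logic using made-up usage data (the 90% threshold is an arbitrary illustration, not Advisor’s actual algorithm):

    # Illustrative only: Advisor analyzes your real usage history; here we fake
    # a month of hourly VM counts and reserve only the steady baseline.
    import random

    random.seed(0)
    # Hypothetical hourly count of running NCasT4_v3 VMs over 30 days.
    hourly_vm_counts = [8 + random.choice([0, 0, 1, 2]) for _ in range(30 * 24)]

    # Reserve the quantity that was in use at least ~90% of the time, so the
    # reservation stays highly utilized and the spiky remainder stays pay-as-you-go.
    sorted_counts = sorted(hourly_vm_counts)
    p10_index = int(len(sorted_counts) * 0.10)
    recommended_quantity = sorted_counts[p10_index]

    print(f"Peak concurrent VMs:      {max(hourly_vm_counts)}")
    print(f"Recommended reservations: {recommended_quantity}")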

Step 2: Choosing the right scope

Here’s where Contoso faced a critical choice: scope. Scope determines where the RI discount applies:

  • Shared scope: Applies across all subscriptions in the same billing context.
  • Management group: Covers subscriptions grouped under a management hierarchy.
  • Single subscription: Simple, but limited to one subscription.
  • Resource group: Limited to a single resource group; the most granular option, useful for precise chargeback.

Contoso had multiple teams running AI workloads in different subscriptions but under the same billing account. They wanted maximum utilization without losing visibility for chargeback. After reviewing governance policies, they chose Shared scope as it gave them broad coverage while keeping reporting simple in Cost Management.
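
If it helps to picture how scope affects where the discount lands, here’s a tiny Python sketch of the idea. The subscription names and matching rules are simplified, hypothetical stand-ins (management group scope is omitted for brevity); Azure’s billing system does the real matching automatically.

    # Hypothetical model of reservation scope matching -- Azure applies this
    # automatically at billing time; this only illustrates the concept.
    def discount_can_apply(scope_type, scope_value, vm_subscription,
                           vm_resource_group, billing_subscriptions):
        if scope_type == "shared":
            # Any subscription in the same billing context is eligible.
            return vm_subscription in billing_subscriptions
        if scope_type == "single_subscription":
            return vm_subscription == scope_value
        if scope_type == "resource_group":
            # scope_value is a (subscription, resource_group) pair here.
            return (vm_subscription, vm_resource_group) == scope_value
        return False

    billing_subs = {"sub-ai-research", "sub-ai-prod"}  # placeholder names
    print(discount_can_apply("shared", None,
                             "sub-ai-prod", "rg-inference", billing_subs))   # True
    print(discount_can_apply("single_subscription", "sub-ai-research",
                             "sub-ai-prod", "rg-inference", billing_subs))   # False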

Step 3: Buying RIs the smart way

Based on their Azure Advisor recommendations, Contoso purchased:

  • NC64as T4 v3 RIs in East US for inference clusters (3-year term, upfront for maximum savings).

They scoped the reservations to Shared so discounts applied across multiple subscriptions. And they enabled instance size flexibility, meaning their NC64as T4 v3 RI could also cover smaller sizes in the NCasT4_v3 family if they scaled down during experiments. No lock-in panic.
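
Instance size flexibility works through size ratios within a VM family: a reservation for a larger size can cover several smaller ones. The ratios in this sketch are illustrative placeholders, not the published values; check Azure’s instance size flexibility ratio table for the real numbers.

    # Placeholder ratios for the NCasT4_v3 family -- consult Azure's published
    # instance size flexibility ratios for the authoritative values.
    SIZE_RATIOS = {
        "NC4as_T4_v3": 1,
        "NC8as_T4_v3": 2,
        "NC16as_T4_v3": 4,
        "NC64as_T4_v3": 16,
    }

    def covered_smaller_vms(reserved_size, target_size, reserved_quantity=1):
        """How many VMs of target_size one reservation of reserved_size can cover."""
        return (SIZE_RATIOS[reserved_size] * reserved_quantity) // SIZE_RATIOS[target_size]

    # One NC64as_T4_v3 reservation could cover smaller sizes if Contoso scales down.
    print(covered_smaller_vms("NC64as_T4_v3", "NC16as_T4_v3"))  # 4, with these placeholder ratios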

Step 4: Monitoring like a pro

After purchase, Contoso didn’t just walk away. They set up:

  • Utilization alerts in Microsoft Cost Management (triggered if utilization dipped below a predetermined threshold; a minimal version of that check is sketched below).
  • Auto-renewal to avoid unexpected spikes in cost when their RIs expired.
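
The alert logic itself is simple: compare used reserved hours against purchased reserved hours and notify when utilization falls below your threshold. Here’s a minimal sketch, assuming you’ve already pulled the utilization figures from Cost Management; the reservation name, hours, and threshold are placeholders.

    # Minimal utilization check -- the numbers here are placeholders; in practice
    # they come from Microsoft Cost Management (portal, exports, or alerts).
    ALERT_THRESHOLD = 0.80   # alert if less than 80% of reserved hours are used

    def check_reservation_utilization(reservation_id, used_hours, reserved_hours):
        utilization = used_hours / reserved_hours
        if utilization < ALERT_THRESHOLD:
            print(f"ALERT: {reservation_id} at {utilization:.0%} utilization "
                  f"(threshold {ALERT_THRESHOLD:.0%})")
        else:
            print(f"OK: {reservation_id} at {utilization:.0%} utilization")

    check_reservation_utilization("ri-nc64as-eastus", used_hours=540, reserved_hours=730)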

When they added new inference nodes for a production rollout, the RI discount applied automatically. When they resized a few training VMs for smaller models, instance size flexibility kept the savings intact.

The result? Big savings, zero compromise

Contoso cut GPU compute costs compared to pay-as-you-go. They kept performance rock solid, freed up budget for more experiments, and gave their CFO the predictability they craved.

What Contoso learned

  • Start with Advisor: It’s like having a FinOps analyst on demand.
  • Scope smartly: Shared scope boosts utilization, but align with your chargeback model.
  • Flexibility matters: Enable instance size flexibility for future-proofing.
  • Monitor relentlessly: Alerts and Cost Management dashboards keep you ahead of surprises.

Combine RIs with an Azure savings plan for compute if some of your workloads are less predictable and need additional flexibility.

Why this matters for you

If you’re running AI workloads, or any predictable compute, Reserved Instances aren’t just a cost-saving trick. They’re a strategy for financial predictability, operational flexibility, and cloud efficiency.

Ready to start?
Visit the Azure portal to purchase your Reserved Instance today or read the Azure Reserved Instance documentation to learn more.
