
Azure Infrastructure Blog

Demystifying On-Demand Capacity Reservations

KenHooverMSFT
Apr 03, 2026

Azure's 80+ regions around the globe are provisioned with the necessary hardware to run the platform and its services.  As a practical matter, demand for a particular offering can sometimes grow beyond the capability of a region's infrastructure to actually bring it online for our customers.  When this happens, the request will return an error indicating that insufficient capacity was available to create or start that service. This article is a detailed exploration of an Azure resource that helps ensure that important virtual machines in your environment will be running when you need them.

About On-Demand Capacity Reservations

Introducing the “parking garage” metaphor

There are dozens of VM types available in Azure which span multiple generations of CPU across vendors and architectures.  Within each Azure region are datacenters hosting pools of hardware which runs Azure services, such as virtual machines, of those types.  As VMs are started and stopped by customers there is a constant ebb and flow of available capacity to run each type of VM within the region.  Available capacity is driven by the rhythms of the business day, which creates variations in utilization on an hour-to-hour and even minute-to-minute basis.  Longer cycles of demand such as holiday seasons, school calendars and other real-world events are also a factor. 

When you command an Azure Virtual Machine (VM) to start, the Azure Resource Manager (ARM), the “engine” that manages resources in the Microsoft cloud, needs to do a few things to make it happen.  The most important of these is to identify hardware within the target region with sufficient capacity to bring the desired type and size of VM online at that moment in time.  If ARM finds space for the desired VM size, the VM starts normally.  However, if there is no room to start the desired VM, the request fails with an error indicating that insufficient capacity was available.

This process of finding a place to start up an Azure VM has a lot in common with finding a place to park a vehicle.  Parking facilities are built to handle typical demand for their location.  If something nearby, such as a large sporting event, pushes the need for parking well above normal, you might be out of luck when you try to find a spot because the garage is simply full.

During periods of high demand in Azure this can result in VMs failing to start simply because there is nowhere to run them at that particular moment.  If this happens to a VM that was stopped for a configuration change or another reason, it can impact your environment in ways you certainly want to avoid.

On-Demand Capacity Reservations

Azure has a resource called an On-Demand Capacity Reservation, or ODCR, which allows you to reserve a spot for a VM in the appropriate hardware within a region for a specific VM size.  This is similar to “owning” a parking space: it’s a reserved place exclusively for the use of a specific VM.

At a high level, the way this works is that you create an ODCR which matches the Azure region, availability zone and specific VM type, such as a VM of type D16s_v6 in availability zone 2 of the Canada Central Azure region.  Once the reservation is created, an Azure VM that matches that configuration can be associated with it so the VM now “owns” that “parking space”.  This gives that VM priority over others of the same type when it needs to start because it already has a “parking space” assigned to it that can't be used by another one.

More detail about VM startup

Before we get further into what ODCRs are and how they work, it’s important to know a few more things about starting up a VM. 

Azure does not provide an explicit SLA for VM startup for virtual machines without an ODCR.  The process of finding a hypervisor slot to boot up a VM is purely a “best effort” action on Azure’s part.

Having quota headroom does not help with VM startup.  Quota in Azure is your "credit limit" for creating VMs.  Quota grants permission to create up to a certain number of cores’ worth of Virtual Machines from a particular family (like Ds_v6) but has no effect on whether you can actually start the machine once it’s created.

Similarly, having a Reserved Instance purchase or a Savings Plan for a particular number of cores of a given VM family does not have any impact on the ability to start a VM either.  These are discount mechanisms only: the customer pre-pays for a certain amount of VM cores to be running 24x7 at a discounted rate.

Assigning an ODCR to a virtual machine applies a formal SLA on startup for it.  VMs with ODCRs get priority over ones that don’t so the likelihood of a successful startup is much higher for VMs that have one compared to those that do not, especially during times when Azure is experiencing a period of high demand for that particular VM type.  The actual language of the ODCR SLA can be found in Microsoft's Service Level Agreements for Online Services document which can be downloaded from the linked site.

 

Cost Implications of ODCRs

These are the key points that you need to know about how billing works for ODCRs:

  1. The compute cost for the capacity reservation (the “parking space”) for a VM is exactly the same as a running VM of the same size. There is no “double billing” for a VM to have an ODCR associated with it.

  2. Billing for the ODCR starts immediately if the quantity of reserved "parking spaces" is greater than zero.

  3. Stopping a VM that has an ODCR associated with it does not impact cost. This is because the ODCR is holding the reserved hypervisor slot even if the VM is not running.

  4. Having a Reserved Instance purchase or Savings Plan which covers the same scope as the ODCR means that the VM will be billed at the discounted rate.
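The billing points above can be illustrated with a small worked example. This is only a sketch under stated assumptions: the function name and the hourly rate are hypothetical, discounts from Reserved Instances or Savings Plans are ignored, and "stopped" means the VM is no longer occupying its slot. It models the idea that each reserved slot is billed at the VM rate whether or not a VM is sitting in it, and that a VM occupying a reserved slot is not billed a second time.

```python
def hourly_compute_cost(rate_per_hour: float, reserved: int, running_associated: int) -> float:
    """Hourly compute charge for one reservation plus the VMs associated with it.

    Assumptions (illustrative model, not an official billing formula):
      * every reserved slot is billed at the VM rate, occupied or not
      * a running VM that occupies a reserved slot is not billed again
      * running VMs beyond the reserved quantity are billed as ordinary VMs
    """
    billed_instances = max(reserved, running_associated)
    return rate_per_hour * billed_instances


RATE = 1.00  # hypothetical hourly rate, not a real Azure price

cost_running = hourly_compute_cost(RATE, reserved=2, running_associated=2)
cost_stopped = hourly_compute_cost(RATE, reserved=2, running_associated=0)
cost_empty = hourly_compute_cost(RATE, reserved=0, running_associated=0)

print(cost_running)  # two VMs running in two reserved slots
print(cost_stopped)  # same two slots held while the VMs are stopped
print(cost_empty)    # a reservation with quantity zero
```

In this model the first two cases cost the same, which matches points 1 and 3, and the zero-quantity case costs nothing, which matches point 2.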

Are there any cases where using ODCRs results in paying more for a VM?

There are two cases that I’ve identified where you pay for two ODCRs for the same VM. 

First, if you are using Azure Site Recovery to protect a VM in Azure by replicating it to another location, you have the option to associate the remote replica of the VM with a capacity reservation.  This helps ensure that the replica will start when it’s called upon because it has a pre-allocated spot reserved for it.  In this situation, if the original VM also is associated with an ODCR you are paying for both the original (running) VM and also for the reservation being held for its replica.

Second, and similarly, when setting up replication for a VM that is preparing for migration into Azure via Azure Migrate, you can associate a capacity reservation with the replica for similar reasons to the above ASR example -- to ensure that the VM will start when its migrated replica is activated.  If the source machine is also in Azure then you are again paying twice for the same machine.

When should I use them?

Capacity Reservations are an important element when designing for resiliency.  They help ensure that VMs will be online when needed, even if they have to be shut down for some reason.  For example, there was an incident where a customer had to shut down a VM that was serving as a firewall appliance to make an adjustment to its configuration and it failed to start up afterwards because of a capacity-related failure.  Systems that depended on the firewall lost connectivity, with significant impact, until the customer was able to bring it back online.

Based on field experience and resiliency assessments, applying ODCRs to VMs that must be available 24x7 is strongly recommended. Examples of this include key functions like AD domain controllers, application servers and database servers.  Also, any VM-based appliances that may be running as firewalls, load balancers or other infrastructure-support services should be considered as well.

Microsoft offers assessments which review a workload for gaps that impact resiliency in many dimensions including outages in Azure.  These assessments include checks for the presence of capacity reservations and will report any VMs that do not have them as a high-risk finding.

Not all VM stops in Azure are voluntary

Even if you are careful to never stop a VM yourself it can sometimes happen.  Not every shutdown of a VM in Azure is user-initiated.  Involuntary shutdowns are rare but they can occur due to predictive hardware failures or other events which ARM will respond to by stopping the VM in order to move it out of harm's way.  

Creating On-Demand Capacity Reservations

This section covers the components of an ODCR, the process of creating them and why creating them can fail.

Components of an ODCR:

An ODCR has two components to it.  The first part is a Capacity Reservation Group (CRG) which is simply a "bucket" for any number of capacity reservations.  To create a CRG you only need to provide its name, the region that it will be used for and which availability zones within that region it will have access to.

The second -- and more important -- component is the actual Capacity Reservation which is created within a CRG.  The capacity reservation requires:

  • The name of the reservation. Including the VM size and other details in the name is useful to reduce ambiguity.  An example could be “Zone1_D16s_v5”

  • The specific VM size the reservation is for, such as “D16s_v5”

  • The availability zone of the reservation. You can also create a regional reservation, where the VM is “zoneless”, as well.

  • The number of instances (“parking spaces”) that the reservation holds.

ODCRs can be created via the Azure portal, from the command line using PowerShell or the Azure CLI, or deployed through IaC tools such as Bicep or Terraform.  CRGs can also be shared across subscriptions, which allows a CRG created and managed in one subscription to be utilized by VMs in a different subscription.
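To make the two components concrete, here is a minimal sketch of the data each one carries, written as plain Python dictionaries. The helper names are hypothetical, and the field layout (`location`, `zones`, `sku.name`, `sku.capacity`) reflects my understanding of the Microsoft.Compute resource schema, so check it against the current API or template reference before relying on it. Note that ARM size names carry a `Standard_` prefix.

```python
import json
from typing import List, Optional


def crg_body(location: str, zones: List[str]) -> dict:
    """Body for a Capacity Reservation Group: the region and the zones it can use."""
    return {"location": location, "zones": zones}


def reservation_body(location: str, vm_size: str, capacity: int,
                     zone: Optional[str] = None) -> dict:
    """Body for a capacity reservation inside a CRG.

    capacity is the number of instances the reservation holds.
    Leave zone as None for a regional ("zoneless") reservation.
    """
    body = {"location": location, "sku": {"name": vm_size, "capacity": capacity}}
    if zone is not None:
        body["zones"] = [zone]
    return body


group = crg_body("canadacentral", ["1", "2"])
zone1_reservation = reservation_body("canadacentral", "Standard_D16s_v5", 2, zone="1")

print(json.dumps(group, indent=2))
print(json.dumps(zone1_reservation, indent=2))
```

The reservation name (for example “Zone1_D16s_v5”) is not part of the body in this sketch; it would be supplied as the resource name by whichever tool submits the request.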

When the ODCR is created, if the number of instances it contains is higher than zero then ARM will attempt to allocate the desired number of instances of the specified VM type in the target region/zone.  If there is capacity available for this then the creation succeeds and you can move on to associating machines with it to give them the protection of the ODCR. 

If creating the ODCR is unsuccessful, the cause can be a variety of things, including:

  • No open hypervisor slots for the desired VM in the target location – the “parking lot” was full at the moment the request was submitted. This can result from outages within Azure that reduce capacity as well as demand pressure.

  • There is insufficient quota in the subscription to claim the necessary number of VM cores for the reservation in the region.

  • The VM type is simply not available in the target region or AZ.  Since not all Azure regions are provisioned with identical hardware this can be the cause, especially for VM types other than the popular D, E and F series machines.

  • A restriction is applied to the subscription, zone or region that blocks creation of the reservation for some reason.

What you can do if creating an ODCR fails

If creating a capacity reservation fails and you know that quota or other restrictions are not a factor, the following things may help.

Not coincidentally, these are the same recommendations that you should try when a VM fails to start because the same ARM action – finding and allocating hardware with free capacity to start the VM – is taking place.

  1. In general, creating an ODCR outside of business hours has a higher probability of success.  Demand for Azure services typically drops off at the end of the business day in the area where the region is located.

  2. Consider using a different VM type, availability zone or a different Azure region.

  3. A script or other automation that retries at intervals until the reservation succeeds in claiming the desired number of spots can help, though it can take an unknown amount of time before this works.  It may need to run for days or even weeks before it succeeds.

Submitting a support ticket will give Microsoft visibility into your situation.  If the root cause is something other than capacity, support can identify that cause and provide guidance on how to resolve it.  If the issue truly is a capacity squeeze, the ability of support to help get the reservation created is extremely limited because the support folks, while helpful, are not able to create space where none exists.  In this case the support teams will often suggest the three options above.
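The retry approach from option 3 can be sketched as a simple loop. This is an illustration only: `retry_until_reserved` and `fake_try_create` are hypothetical names, and the real attempt to create or resize the reservation (through whichever tool or SDK you use) is represented by a callable that returns True on success. The sleep function is injectable so the example runs instantly.

```python
import time
from typing import Callable


def retry_until_reserved(try_create: Callable[[], bool],
                         interval_seconds: float,
                         max_attempts: int,
                         sleep: Callable[[float], None] = time.sleep) -> int:
    """Call try_create until it reports success.

    Returns the number of attempts used, or raises TimeoutError after
    max_attempts failures. Waits interval_seconds between attempts.
    """
    for attempt in range(1, max_attempts + 1):
        if try_create():
            return attempt
        if attempt < max_attempts:
            sleep(interval_seconds)
    raise TimeoutError(f"no capacity after {max_attempts} attempts")


# Stand-in for the real request: fails twice, then succeeds.
outcomes = iter([False, False, True])


def fake_try_create() -> bool:
    return next(outcomes)


attempts_used = retry_until_reserved(
    fake_try_create,
    interval_seconds=900,         # e.g. every 15 minutes in a real run
    max_attempts=10,
    sleep=lambda seconds: None,   # skip the waiting for this demonstration
)
print(attempts_used)
```

A real run would likely use a much larger attempt limit, since the text above notes that success can take days or weeks.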

Protecting a VM with an ODCR

Once you have the ODCR created, applying it to a VM is straightforward.  To do this from the portal, open the configuration tab on the VM’s screen.  Then scroll to the bottom of the panel that appears to find the “Capacity reservations” section.  Select “Capacity reservation group” from the list.  The list of capacity reservation groups that match the VM will appear in a drop-down menu below.  Select the CRG that the VM should use and click “Apply”.

If you are using an Infrastructure-as-Code approach such as Bicep or Terraform, an Azure VM is linked to a CRG by specifying the resource ID of the CRG in the appropriate property on the VM definition.
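As a rough illustration of that link, the sketch below builds a CRG resource ID and the fragment of a VM definition that points at it. The helper names and the subscription, resource group and CRG names are placeholders, and the property path (`properties.capacityReservation.capacityReservationGroup.id`) is how I understand the ARM VM schema; Bicep and Terraform expose the same link under their own property names, so confirm against the reference for your tool.

```python
import json


def crg_resource_id(subscription_id: str, resource_group: str, crg_name: str) -> str:
    """Build the ARM resource ID of a Capacity Reservation Group."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Compute/capacityReservationGroups/{crg_name}"
    )


def vm_capacity_reservation_fragment(crg_id: str) -> dict:
    """Portion of a VM definition that associates the VM with a CRG."""
    return {
        "properties": {
            "capacityReservation": {
                "capacityReservationGroup": {"id": crg_id}
            }
        }
    }


# All three values below are placeholders for illustration.
crg_id = crg_resource_id("00000000-0000-0000-0000-000000000000", "rg-example", "crg-example")
fragment = vm_capacity_reservation_fragment(crg_id)
print(json.dumps(fragment, indent=2))
```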

Impact of associating a virtual machine with an ODCR:

  • If the VM is not running then the change takes effect immediately.

  • If the VM is running and has no zone assignment (a “regional” VM) then it must be stopped and restarted for the protection of the ODCR to apply.

  • If the VM is running and has a zone assignment then the change is immediate and there is no disruption to the VM.
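When planning this change across many machines, the three rules above reduce to a small decision function. The sketch below (function name and inventory are hypothetical) picks out the VMs that would need a stop and restart.

```python
def association_effect(running: bool, zonal: bool) -> str:
    """Effect of associating a VM with an ODCR, per the three rules above."""
    if not running:
        return "immediate"
    if zonal:
        return "immediate, no disruption"
    return "stop and restart required"


# Hypothetical inventory for illustration.
inventory = [
    {"name": "dc01", "running": True, "zonal": True},
    {"name": "fw01", "running": True, "zonal": False},
    {"name": "app01", "running": False, "zonal": False},
]

needs_restart = [
    vm["name"]
    for vm in inventory
    if association_effect(vm["running"], vm["zonal"]) == "stop and restart required"
]
print(needs_restart)  # only the running regional VM
```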

 

Where an ODCR is not the right answer

ODCRs are most effective when they are used to protect VMs that need to always be running because they are providing essential services.  Examples include AD domain controllers, firewall or load balancer appliances, database servers, integration servers that support workflows and the like.

The primary thing to keep in mind is the cost impact of the ODCRs and whether they are necessary for the service to be functioning.  Environments where machines come and go frequently, such as scale in/out setups used to minimize cost, are not ideal for ODCRs.  For example, if you have a pool of app servers configured for scale-out, using ODCRs to cover the entire size of the pool means you would be paying for all machines, whether they are actually online or not.  A possible approach in a scale-out environment is to determine the minimum number of VMs necessary for the service to be available -- even in a degraded state -- and use an ODCR to protect that number of instances.  This way you can have confidence that at least that number of machines in the pool will always be running even if an attempt to scale out fails.
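The trade-off can be made concrete with a toy comparison. The sketch below assumes a hypothetical hourly rate and a made-up scaling pattern, and it uses the simplified billing model in which each hour is charged for whichever is larger: the reserved quantity or the number of running VMs.

```python
def pool_cost(rate_per_hour: float, reserved: int, hourly_running: list) -> float:
    """Total compute cost over a series of hours for a scale-out pool.

    Illustrative model: each hour bills max(reserved slots, running VMs).
    """
    return rate_per_hour * sum(max(reserved, running) for running in hourly_running)


RATE = 1.00                                  # hypothetical hourly rate
hourly_running = [2, 2, 4, 8, 8, 4, 2, 2]    # made-up scaling pattern over eight hours

cost_cover_whole_pool = pool_cost(RATE, reserved=8, hourly_running=hourly_running)
cost_cover_minimum = pool_cost(RATE, reserved=2, hourly_running=hourly_running)

print(cost_cover_whole_pool)
print(cost_cover_minimum)
```

Covering only the minimum keeps the cost tied to actual usage while still guaranteeing a floor of two instances; the price is that scale-out beyond that floor remains best effort.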

 

Working with On-Demand Capacity Reservations
(and three interesting behaviors that you should know about)

 

This section discusses some ins and outs of working with ODCRs in your environment, especially if you need to apply them to existing machines.  This is a common scenario when you are attempting to improve the resiliency of a set of VMs against impacts from maintenance, outages or other situations that may cause VMs to restart.

“Associated” vs “Allocated”

A capacity reservation group will always have ownership of some number of "parking spots" within a region.  The number that it holds is referred to as the reservation's capacity which is expressed as a number of allocated instances.

When you link a VM to a CRG, the VM becomes associated with the CRG and can take advantage of the protection that it offers from matching reservations that it contains. 

It is possible to associate more VMs to a CRG than it has allocated capacity for.  This is called overallocation.

When a CRG is overallocated, the VMs associated with it are protected on a first-come, first-served basis according to when they were started.  If, for example, there are four VMs associated with a CRG but the CRG only has an allocated capacity of two, the first two associated machines which were started will receive protection but the others will not.
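The four-VM example above can be expressed in a few lines. This is a simplified model with hypothetical names: it assumes all VMs match a single reservation and that the list is already ordered by start time.

```python
def protected_vms(started_in_order: list, allocated_capacity: int) -> list:
    """Return the VMs covered by the reservation: the earliest-started ones,
    up to the allocated capacity."""
    return list(started_in_order[:max(0, allocated_capacity)])


associated = ["vm-a", "vm-b", "vm-c", "vm-d"]   # ordered by start time
covered = protected_vms(associated, allocated_capacity=2)
uncovered = [vm for vm in associated if vm not in covered]

print(covered)    # the first two VMs that were started
print(uncovered)  # associated, but not protected
```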

“Interesting” On-Demand Capacity Reservation behavior #1:

Here is the first of three interesting behaviors that you can use to your advantage when working with ODCRs.

You can add a running VM to a capacity reservation group.

As mentioned previously, if the VM is zonal then the change is immediate and nondisruptive.  If the VM is regional then the VM must be stopped and restarted for the change to take effect.

This is conceptually different from other Azure mechanisms used for resiliency such as Availability Sets.  You can only add a VM to an availability set at the time the VM is created but you can add or remove a VM from a Capacity Reservation Group at any time whether the VM is running or not.

“Interesting” On-Demand Capacity Reservation behavior #2

Interesting behavior #2 is deceptively simple.  When creating a reservation, you can specify a capacity (number of allocated instances) of zero. 

This should always succeed because Azure needs to take no action to fulfill it -- this is just a metadata adjustment for the reservation within the CRG.

This seems to not be terribly useful at first glance but keep reading.

“Interesting” On-Demand Capacity Reservation behavior #3

If the number of associated VMs is higher than the allocated capacity of the reservation, you can increase the capacity of the reservation to cover the running VMs.

Why does this work?  Because running VMs, by definition, already have a hypervisor allocation (a “parking spot”), so Azure doesn’t need to find one for them -- Azure can simply link the capacity reservation to the hypervisor slot that each running VM is using.

The payoff!  Or, using these three behaviors to your advantage

Because ODCRs are relatively new and have not yet been adopted widely, a common finding to emerge from field resiliency assessments of running workloads is that the VMs that support the workload need to have ODCRs applied to them.  In large environments there may be dozens or even hundreds of VMs that need to be protected.  The process for doing this can seem daunting to a technical team that is not familiar with ODCRs. 

Thankfully, these three behaviors make it possible to easily protect any number of running machines with a very high probability of success -- and zero disruption if they are zonal VMs -- by proceeding in this order:

 

  1. Create a CRG with a reservation for the region, AZ and VM type for the machine(s) that need to be covered with a quantity of zero. (Interesting behavior #2)

  2. Associate the VMs to the capacity reservation group. At this point the CRG is overallocated so the machines are not yet protected.  Remember that if the VMs are regional, a restart is required to finalize the ODCR assignment.  (Interesting behavior #1)

  3. Update the reservation within the CRG to increase the number of allocated instances to match the number of running VMs. (Interesting behavior #3)

When the number of instances on the reservation is equal to or higher than the number of VMs associated with it, all of the associated VMs are protected and you’re done!
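The reason this ordering works can be shown with a toy in-memory model. Everything here is hypothetical (the `ToyRegion` and `ToyReservation` classes are not Azure APIs); the model only captures the idea that raising a reservation's capacity requires new hypervisor slots only for instances not already held by running, associated VMs.

```python
class ToyRegion:
    """A region with a number of free hypervisor slots for one VM size."""
    def __init__(self, free_slots: int):
        self.free_slots = free_slots


class ToyReservation:
    """A reservation that starts with capacity zero (step 1)."""
    def __init__(self, region: ToyRegion):
        self.region = region
        self.capacity = 0
        self.running_vms = []   # associated VMs that already hold a slot

    def associate_running_vm(self, name: str) -> None:
        self.running_vms.append(name)

    def set_capacity(self, new_capacity: int) -> None:
        # Only reserved instances NOT backed by a running VM need a free slot.
        empty_before = max(0, self.capacity - len(self.running_vms))
        empty_after = max(0, new_capacity - len(self.running_vms))
        extra_needed = empty_after - empty_before
        if extra_needed > self.region.free_slots:
            raise RuntimeError("allocation failed: not enough free capacity")
        self.region.free_slots -= extra_needed
        self.capacity = new_capacity

    def protected(self) -> list:
        return self.running_vms[:self.capacity]


# The three-step order, in a region with no free slots at all.
full_region = ToyRegion(free_slots=0)
reservation = ToyReservation(full_region)           # step 1: quantity zero
for name in ["vm-1", "vm-2", "vm-3"]:
    reservation.associate_running_vm(name)          # step 2: associate running VMs
reservation.set_capacity(3)                         # step 3: raise capacity
print(reservation.protected())

# Reserving three instances up front in the same full region.
upfront = ToyReservation(ToyRegion(free_slots=0))
try:
    upfront.set_capacity(3)
    upfront_failed = False
except RuntimeError:
    upfront_failed = True
print(upfront_failed)
```

In the model, the three-step order succeeds even though the region is full, while asking for three empty instances up front fails.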

Final thoughts

This leads to a final piece of advice about working with ODCRs, especially when you know that capacity is a challenge in the target region:  As a field CSA, I recommend that you bring VMs online first, then apply a capacity reservation to them.

Why?  If you already have a set of running VMs that need to be protected, then following what seems like the obvious process (creating a CRG, creating reservations within it for the correct number of instances and then associating the VMs with the reservation) has a risk of failure at the step of creating the ODCR, because Azure needs to find and allocate additional hypervisor slots for the reservation to own.  This can be challenging when there is a lot of demand for the VM type.

As the example in the previous section showed, it’s much easier to protect VMs that are already online by associating them with an existing capacity reservation, even if it doesn’t have enough instances allocated to it, and then increasing the capacity of the ODCR to cover the running machines.

 

References:

On-Demand Capacity Reservations Overview

SLA Details for On-Demand Capacity Reservations

Some details about Overallocating capacity reservations

Information on creating a Capacity Reservation Group via Bicep, Terraform or ARM template.

Updated Apr 02, 2026
Version 1.0