Understanding Azure storage redundancy offerings

Published Jun 15 2020 09:46 AM
Microsoft

Before we go deeper into the storage redundancy space it'd be helpful to better understand the building blocks of Azure global infrastructure as well as a few terms commonly used in high availability and disaster recovery in general.

Data residency boundary.PNG

  • Geography - a discrete market, typically containing 2+ regions, that preserves data residency and compliance boundaries.
  • Azure region - a set of datacenters deployed within a latency-defined perimeter and connected through a dedicated regional low-latency network.
  • Region pair - each Azure region is paired with another region within the same geography.
  • Availability zone - a physically separate location within an Azure region. Each AZ is made up of 1+ datacenters with independent power, cooling, and networking.
  • Availability - defined by Gartner as the assurance that an enterprise’s IT infrastructure has suitable recoverability and protection from system failures, natural disasters or malicious attacks. High availability refers to a system that stays operational without interruption for long periods of time by using redundant or fault-tolerant components; it is typically measured as a percentage of uptime.
  • Recovery point objective (RPO) - the amount of data which can be lost while bringing the system back online after a critical failure, i.e. the point in time to which the data can be recovered.
  • Recovery time objective (RTO) - the amount of time that it takes to get the system back online after a critical failure, i.e. how long you can sustain a service interruption before you absolutely need to be back online.
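
Since availability is typically quoted as a percentage, it can help to translate that percentage into a downtime budget. The sketch below is illustrative arithmetic only: the function name and the 30-day month are assumptions made for the example and are not taken from any Azure SLA definition.

```python
from datetime import timedelta


def allowed_downtime(availability_pct: float, period: timedelta) -> timedelta:
    """Return the longest outage that still meets the availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    unavailable_fraction = 1 - availability_pct / 100
    return period * unavailable_fraction


month = timedelta(days=30)  # assumption: a 30-day month
for pct in (99.0, 99.9, 99.99):
    minutes = allowed_downtime(pct, month).total_seconds() / 60
    print(f"{pct}% over 30 days allows about {minutes:.1f} minutes of downtime")
```

Under these assumptions, 99.9% leaves roughly 43 minutes of downtime per month, while 99% leaves a little over 7 hours.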

 

With these in mind let's take a closer look at what Azure storage redundancy options have to offer.

Locally redundant storage (LRS)

LRS.PNG

  • Data is synchronously replicated 3 times within a single storage cluster, in a single datacenter in a region, so it can only survive a node failure within the storage cluster.
  • Provides at least 11 9s of durability and 99.9% of availability (reads & writes) for hot tier and 99% for cool.
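
One way to read a durability figure such as "11 9s" is as the yearly probability of losing any given object. The sketch below rests on that simplified reading and treats objects as independent; the function names and the workload size are hypothetical and chosen only for illustration.

```python
def annual_loss_probability(nines: int) -> float:
    """Chance of losing one object in a year, given N nines of durability."""
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    return 10.0 ** -nines


def expected_objects_lost_per_year(nines: int, object_count: int) -> float:
    """Average number of objects lost per year (assumes independent objects)."""
    return object_count * annual_loss_probability(nines)


stored_objects = 1_000_000  # hypothetical workload size
loss = expected_objects_lost_per_year(11, stored_objects)
print(f"11 nines, {stored_objects:,} objects: about {loss:.0e} objects lost per year on average")
```

The same function can be called with 12 or 16 to compare the other redundancy options described in this post.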

Zone-redundant storage (ZRS)

ZRS.PNG

  • Data is synchronously replicated 3 times across 3 availability zones in a region, so it can survive a node failure within the storage cluster as well as an entire datacenter or availability zone going down.
  • Provides at least 12 9s of durability and 99.9% of availability (reads & writes) for hot tier and 99% for cool.

Geo-redundant storage (GRS)

GRS.PNG

  • Data is synchronously replicated 3 times within a single storage cluster in the primary region, then asynchronously replicated to the paired secondary region (3 more copies), so it can survive a node failure within the storage cluster, an entire datacenter or availability zone going down, or a region-wide outage. A datacenter, zone or region failure requires an account failover to restore read and write availability (https://aka.ms/accountfailover).
  • Provides at least 16 9s of durability and 99.9% of availability (reads & writes) for hot tier and 99% for cool + 99.99% on reads for RA-GRS (read-access to the secondary endpoint).
  • Typically has an RPO of less than 15 minutes (no SLA).
  • Read access to the secondary is available if the primary region is down with RA-GRS.
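
The read fallback that RA-GRS makes possible can be sketched as follows. The `-secondary` host suffix follows Azure's naming for the secondary blob endpoint; everything else (the account name, the `fetch` callable and the simulated outage) is a hypothetical stand-in, and no Azure SDK is used here.

```python
from typing import Callable


def blob_endpoints(account: str) -> tuple[str, str]:
    """Return (primary, secondary) blob endpoint URLs for a storage account."""
    primary = f"https://{account}.blob.core.windows.net"
    secondary = f"https://{account}-secondary.blob.core.windows.net"
    return primary, secondary


def read_with_fallback(account: str, path: str, fetch: Callable[[str], bytes]) -> bytes:
    """Try the primary endpoint first, then the read-only secondary."""
    primary, secondary = blob_endpoints(account)
    try:
        return fetch(f"{primary}/{path}")
    except ConnectionError:
        # Replication to the secondary is asynchronous, so this read may be stale.
        return fetch(f"{secondary}/{path}")


def fake_fetch(url: str) -> bytes:
    """Hypothetical fetcher that simulates a primary-region outage."""
    if "-secondary" not in url:
        raise ConnectionError("primary region unavailable")
    return b"data served from secondary"


print(read_with_fallback("mystorageaccount", "container/blob.txt", fake_fetch))
```

Because the secondary lags behind the primary, an application using this pattern should tolerate slightly out-of-date reads.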

Geo-zone-redundant storage (GZRS)

GZRS.PNG

  • Data is synchronously replicated 3 times across 3 availability zones in the primary region, then asynchronously replicated to the paired secondary region (3 more copies), so it can survive a node failure within the storage cluster, an entire datacenter or availability zone going down, or a region-wide outage. Only a region failure requires an account failover to restore read and write availability (https://aka.ms/accountfailover).
  • Provides at least 16 9s of durability and 99.9% of availability (reads & writes) for hot tier and 99% for cool + 99.99% on reads for RA-GZRS (read-access to the secondary endpoint).
  • Typically has an RPO of less than 15 minutes (no SLA).
  • Read access to the secondary is available if the primary region is down with RA-GZRS.
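
The four options described so far can be summarized as a small lookup structure. This is only a sketch of the properties listed in this post; the class, field and function names are made up for the example, and "zone" stands for the loss of a datacenter or an availability zone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedundancyOption:
    name: str
    durability_nines: int
    survives: frozenset             # failure scopes the data survives
    failover_needed_for: frozenset  # scopes that need an account failover


OPTIONS = [
    RedundancyOption("LRS", 11, frozenset({"node"}), frozenset()),
    RedundancyOption("ZRS", 12, frozenset({"node", "zone"}), frozenset()),
    RedundancyOption("GRS", 16, frozenset({"node", "zone", "region"}),
                     frozenset({"zone", "region"})),
    RedundancyOption("GZRS", 16, frozenset({"node", "zone", "region"}),
                     frozenset({"region"})),
]


def options_surviving(failure: str) -> list[str]:
    """Names of the options whose data survives the given failure scope."""
    return [o.name for o in OPTIONS if failure in o.survives]


def needs_failover(option_name: str, failure: str) -> bool:
    """Whether restoring read/write availability needs an account failover."""
    option = next(o for o in OPTIONS if o.name == option_name)
    return failure in option.failover_needed_for


print(options_surviving("zone"))
print(needs_failover("GRS", "zone"), needs_failover("GZRS", "zone"))
```

The lookup makes the main difference between GRS and GZRS visible: both survive a zone outage, but only GRS needs an account failover to recover from it.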

Account failover

Account failover.PNG

  • Allows you to initiate the failover at the account level in case of an ongoing/upcoming disaster (certain restrictions apply - account failover considerations).
  • Generally available in all public regions.
  • Failover is disruptive and converts the account to LRS.
  • Typically has an RTO of less than 1 hour (no SLA).

Failover timeline.PNG

Be aware of potential data loss! Always check LastSyncTime before executing the failover.
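
The LastSyncTime check can be reasoned about as simple date arithmetic: writes made after LastSyncTime may not have reached the secondary yet. The sketch below assumes you have already read LastSyncTime for the account (how you retrieve it is not shown); the function names, timestamps and tolerated-loss threshold are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def potential_data_loss_window(last_sync_time: datetime, now: datetime) -> timedelta:
    """Time span of writes that may be lost if a failover starts at `now`."""
    if last_sync_time > now:
        raise ValueError("last_sync_time cannot be in the future")
    return now - last_sync_time


def within_tolerated_loss(last_sync_time: datetime, now: datetime,
                          tolerated_loss: timedelta) -> bool:
    """True if the at-risk window fits the loss you are willing to accept."""
    return potential_data_loss_window(last_sync_time, now) <= tolerated_loss


# Hypothetical values for illustration only.
now = datetime(2020, 6, 15, 9, 46, tzinfo=timezone.utc)
last_sync = now - timedelta(minutes=12)

window = potential_data_loss_window(last_sync, now)
print(f"Writes from the last {window} may be lost")
print("Within a 15-minute tolerance:", within_tolerated_loss(last_sync, now, timedelta(minutes=15)))
```

If the window is larger than you can accept, it may be worth waiting or pausing writes before initiating the failover.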

 

Hope this helps you get a good grasp of durability and availability options for your storage needs! For more details please refer to our documentation.

We'd love to hear from you - please reach out to us via email at azurestoragefeedback@microsoft.com and/or post to the Azure storage feedback forum.
