Public Preview: Log Analytics Workspace Replication
Published May 24 2024 12:54 AM
Microsoft

[This is a repost of the same blog released on the Azure Observability Blog, authored by Noa Kuperberg]

 

Azure Monitor Log Analytics uses workspaces as a logical container for logs. Workspaces are region-bound, but workspace replication allows you to create cross-regional redundancy to increase workspace resilience to regional incidents.

 

What is Workspace Replication?

Workspace replication creates a replica of your workspace in another region, which you choose from a set of regions. The original instance of your workspace is referred to as the primary workspace, and the replica in the second region is referred to as the secondary workspace.

The second instance of your workspace is created by the service with the same ID and configuration as your primary workspace (future configuration changes you make will be synced as well). This is basically an active-passive setup – at any given time, your workspace has one active instance, and another one that is updated in the background and can’t be directly managed or accessed.

The secondary instance of your workspace is created empty: logs that were ingested to your workspace before you enabled replication are not copied over. Once replication is enabled, new logs ingested to your workspace are sent to both the primary and secondary workspaces, which gives your workspace cross-regional redundancy.
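As a rough illustration of what enabling replication could look like programmatically, the sketch below only builds an Azure Resource Manager request and prints it. Nothing is sent. The URL shape, the `2023-01-01-preview` API version, and the `properties.replication` body are assumptions that are not stated in this post, so verify them against the Workspace Replication documentation before use. The subscription ID, resource names, and regions are placeholders.

```python
import json

# Assumption: preview API version; check the current documentation.
API_VERSION = "2023-01-01-preview"


def build_enable_replication_request(subscription_id, resource_group,
                                     workspace_name, primary_region,
                                     secondary_region):
    """Return (method, url, body) for a request that would enable replication.

    Nothing is sent here; authentication and submission are left to the caller.
    """
    if primary_region == secondary_region:
        raise ValueError("secondary region must differ from the primary region")
    # Assumed ARM resource path for a Log Analytics workspace.
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.OperationalInsights"
        f"/workspaces/{workspace_name}"
        f"?api-version={API_VERSION}"
    )
    # Assumed body shape: the workspace stays in its primary region and
    # names the region that hosts the secondary instance.
    body = {
        "location": primary_region,
        "properties": {
            "replication": {"enabled": True, "location": secondary_region}
        },
    }
    return "PUT", url, body


# Placeholder values for illustration only.
method, url, body = build_enable_replication_request(
    "00000000-0000-0000-0000-000000000000", "my-rg", "my-workspace",
    "eastus", "westus")
print(method, url)
print(json.dumps(body, indent=2))
```

Keeping the request builder free of network calls makes it easy to review the payload before handing it to whichever authenticated HTTP client you use.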

If an incident impacts your primary workspace, causing issues like ingestion latency or query failures, you can trigger failover to switch to your secondary workspace and continue monitoring your resources and apps as needed. By the time you switch, the secondary workspace holds the logs ingested since you enabled replication, so you can continue using alerts, workbooks, and even Sentinel or other services that query your logs.

When the outage is mitigated and your primary workspace is healthy again, you can switch back to your primary region.
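The behavior described in this section can be sketched with a toy in-memory model. This is not an Azure API: the class and method names are hypothetical, and the model simplifies ingestion by always writing to both instances once replication is enabled.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicatedWorkspace:
    """Toy model of an active-passive workspace (hypothetical, not an Azure API)."""
    primary: list = field(default_factory=list)
    secondary: list = field(default_factory=list)
    replication_enabled: bool = False
    active: str = "primary"

    def enable_replication(self):
        # The secondary starts empty: existing logs are not copied over.
        self.replication_enabled = True

    def ingest(self, record):
        # New logs always reach the primary; they also reach the secondary
        # once replication has been enabled.
        self.primary.append(record)
        if self.replication_enabled:
            self.secondary.append(record)

    def failover(self):
        # Switching is a manual decision and needs an existing replica.
        if not self.replication_enabled:
            raise RuntimeError("replication must be enabled before failover")
        self.active = "secondary"

    def failback(self):
        self.active = "primary"

    def query(self):
        # Queries are served by whichever instance is currently active.
        source = self.primary if self.active == "primary" else self.secondary
        return list(source)


ws = ReplicatedWorkspace()
ws.ingest("log-1")        # ingested before replication was enabled
ws.enable_replication()
ws.ingest("log-2")        # ingested after, so it is replicated
ws.failover()
print(ws.query())         # ['log-2']
ws.failback()
print(ws.query())         # ['log-1', 'log-2']
```

After failover, only `log-2` is visible, which mirrors the point that the secondary holds logs ingested since replication was enabled.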

 

Note that replication isn’t free of charge, but it is much more affordable than dual homing (ingesting to two workspaces placed in different regions) and is easier to manage and maintain. When you enable replication, your logs are effectively ingested to two different regions, and billing is done per replicated GB. You can apply replication to a subset of your Data Collection Rules (DCRs) to control the replication volume and related costs. See the Azure Monitor pricing page for more information.
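To see how limiting replication to a subset of DCRs affects the bill, here is a minimal arithmetic sketch. The DCR names, daily volumes, and the price per GB are made-up placeholders, not real Azure figures. Use the Azure Monitor pricing page for actual rates.

```python
def replicated_gb(dcr_volumes_gb, replicated_dcrs):
    """Sum the daily volume (GB) of the DCRs that have replication applied."""
    unknown = set(replicated_dcrs) - set(dcr_volumes_gb)
    if unknown:
        raise KeyError(f"unknown DCRs: {sorted(unknown)}")
    return sum(dcr_volumes_gb[name] for name in replicated_dcrs)


def replication_cost(dcr_volumes_gb, replicated_dcrs, price_per_gb):
    """Billing is per replicated GB, so cost scales with the replicated subset."""
    return replicated_gb(dcr_volumes_gb, replicated_dcrs) * price_per_gb


# Hypothetical daily ingestion per DCR, in GB.
daily = {"dcr-security": 40.0, "dcr-app": 120.0, "dcr-debug": 300.0}
PLACEHOLDER_PRICE = 0.25  # made-up price per replicated GB, for illustration

all_cost = replication_cost(daily, list(daily), PLACEHOLDER_PRICE)
subset_cost = replication_cost(daily, ["dcr-security", "dcr-app"], PLACEHOLDER_PRICE)
print(all_cost)     # 115.0
print(subset_cost)  # 40.0
```

In this made-up scenario, leaving the high-volume debug DCR out of replication reduces the replicated volume from 460 GB to 160 GB per day.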

 

What Workspace replication is not

Workspace replication is not a mechanism to copy a workspace and its content to another region, or move it.

  • Logs that were ingested to your primary workspace before enabling replication aren’t copied over.
  • Your secondary workspace can’t exist if the primary is deleted.
  • While the secondary workspace is active, you can’t change workspace settings, including its schema (adding tables or columns). These operations can only be done on the primary workspace.

 

Why not use availability zones instead?

Availability zones provide redundancy of your workspace infrastructure across zones in a single region, and using them is always recommended. Workspace replication doesn’t replace availability zones. It works differently: it creates a replica of your workspace, and of new incoming logs, in another region. This is valuable because:

  • Not all regions support availability zones. If your region doesn’t have availability zones, or the Azure Log Analytics service doesn’t yet use availability zones in your region, workspace replication is the best way to create redundancy for your workspace operation.
  • Some customers require protection against incidents impacting the entire region. Availability zones are located inside the same region, so if an issue (even a bug) impacts the entire region, switching zones will not help.

 

Frequently asked questions

 

Who triggers the region switching? Is it done automatically?

Switching between regions isn’t done by Azure Monitor, and can only be triggered by you. This is because different incidents impact different workspaces to different degrees, and only you can decide when it’s time to switch over. For example, a 2-minute latency in ingestion of a specific data type may be a minor issue for some customers, but very significant to others.

You can create alert rules that automatically switch regions according to ingestion latency, query success rate, or other health measurements. Still, we recommend that alerts notify someone who will evaluate the situation and make an informed decision.
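The recommended pattern, where a health check notifies a person rather than switching regions on its own, could be sketched as follows. The thresholds and names are hypothetical. The 120-second default simply echoes the 2-minute latency example above, and each customer would pick their own values.

```python
from dataclasses import dataclass


@dataclass
class HealthSample:
    ingestion_latency_s: float  # observed ingestion latency, in seconds
    query_success_rate: float   # fraction of successful queries, 0.0 to 1.0


def recommend_action(sample, max_latency_s=120.0, min_success_rate=0.95):
    """Return ('ok' | 'notify', breached_signals).

    This function never triggers failover itself; it only flags that a
    human should evaluate the situation and decide whether to switch.
    """
    breaches = []
    if sample.ingestion_latency_s > max_latency_s:
        breaches.append("ingestion latency")
    if sample.query_success_rate < min_success_rate:
        breaches.append("query success rate")
    return ("notify" if breaches else "ok"), breaches


print(recommend_action(HealthSample(30.0, 0.99)))
# ('ok', [])
print(recommend_action(HealthSample(300.0, 0.80)))
# ('notify', ['ingestion latency', 'query success rate'])
```

Keeping thresholds as parameters reflects the point that the same latency may be minor for one customer and significant for another.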

 

Do I need to reconfigure all my clients to support this?

No, you don’t need to reconfigure anything. The DNS will reroute all requests sent to the workspace to instead reach the secondary workspace.

 

Does it replicate the workspace with my data?

No. Only logs ingested after you enable replication will be replicated to your secondary workspace. Logs ingested before you enabled replication are not copied over.

 

But what if my workspace is linked to a dedicated cluster?

Cluster replication will be supported soon. We’ll share an update when this capability becomes available.

 

What about Sentinel, Logic Apps and other services that use my workspace? Will they break when I switch regions?

Services and features that use your workspace continue working against the secondary workspace, seamlessly. Note that switching regions allows your workspace operation to carry on, but it doesn’t handle other components of these services.

For more information, see the Workspace Replication documentation.

Last update: Jun 04 2024 01:36 AM