Configuring a Disaster Recovery Solution for Azure Service Bus with Basic Tier

Introduction

Disaster recovery (DR) is crucial for ensuring business continuity and minimizing downtime. The Azure Service Bus Basic tier doesn't support the Geo-disaster recovery (Geo-DR) or Geo-Replication (public preview) features available in the Premium tier, but you can still implement a custom DR strategy. This guide walks you through setting up a disaster recovery solution for Azure Service Bus using the Basic tier.

 

Prerequisites

Before starting, make sure you have:

  • An Azure subscription.
  • Two Azure Service Bus namespaces (one primary and one secondary) in different regions.
  • Access to the Azure portal.
  • Familiarity with Azure CLI or PowerShell for automation purposes.

 

Step-by-Step Guide

Step 1: Create Primary and Secondary Namespaces

  1. Create the Primary Namespace:

    • Go to the Azure portal.
    • Search for "Service Bus" and select "Create Service Bus namespace".
    • Enter a name for the namespace (e.g., primary-ns-basic), choose the Basic tier, and select the primary region.
    • Click "Review + create" and then "Create".
  2. Create the Secondary Namespace:

    • Repeat the steps to create a secondary namespace (e.g., secondary-ns-basic) in a different region.

Step 2: Synchronise Messages Between Namespaces

Since the Basic tier does not support Geo-DR, you'll need to manually synchronise messages between the primary and secondary namespaces. This can be achieved through custom code or third-party tools.

  1. Implement Message Synchronisation:

    • Create an application that listens to messages on the primary namespace and republishes them to the secondary namespace.
    • Use Azure Functions or a similar service to trigger this application whenever a new message arrives.
    • Ensure the application handles potential issues such as duplicate messages and out-of-order delivery.
  2. Sample Synchronisation Code (Azure Functions with C#):

 

using System;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class MessageSynchroniser
{
    // Attribute arguments must be compile-time constants, so the queue name is a const.
    private const string QueueName = "<QueueName>";

    // "primaryConnectionString" and "secondaryConnectionString" are app setting names,
    // not the connection strings themselves.
    private static readonly string SecondaryConnectionString =
        Environment.GetEnvironmentVariable("secondaryConnectionString");

    // Reuse one client across invocations instead of opening a connection per message.
    private static readonly IQueueClient SecondaryQueueClient =
        new QueueClient(SecondaryConnectionString, QueueName);

    // Note: this trigger receives each message from the primary queue and completes it
    // on success, so it competes with any other consumer of that queue.
    [FunctionName("MessageSynchroniser")]
    public static async Task Run(
        [ServiceBusTrigger(QueueName, Connection = "primaryConnectionString")] Message message,
        ILogger log)
    {
        try
        {
            // message.Body is already a byte[], so it is passed through unchanged.
            var secondaryMessage = new Message(message.Body)
            {
                ContentType = message.ContentType,
                Label = message.Label,
                MessageId = message.MessageId,
                CorrelationId = message.CorrelationId
            };

            // UserProperties has no setter, so copy the entries one by one.
            foreach (var property in message.UserProperties)
            {
                secondaryMessage.UserProperties[property.Key] = property.Value;
            }

            await SecondaryQueueClient.SendAsync(secondaryMessage);
            log.LogInformation($"Message synchronised to secondary namespace: {message.MessageId}");
        }
        catch (Exception ex)
        {
            log.LogError(ex, $"Error synchronising message: {message.MessageId}");
            // Rethrow so the message is not completed on the primary queue and can be retried.
            throw;
        }
    }
}
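
A retried sync can place the same message on the secondary queue more than once, so one option is for the consumer to skip MessageIds it has already handled. The sketch below is a minimal, in-memory illustration of that idea. ProcessedMessageTracker and its capacity parameter are hypothetical names introduced here, not part of any Azure SDK, and the tracker only remembers IDs within a single process.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers the most recent MessageIds so repeats can be skipped.
public sealed class ProcessedMessageTracker
{
    private readonly int capacity;
    private readonly HashSet<string> seen = new HashSet<string>();
    private readonly Queue<string> order = new Queue<string>();
    private readonly object gate = new object();

    public ProcessedMessageTracker(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        this.capacity = capacity;
    }

    // Returns true the first time an id is seen, false while it is still remembered.
    public bool TryMarkProcessed(string messageId)
    {
        lock (gate)
        {
            if (!seen.Add(messageId)) return false;

            order.Enqueue(messageId);
            // Forget the oldest id once the window is full, to bound memory use.
            if (order.Count > capacity) seen.Remove(order.Dequeue());
            return true;
        }
    }
}
```

A consumer would call TryMarkProcessed(message.MessageId) before doing any work and simply complete the message when it returns false. If several consumer instances run in parallel, the remembered IDs would need to live in a shared store instead.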

 

Step 3: Failover Procedure

In the event of a disaster, you will need to manually fail over to the secondary namespace.

  1. Update Connection Strings:

    • Modify your application configuration to point to the secondary namespace's connection string.
    • Restart your applications to ensure they connect to the secondary namespace.
  2. Communicate the Change:

    • Notify your team and stakeholders about the failover.
    • Monitor the secondary namespace to ensure it is handling the load appropriately.
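
One way to make the connection string switch a single configuration change is to keep both connection strings in app settings and select between them with one flag. The following is a sketch under that assumption. The setting names (ServiceBus__ActiveNamespace and the two connection string settings) and the ActiveNamespaceSelector class are illustrative, not names defined by Azure.

```csharp
using System;

public static class ActiveNamespaceSelector
{
    // Illustrative setting names (assumptions for this sketch).
    public const string ActiveSetting = "ServiceBus__ActiveNamespace";
    public const string PrimarySetting = "ServiceBus__PrimaryConnectionString";
    public const string SecondarySetting = "ServiceBus__SecondaryConnectionString";

    // getSetting abstracts the configuration source, e.g. Environment.GetEnvironmentVariable.
    public static string Resolve(Func<string, string> getSetting)
    {
        var active = getSetting(ActiveSetting);

        // Anything other than "secondary" (including a missing flag) selects the primary.
        var useSecondary = string.Equals(active, "secondary", StringComparison.OrdinalIgnoreCase);
        var name = useSecondary ? SecondarySetting : PrimarySetting;

        var value = getSetting(name);
        if (string.IsNullOrWhiteSpace(value))
            throw new InvalidOperationException($"Missing app setting '{name}'.");
        return value;
    }
}
```

At startup, an application could call ActiveNamespaceSelector.Resolve(Environment.GetEnvironmentVariable) and pass the result to its queue client. Failing over then means setting the flag to "secondary" and restarting, as described above.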

Step 4: Failback to Primary Namespace

Once the primary region is operational again, you can switch back to the primary namespace.

  1. Resynchronise Messages:

    • Ensure that any messages in the secondary namespace are synchronised back to the primary namespace.
    • Use the same message synchronisation approach as before but in reverse.
  2. Update Connection Strings:

    • Change your application configuration back to the primary namespace's connection string.
    • Restart your applications to point back to the primary namespace.
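
The reverse synchronisation during failback can be sketched as a drain loop: receive from the secondary queue, send to the primary, and only then settle the message on the secondary. The example below expresses that ordering with delegates so it does not depend on a particular SDK version. QueueDrainer and its parameter names are hypothetical. With the Microsoft.Azure.ServiceBus package, the three delegates would likely wrap a receiver's ReceiveAsync, a QueueClient's SendAsync, and the receiver's CompleteAsync, but that wiring is an assumption left to the reader.

```csharp
using System;
using System.Threading.Tasks;

public static class QueueDrainer
{
    // receive:  returns the next message, or null once the source queue is empty.
    // send:     publishes the message to the destination queue.
    // complete: settles (removes) the message on the source queue.
    // Returns the number of messages moved.
    public static async Task<int> DrainAsync<T>(
        Func<Task<T>> receive,
        Func<T, Task> send,
        Func<T, Task> complete) where T : class
    {
        var moved = 0;
        while (true)
        {
            T message = await receive();
            if (message == null) break;

            // Send first. If this throws, the message is never completed
            // and therefore stays on the source queue.
            await send(message);
            await complete(message);
            moved++;
        }
        return moved;
    }
}
```

Because a message is completed only after a successful send, a failure between the two calls can leave a copy on both queues. That is the duplicate case that consumers need to tolerate.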

Best Practices

  • Regular Testing: Periodically test your disaster recovery plan to ensure it works as expected.
  • Automation: Automate as much of the DR process as possible to minimise downtime and human error.
  • Monitoring: Set up monitoring and alerts for both primary and secondary namespaces to detect issues early.
  • Documentation: Keep detailed documentation of your DR processes and ensure your team is familiar with them.

 

Conclusion

While the Azure Service Bus Basic tier lacks built-in Geo-DR capabilities, you can still build a workable disaster recovery solution through custom synchronisation and manual failover procedures. Following the steps outlined in this guide makes your messaging infrastructure more resilient to regional disruptions. Regular testing and monitoring will help maintain the effectiveness of your DR strategy.

 

Feel free to reach out if you have any questions or need further assistance. Happy configuring!

 

-- Santosh Patkar

 

3 Replies
When the sync process consumes a message from the primary queue and copies it to the secondary, will that message still be delivered from the primary queue to the application it was originally intended for?
If a message is being synced from a primary queue to a secondary queue, and the message is consumed (or moved) to the secondary queue during this process, the primary queue typically would no longer have that message available. Consequently, the application intended to consume from the primary queue would not receive that specific message if it has already been moved to the secondary queue.

To ensure the application receives the intended message, you would need to carefully manage the syncing process, possibly by duplicating the message or confirming successful consumption before moving it.

@Rizwan8668: Thank you for the reply.

I have automated the process of creating the Azure App Service and deploying the web app and API to it (this is our DR plan, since an active-active or active-passive architecture is too expensive). The application running in App Service is going to use these message queues. Practically speaking, we don't require message sync, right? As soon as the primary region goes down and I bring the application up in the secondary region, the App Service there will be configured to use the secondary Service Bus namespace, and the primary Service Bus queue will also be processed. The secondary just starts a new queue.