Lesson Learned #425: Azure SQL Database Failover Group Endpoint: A Deep Dive into DNS Resolution
Published Sep 11 2023 10:31 AM

Today, we worked on a service request in which our customer faced connectivity issues when using the Azure SQL Database Failover Group endpoint; however, when using the IP address of the primary Azure SQL server, they did not see any connectivity problems.

 

The Azure SQL Database service offers robust features to ensure high availability and disaster recovery. One of these is the Failover Group, which permits seamless switchover between a primary and a secondary database. In this technical exploration, I would like to share with you how the Failover Group endpoint leverages DNS to route traffic and how we can determine to which IP it is currently resolving.

 

We found that the customer's networking environment had several problems resolving the IP address of the primary Azure SQL server.
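A symptom like this one (the name fails, the IP works) can be narrowed down by separating the DNS step from the TCP step. The following is only a minimal sketch using Python's standard library, not the procedure we followed in the case. The function name diagnose, the default port 1433 and the injectable resolve and connect parameters are assumptions made for illustration.

```python
import socket


def diagnose(host, port=1433, timeout=5.0,
             resolve=socket.getaddrinfo,
             connect=socket.create_connection):
    """Return a (status, detail) tuple: "dns-failure", "tcp-failure" or "ok".

    `resolve` and `connect` are injectable only to make the sketch testable;
    by default the real socket functions are used.
    """
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Name resolution itself failed: the problem is DNS, not the server.
        return "dns-failure", str(exc)

    # Each getaddrinfo entry ends with a sockaddr tuple whose first item is the IP.
    addresses = sorted({info[4][0] for info in infos})
    for address in addresses:
        try:
            # Connect to the resolved IP so no second DNS lookup is involved.
            with connect((address, port), timeout=timeout):
                return "ok", address
        except OSError:
            continue
    return "tcp-failure", ", ".join(addresses)
```

Calling diagnose("your-failover-group-endpoint.database.windows.net") from the affected machine returns "dns-failure" when the name cannot be resolved at all, which points at the local DNS configuration rather than at Azure SQL Database.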

 

The Failover Group in Azure SQL Database

Before diving deep into DNS intricacies, it's important to understand the Failover Group's role. A Failover Group manages the replication and failover of one or more databases between a primary server, which handles read-write operations, and a secondary server, typically in a different region, used for read-only workloads or disaster recovery. In the event of a failure, the service can switch over automatically, depending on the configured failover policy, ensuring continued data availability.

The Role of DNS in Traffic Routing

The endpoint of a Failover Group is, at its core, a DNS name. Here’s how it functions:

  1. Single Endpoint: Upon configuring a Failover Group, Azure provides a unique endpoint, which is a DNS name (often called the listener) rather than a fixed IP address. Regardless of failovers, this endpoint consistently points to the active primary.

  2. DNS Resolution & TTL: The endpoint’s traffic direction operates via DNS. When a failover transpires, Azure updates the associated DNS record to reflect the current primary database. This mechanism hinges heavily on the DNS record's Time-to-Live (TTL). A lower TTL ensures rapid cache expiration, facilitating quicker traffic redirection during failovers.

  3. Propagation Delay: Despite the low TTL, there's an inevitable delay. During this interval, some clients, due to cached DNS entries, might still attempt connection to the old primary.
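The propagation delay described above can be observed from a client by polling the endpoint until its answer changes. The snippet below is a rough sketch under stated assumptions: wait_for_dns_change and resolve_ips are hypothetical helper names, the timeout and interval values are arbitrary, and the operating system's own resolver cache may still return the old answer for a while.

```python
import socket
import time


def resolve_ips(host, resolver=socket.gethostbyname_ex):
    """Return the set of IPv4 addresses the host currently resolves to."""
    _name, _aliases, ips = resolver(host)
    return set(ips)


def wait_for_dns_change(host, old_ips, timeout=300.0, interval=5.0,
                        resolver=socket.gethostbyname_ex, sleep=time.sleep):
    """Poll until `host` resolves to something other than `old_ips`.

    Returns the new set of addresses, or None if the timeout expires first.
    `resolver` and `sleep` are injectable only to keep the sketch testable.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            current = resolve_ips(host, resolver)
        except OSError:
            # Treat a failed lookup as "no answer yet" and keep polling.
            current = set()
        if current and current != set(old_ips):
            return current
        if time.monotonic() >= deadline:
            return None
        sleep(interval)
```

A typical use would be to record resolve_ips(endpoint) before a planned failover and then call wait_for_dns_change(endpoint, old_ips) afterwards to see how long this particular client keeps the stale answer.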

Determining the Endpoint's Resolving IP

If you're curious about where your Failover Group endpoint is currently directing traffic, various diagnostic tools are at your disposal:

  1. nslookup: A staple command-line tool for DNS queries. Run:

 

nslookup your-failover-group-endpoint.database.windows.net

 

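  2. dig: Common on Unix/Linux systems, dig offers verbose DNS query details, including the TTL of each record. Use:

dig your-failover-group-endpoint.database.windows.net

  3. Online DNS lookup tools: Several web-based DNS lookup services can also resolve the endpoint, which helps compare the public answer with what your own network returns.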
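The same check can be scripted, for example to compare what the Failover Group endpoint resolves to with what each server name resolves to. This is only an illustrative sketch: lookup and current_target are hypothetical helper names, and the server names in the usage note are placeholders. Keep in mind that servers in the same region can share gateway IP addresses, so a match indicates the region the endpoint points to rather than proving which individual server is primary.

```python
import socket


def lookup(host, resolver=socket.gethostbyname_ex):
    """Resolve `host` and return its canonical name, aliases and IPv4 addresses."""
    canonical, aliases, ips = resolver(host)
    return {"canonical": canonical, "aliases": list(aliases), "ips": sorted(ips)}


def current_target(endpoint, candidates, resolver=socket.gethostbyname_ex):
    """Return the candidate host names that share at least one IP with `endpoint`."""
    endpoint_ips = set(lookup(endpoint, resolver)["ips"])
    return [
        name for name in candidates
        if endpoint_ips & set(lookup(name, resolver)["ips"])
    ]
```

For example, current_target("your-failover-group-endpoint.database.windows.net", ["your-primary-server.database.windows.net", "your-secondary-server.database.windows.net"]) lists the server names whose addresses match the endpoint's current answer. Running it both inside and outside the affected network makes resolution differences easy to spot.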

 

Enjoy!

Last update: Sep 11 2023 03:32 AM