Home
Microsoft

I got a recent case where a customer was worried with high transaction delays on an Azure SQL DB around 15sec and that would be really bad

 

When using premium DBs, Azure SQL DB will use AlwaysOn behind the scene to provide you HA

https://docs.microsoft.com/en-us/azure/sql-database/sql-database-high-availability#premium-and-busin...

 

 

clipboard_image_6.png

 
 

 

Customer identified this transaction delays using a tool called TELEGRAF. This tools was already commented by the SQLCAT team. Real-time performance monitoring for Azure SQL Database Managed Instance 

 

In the image below we can see those peaks:

 

clipboard_image_0.png

 

Considering customer was seeing transaction delay I went to SQL waits to check exactly what was SQL waiting for. The wait HADR_SYNC_COMMIT is responsible to track transaction delays

https://blogs.msdn.microsoft.com/sql_server_team/troubleshooting-high-hadr_sync_commit-wait-type-wit...

 

However looking in the server waits I could NOT see it as a TOP wait and that was not looking right

clipboard_image_1.png

 

We suspected that was an issue with the tool, and as this is a open source I was able to search it to identify what exactly the tool was measuring (https://github.com/influxdata/telegraf/blob/adc32002da4a7dcbcf9b0ac15f34922b10372fdc/plugins/inputs/...) and it was looking on sys.dm_os_performance_counters, counter "Transaction Delay"

 

This was documented on our docs as "Transaction Delay - Delay in waiting for unterminated commit acknowledgement, in milliseconds.". The main problem was not the counter but is the understand of what is exactly is measured by this counter.

 

The solution to this problem is that you need to divide (Transaction Delay) / (Mirrored Write Transactions/sec) to get AVG DELAY. This means customer had 15 seg of wait for ALL transactions on that measured second 

 

 

More information below

https://blogs.msdn.microsoft.com/saponsqlserver/2013/04/24/sql-server-2012-alwayson-part-12-performa...

 

Running in synchronous availability mode, the best way to find out what the delay introduced is, is to check two performance counters on the primary replica:

 

SQL Server:Database Replica –> Transaction Delay

SQL Server:Database Replica –> Mirrored Write Transactions/sec

The first value is an accumulation of the delay of all the current transaction delay in millisecond

 

The second counter is the value of transactions which indeed caused data transfer with AlwaysOn (in opposite to SQL Server:Database – Transactions/sec which does include pure read transactions as well).

 

In order to get to the average transaction (in principle commit delay) one only has to divide both values. In the screen shot below we look at an average Transaction Delay of 392 milliseconds (green line in 0.1 scale) and around 60 Write Transactions/sec (red line in scale 1.0). Calculating the numbers, we look at a transaction delay of around 6.5ms.

 

 

After found the information below I requested an update to official doc to avoid future confusion

clipboard_image_2.png