First published on MSDN on Apr 05, 2018
Writer: Simon Su
Technical Reviewer: Pam Lahoud, Sourabh Agarwal, Tejas Shah
Applies to: SQL Server 2014 SP2, SQL Server 2016 SP1, SQL Server 2017 RTM and later
In synchronous-commit mode AG nodes sometimes you may observe your transactions are pending on HADR_SYNC_COMMIT waits. HADR_SYNC_COMMIT waits indicate that SQL server is waiting for the signal from the remote replicas to commit the transaction. To understand the transaction commit latency you can refer to below articles:
Troubleshooting High HADR_SYNC_COMMIT wait type with Always On Availability Groups - Microsoft Tech ...
SQL Server 2012 AlwaysOn – Part 12 – Performance Aspects and Performance Monitoring II - Microsoft T...
In above link you will learn that the transaction delay can be evaluated by below two performance counters:
For example, assume there are poor performing AG nodes and you see “SQL Server:Database Replica –> Transaction Delay” is 1000 ms (milliseconds), and “SQL Server:Database Replica –> Mirrored Write Transactions/sec” is 50, then it means on average each transaction has a delay of 1000ms/50=20 ms.
Given above example, can we know where the delay of 20 ms is from? What are the factors that cause this latency? To find out the answers of this kind of questions we need to understand how synchronous-commit works:
AlwaysON - HADRON Learning Series: - How does AlwaysON Process a Synchronous Commit Request - Micros...
To track the data movement between replicas we are lucky we have updated xevents:
KB3173156 - Update adds AlwaysOn extended events and performance counters in SQL Server 2014 or 2016...
In synchronous-commit mode the basic logic of the log blocks movement is as below:
1.1 and 1.2 happen in parallel ----------------------------- 1.1 Log block->LogCache->LogFlush->LocalHarden(LDF) 1.2 Log block->logPool->LogCapture->SendToRemote
In remote synchronous replica:
The harden process is similar as primary:
The xevents (see the first link in this article) happened in different places of log block movement. I have below figure to give you the detailed log movement flow of each steps and those related Xevents:
As showed above when the Xevent trace is captured we can know the precious time point of each step of the log block movement and you can know exactly where the transaction latency is from. Commonly the delay is from three parts:
1. The duration of log harden in primary
It equals to the time delta of Log_flush_start(step 2) and Log_flush_complete (step 3)
2. The duration of log harden in remote replica
It equals to the time delta of Log_flush_start (step 10) and Log_flush_complete (step 11)
3. The duration of network traffic
The sum of time deltas of (primary:hadr_log_block_send_complete ->secondary:hadr_transport_receive_log_block_message, step 6-7) and (secondary:hadr_lsn_send_complete->primary:hadr_receive_harden_lsn_message,step 12-13)
I use below script to capture xevents:
/* Note: this trace could generate very large amount of data very quickly, depends on the actual transaction rate. On a busy server it can grow several GB per minute, so do not run the script too long to avoid the impact to the production server. */
CREATE EVENT SESSION [AlwaysOn_Data_Movement_Tracing] ON SERVER ADD EVENT sqlserver.file_write_completed, ADD EVENT sqlserver.file_write_enqueued, ADD EVENT sqlserver.hadr_apply_log_block, ADD EVENT sqlserver.hadr_apply_vlfheader, ADD EVENT sqlserver.hadr_capture_compressed_log_cache, ADD EVENT sqlserver.hadr_capture_filestream_wait, ADD EVENT sqlserver.hadr_capture_log_block, ADD EVENT sqlserver.hadr_capture_vlfheader, ADD EVENT sqlserver.hadr_db_commit_mgr_harden, ADD EVENT sqlserver.hadr_db_commit_mgr_harden_still_waiting, ADD EVENT sqlserver.hadr_db_commit_mgr_update_harden, ADD EVENT sqlserver.hadr_filestream_processed_block, ADD EVENT sqlserver.hadr_log_block_compression, ADD EVENT sqlserver.hadr_log_block_decompression, ADD EVENT sqlserver.hadr_log_block_group_commit , ADD EVENT sqlserver.hadr_log_block_send_complete, ADD EVENT sqlserver.hadr_lsn_send_complete, ADD EVENT sqlserver.hadr_receive_harden_lsn_message, ADD EVENT sqlserver.hadr_send_harden_lsn_message, ADD EVENT sqlserver.hadr_transport_flow_control_action, ADD EVENT sqlserver.hadr_transport_receive_log_block_message, ADD EVENT sqlserver.log_block_pushed_to_logpool, ADD EVENT sqlserver.log_flush_complete , ADD EVENT sqlserver.log_flush_start, ADD EVENT sqlserver.recovery_unit_harden_log_timestamps ADD TARGET package0.event_file(SET filename=N'c:\mslog\AlwaysOn_Data_Movement_Tracing.xel',max_file_size=(500),max_rollover_files=(4)) WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
For demo purpose I just run “insert into [AdventureWorks2014]..t1 values(1)” and then capture the xevent trace on primary and secondary, and below are the screenshot of the captured xevents:
In second synchronous replica:
Note: You may notice that the log_block_id (146028889512) of hadr_receive_harden_lsn_message is not the same as others (146028889488). This is because the return id is always the next immediate id of the harden log block, we can use hadr_db_commit_mgr_update_harden xevent to correlate the xevents:
With above xevent log we now get below detailed time latency breakdowns of the transaction commit:
I list the time delta (the latency) just for network and log harden process there, there could be other latency places happening as well, like log block compressing/decompressing etc, but mainly the latency comes from these three parts:
To get the total transaction delay, we cannot simply sum them up because Log flush on primary and the network transfer are happening in parallel. Say, the network takes 4.494 seconds, but primary log harden finished (log_flush_complete:2018-03-06 16:56:28.8785928) far before the primary gets confirmation from replica (hadr_receive_harden_lsn_message:2018-03-06 16:56:33.3732126). Luckily, we do not need to manually determine which timestamp to use to calculate the total commit time of a transaction. We can use the time delta between the two hadr_log_block_group_commit xevents to know the time to commit. For example, in above log:
Primary: hadr_log_block_group_commit: 2018-03-06 16:56:28.2167393
Primary: hadr_log_block_group_commit: 2018-03-06 16:56:33.3732847
Total time to commit=delta of above two timestamps= 5.157 seconds
This number is equal to the network transfer time plus the log harden time on the secondary. This makes sense because the secondary has to wait for the network for the log block available before log harden, it cannot harden the log in parallel like in the primary.
If you look at the second hadr_log_block_group_commit event it has a column of “processing_time” which is the exactly the commit time of the transaction that we are talking about:
So now you have an overall picture about the log block movement between synchronous-commit mode replicas, and you know where is the latency (if any) from, replicas, network, or disk (log harden) or others.
By the way, you may notice that there is “hadr_db_commit_mgr_harden_still_waiting” xevent happening in primary Xevents. This event happens every 2 seconds (the 2 seconds is hardcoded) when the primary is waiting for the acknowledge message from second replica. If the ack comes back within 2 seconds you won’t see this xevent.
I have developed a tool to analyze the trace automatically:
AGLatency report tool V2.2 introduction | Microsoft Docs
New in SSMS - Always On Availability Group Latency Reports | Microsoft Docs
AlwaysOn Availability Groups Troubleshooting and Monitoring Guide | Microsoft Docs
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.