First published on MSDN on Jul 13, 2018
Recently a customer reported an interesting issue, while querying against recently added readable replica, SELECT statement is shown as suspended and session is shown as waiting on HADR_DATABASE_WAIT_FOR_TRANSITION_TO_VERSIONING
Here is a simple way to reproduce the scenario
Now on current primary replica, change one of the remote replica to readable
However, running a SELECT Statement against newly available readable replica, query appeared to be "suspended"
Upon more investigation, it appeared to be waiting on with a wait type HADR_DATABASE_WAIT_FOR_TRANSITION_TO_VERSIONING
The behavior is by design as mention in the SQL Server product documentation and applicable to all version of SQL Server that supports availability group.
"As soon as a readable secondary replica joins the availability group, the secondary replica can start accepting connections to its secondary databases. However, if any active transactions exist on a primary database, row versions will not be fully available immediately on the corresponding secondary database. Any active transactions that existed on the primary replica when the secondary replica was configured must be committed or rolled back. Until this process completes, the transaction isolation level mapping on the secondary database is incomplete and queries are temporarily blocked."
This signals that there is at least one transaction open on the primary system which had been opened before you switched the secondary to become a readable secondary. As mentioned, when switching to a readable secondary, we need to provide committed versions before the Redo Thread is performing changes to the data. However, with a transaction still open on the primary that has been open before switching to readable secondary, such a committed version of the data could not be provided on the secondary. Hence the select will wait until the transition is successful. The transition becomes successful, when all the transactions that were open before switching are committed or rolled back on the primary.
Blog Post
Core Infrastructure and Security Blog
2 MIN READ
AlwaysOn Availability Groups: Wait Type HADR_DATABASE_WAIT_FOR_TRANSITION_TO_VERSIONING
ChiragShah-MSFT
Microsoft
Mar 15, 20191 Comment
Sort By
- EitanBluminBrass Contributor
We experienced this in production today after having to restart our secondary replica due to maintenance.
Everything got stuck after it was brought back up due to old open transactions on the primary.
This article helped me pinpoint the root cause.
I used the query below to find the exact offending sessions:
select trn.name as tran_name, trn.transaction_begin_time , duration = ISNULL(NULLIF(CONVERT(varchar(100), DATEDIFF(dd,0, GETDATE() - trn.transaction_begin_time)), 0) + 'd,', '') + CONVERT(varchar(100), GETDATE() - trn.transaction_begin_time, 114) , strn.session_id , t.event_info as inputbuffer , ses.program_name , ses.host_name , ses.login_name , ses.host_process_id from sys.dm_tran_active_transactions as trn left join sys.dm_tran_session_transactions AS strn on trn.transaction_id = strn.transaction_id outer apply sys.dm_exec_input_buffer(strn.session_id, default) as t left join sys.dm_exec_sessions as ses on strn.session_id = ses.session_id where trn.transaction_begin_time < DATEADD(minute, -10, GETDATE()) -- set to time before secondary became readable order by transaction_begin_time asc