Let us continue our troubleshooting by checking threads in this part.
sp_replcounters
GO
The above shows reader thread replicating on average 115 transactions per second and more than 7.5mln transactions are waiting to be replicated to the distribution database. On average, transactions are waiting 134880secs to be replicated, which is high latency.
SELECT
SessionId = s.session_id,
App = ISNULL(s.program_name, N'')
FROM sys.dm_exec_sessions s
WHERE s.program_name LIKE '%LogReader%'
CREATE EVENT SESSION [LogReaderMonitor] ON SERVER
ADD EVENT sqlos.wait_completed(
ACTION(package0.callstack)
WHERE ([sqlserver].[session_id]=(123))), ---Change session id here
ADD EVENT sqlos.wait_info_external(
ACTION(package0.callstack)
WHERE (([opcode]=('End')) AND ([sqlserver].[session_id]=(123)))), ---Change session id here
ADD EVENT sqlserver.rpc_completed(
ACTION(package0.callstack)
WHERE ([sqlserver].[session_id]=(123))) ---Change session id here
ADD TARGET package0.event_file(SET filename=N'C:\Temp\logreader_reader_track',max_file_size=(256),max_rollover_files=(5))
WITH (MAX_MEMORY=8192 KB,EVENT_RETENTION_MODE=ALLOW_MULTIPLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=ON,STARTUP_STATE=OFF)
GO
If wait time is high compared to CPU time, check wait type and troubleshoot accordingly. For example, on the above example we faced MEMORY_ALLOCATION_EXT wait_type but duration is 0. So, we are not waiting.
If CPU time is higher, this means log thread is running but latency is being observed because you have high load. High load can be caused by several causes:
If Reader Latency is caused by large number of pending commands, waiting for the Log Reader to catch up may be the best short-term solution. Long-term options include replicating batches during non-peak time.
-- Get publisher db id
USE distribution
GO
SELECT * FROM dbo.MSpublisher_databases
-- Get commands we are at
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
GO
BEGIN TRAN
USE distribution
GO
EXEC Sp_browsereplcmds
@xact_seqno_start = 'xact_seqno',
@xact_seqno_end = 'xact_seqno',
@publisher_database_id = PUBLISHERDB_ID
COMMIT TRAN
GO
SELECT
SessionId = s.session_id,
App = ISNULL(s.program_name, N'')
FROM sys.dm_exec_sessions s
WHERE s.program_name LIKE '%LogReader%'
sp_who2
CREATE EVENT SESSION [logreader_writer_track] ON SERVER
ADD EVENT sqlos.wait_completed(
ACTION(package0.callstack)
WHERE ([sqlserver].[session_id]=(64))), -- Change session id to log reader writer session id
ADD EVENT sqlos.wait_info_external(
ACTION(package0.callstack)
WHERE (([opcode]=('End')) AND ([sqlserver].[session_id]=(64)))), -- Change session id to log reader writer session id
ADD EVENT sqlserver.sp_statement_completed(
ACTION(package0.event_sequence,sqlserver.plan_handle,sqlserver.session_id,sqlserver.transaction_id)
WHERE ([sqlserver].[session_id]=(64))) -- Change session id to log reader writer session id
ADD TARGET package0.event_file(SET filename=N'C:\Temp\logreader_writer_track',max_file_size=(256),max_rollover_files=(5))
WITH (MAX_MEMORY=8192 KB,EVENT_RETENTION_MODE=ALLOW_MULTIPLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=ON,STARTUP_STATE=OFF)
SELECT * FROM sys.dm_exec_query_plan(PLAN_HANDLE)
SELECT
SessionId = s.session_id,
App = ISNULL(s.program_name, N'')
FROM sys.dm_exec_sessions s LEFT OUTER JOIN sys.dm_exec_connections c ON (s.session_id = c.session_id)
WHERE (select text from sys.dm_exec_sql_text(c.most_recent_sql_handle)) LIKE '%sp_MSget_repl_command%'
sp_who2
CREATE EVENT SESSION [distributor_writer_track] ON SERVER
ADD EVENT sqlos.wait_completed(
ACTION(package0.callstack)
WHERE ([sqlserver].[session_id]=(64))), -- Change session id to dist agent session id
ADD EVENT sqlos.wait_info_external(
ACTION(package0.callstack)
WHERE (([opcode]=('End')) AND ([sqlserver].[session_id]=(64)))), -- Change session id to dist agent session id
ADD EVENT sqlserver.sp_statement_completed(
ACTION(package0.event_sequence,sqlserver.plan_handle,sqlserver.session_id,sqlserver.transaction_id)
WHERE ([sqlserver].[session_id]=(64))) -- Change session id to dist agent session id
ADD TARGET package0.event_file(SET filename=N'C:\Temp\distributor_reader_track',max_file_size=(256),max_rollover_files=(5))
WITH (MAX_MEMORY=8192 KB,EVENT_RETENTION_MODE=ALLOW_MULTIPLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=ON,STARTUP_STATE=OFF)
SELECT * FROM sys.dm_exec_query_plan(PLAN_HANDLE)
High CPU time can often mean there is a high load which can be caused by large batch of replicated transactions. You can compare the number of commands and transactions by using the below query.
SELECT count(c.xact_seqno) as CommandCount, count(DISTINCT t.xact_seqno) as TransactionCount
FROM MSrepl_commands c with (nolock)
LEFT JOIN msrepl_transactions t with (nolock)
on t.publisher_database_id = c.publisher_database_id and t.xact_seqno = c.xact_seqno
WHERE c.publisher_database_id = 1 --Change to target database id here
For the past days’ statistics, you can leverage below command:
USE distribution
select t.publisher_database_id, t.xact_seqno,
max(t.entry_time) as EntryTime, count(c.xact_seqno) as
CommandCount, count(DISTINCT t.xact_seqno) as TransactionCount
into #results
FROM MSrepl_commands c with (nolock)
LEFT JOIN msrepl_transactions t with (nolock)
on t.publisher_database_id = c.publisher_database_id
and t.xact_seqno = c.xact_seqno
GROUP BY t.publisher_database_id, t.xact_seqno
SELECT publisher_database_id
,datepart(year, EntryTime) as Year
,datepart(month, EntryTime) as Month
,datepart(day, EntryTime) as Day
,datepart(hh, EntryTime) as Hour
,sum(CommandCount) as CommandCountPerTimeUnit
,sum(TransactionCount) as TransactionCountPerTimeUnit
FROM #results
GROUP BY publisher_database_id
,datepart(year, EntryTime)
,datepart(month, EntryTime)
,datepart(day, EntryTime)
,datepart(hh, EntryTime)
ORDER BY publisher_database_id, Month, Day, Hour
As you see, I am executing one command per transaction making TransactionCount nearly equal to CommandCount.
SELECT
SessionId = s.session_id,
App = ISNULL(s.program_name, N'')
FROM sys.dm_exec_sessions s
WHERE s.program_name LIKE '%publish%'
GO
sp_who2
CREATE EVENT SESSION [distributor_writer_track] ON SERVER
ADD EVENT sqlos.wait_completed(
ACTION(package0.callstack,sqlserver.session_id,sqlserver.sql_text)
WHERE ([sqlserver].[client_app_name]=N'SQLVM4-TRANSACR_AdventureWorksLT_test_table_pub' AND [package0].[greater_than_uint64]([duration],(0)))),
ADD EVENT sqlos.wait_info_external(
ACTION(package0.callstack,sqlserver.session_id,sqlserver.sql_text)
WHERE ([sqlserver].[client_app_name]=N'SQLVM4-TRANSACR_AdventureWorksLT_test_table_pub' AND [package0].[greater_than_uint64]([duration],(0)))),
ADD EVENT sqlserver.sp_statement_completed(
ACTION(package0.event_sequence,sqlserver.plan_handle,sqlserver.session_id,sqlserver.transaction_id)
WHERE ([sqlserver].[client_app_name]=N'SQLVM4-TRANSACR_AdventureWorksLT_test_table_pub'))
ADD TARGET package0.event_file(SET filename=N'C:\Temp\distributor_writer_track',max_file_size=(5),max_rollover_files=(5))
WITH (MAX_MEMORY=8192 KB,EVENT_RETENTION_MODE=ALLOW_MULTIPLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=ON,STARTUP_STATE=OFF)
GO
Compare wait time (duration – cpu_time) and cpu_time. If wait time is high compared to CPU time, check wait_type and troubleshoot accordingly. For example, above we face NETWORK_IO wait_type. If CPU time is high, you can investigate the execution plan by using corresponding plan_handle for time consuming query which you can get from above event logs:
SELECT * FROM sys.dm_exec_query_plan(PLAN_HANDLE)
Supervisor: Collin Benkler, Sr EE for SQL Server in Microsoft
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.