This Blog details how to comprehensively monitor and effectively troubleshoot Change Data Capture (CDC) using extended events. CDC Performance can be monitored using the following two CDC extended events that were introduced in SQL Server 2022.:
These extended events were added to aid in troubleshooting and enable alerting while CDC jobs are running. These debug events provide the customer with a way to prove CDC configuration throughput and expose any errors encountered.
Change Data Capture (CDC) is currently available in SQL Server (all supported versions), Azure SQL Managed Instance, and Azure SQL Database. CDC records insert, update, and delete activity that applies to a table. This makes the details of the changes available in an easily consumed relational format. Column information and the metadata that is required to apply the changes to a target environment is captured for the modified rows and stored in change tables that mirror the column structure of the tracked source tables. Table-valued functions are provided to allow systematic access to the change data by consumers. Learn more here.
What do the new SQL 2022 CDC extended events track?
CDC jobs include:
Capture Job - Captures changes into capture table.
Clean-up Job - Removes older data from capture table.
Each event will cover the different CDC job activities.
This event reports change transactions as they flow into the change table. The following columns are returned detailing key metrics as data is captured into the change table:
The CDC process scans the transaction log in 7 distinct phases to completion as follows:
1: Reading configuration
2: First scan, building hash table
3: Second scan (Partial Rollback)
4: Second scan (Error)
5: Second scan (Partial Rollback Error)
6: Schema versioning
7: Last scan
session_id = 0, this value is always "Aggregate".
This event reports change transactions as they are purged to the retention point set for the change table. For long running CDC Cleanup jobs this event will report as each change table is cleared every 5 minutes. The following columns are returned detailing key metrics as data is removed from the change table:
The columns listed above along with a reporting timestamp, will be updated every 5 minutes (so long as a process takes 5 minutes). That is, if a change-table only takes 5 minutes to clear, telemetry will be returned once that table has been cleared. On the other hand, if a table were to take 6 minutes to clear, there would be a row returned after 5 minutes and an additional row with the final number of rows cleared at the end of the cleanup process for the table.
Once the last CDC change table has been cleared, the cleanup process will move on to clear records from the LSN_Time_Mapping table. The same 5-minute reporting process outlined above until the table is cleared. This design was intended to provide a still alive signal, as well as report on the progress for long running cleanup jobs.
This section will provide step by step instructions and scripts to enable CDC and monitor:
Ensure SQL Server Agent is configured.
CLR is required for CDC. Execute the following script to enable CLR:
USE master; GO --Enable CLR EXEC sp_configure 'clr enabled' , '1'; RECONFIGURE; --Ensure CLR was enabled: EXEC sp_configure 'clr enabled';
Generate a database by running the following script:
use [master] GO Create database MyDB; GO --Size the database for testing USE [master] GO ALTER DATABASE [MyDB] MODIFY FILE ( NAME = N'MyDB', SIZE = 102400KB , FILEGROWTH = 256000KB ) GO ALTER DATABASE [MyDB] MODIFY FILE ( NAME = N'MyDB_log', SIZE = 307200KB , FILEGROWTH = 256000KB ) GO
Generate a simple generic table by running the following script:
USE MyDB; GO CREATE TABLE T1( C1 BIGINT PRIMARY KEY, C2 NVARCHAR(22)); --Ensure tables were generated SELECT * from sysobjects where type = 'U';
Configure the table generate for CDC:
The schema owner of the database needs to be able to see all the CDC components. An easy workaround is to set the owner at least temporarily to ‘SA’ (it can be switched back to the actual schema owner once all the CDC tables are built.)
The following script shows how to enable CDC on a database:
-- Enable Database for CDC template exec sp_changedbowner [sa] -- ==== USE MyDB GO EXEC sys.sp_cdc_enable_db ; GO -- ========= -- Enable sample Tables for CDC -- ========= EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'T1', @role_name = NULL, @supports_net_changes = 1 GO --verify cdc tables have been generated. use myDB; select * from cdc.dbo_T1_CT; select * from sysobjects where type = 'U'; --check target table select * from T1;
The new empty CDC table will display no rows as pictured below.
Generate a repl_logscan_session event by running the following script:
USE master; GO --Update filename and path as appropriate CREATE EVENT SESSION [cdc_test] ON SERVER ADD EVENT sqlserver.repl_logscan_session ADD TARGET package0.event_file(SET filename=N'C:\output\xevents\cdc_test_event.xel') WITH (STARTUP_STATE=OFF) GO
At this juncture, we are ready to commit some transactions to our table and can watch as CDC runs through each phase of copying changes.
Verify that the event session was generated.
Open SSMS object explorer, the database instance, management, extended events, sessions:
Right click on the session generated earlier and select start session:
Then select Watch Live Data.
Note: We have not kicked off the cleanup job, nor have we made any changes yet, so we will just see the capture event scanning.
Now, right click the columns to show in table. E.g., right click on articles completed and select show in table for desired columns.
Continue adding columns to the table:
Notice the event displays the log scan session is running checking if there have been any changes to the table without any further configuration!
With no inserts to the change table, the session just sows log scans waiting for an update.
Generate CDC transactions (open a new SSMS query window) and execute the following script:
Note: Change the cnt1 limit to the desired number of rows (currently set to 255.)
--Start with the next row count and load table --with ascii character rows --Replace 255 with desired number of rows. --Note: arithmetic overflow at 1000000000 for ---int conversion so limit max rows. -- use myDB; DECLARE @cnt INT, @cnt1 INT select @cnt= (select count(*) from T1)+1; select @cnt1= (select count(*) from T1)+255; WHILE @cnt < @cnt1 BEGIN INSERT INTO T1 VALUES (@cnt, char(@cnt)); SET @cnt = @cnt + 1;
Switch back to the Live Data tab in SSMS to view telemetry from repl_logscan_session.
repl_logscan_session event shows:
Great! Everything is going well!
After performing steps through the previous section, verify all changes made to our source table “T1” were captured to our change table “cdc.dbo_T1_CT” by executing the following script:
select * from cdc.dbo_T1_CT;
Check the CDC jobs settings for capture and cleanup by running this procedure:
The capture job is set to run continuously with a polling interval of 5 seconds.
The clean-up job Retention for clean-up is set too long for our testing (4320 minutes).
To test the purge process for 3-minute retention reconfigure the cleanup job using the following script:
EXECUTE sys.sp_cdc_change_job @job_type = N'cleanup', @retention = 3 --three-minute retention GO
In most cases, retention would be longer e.g., the default 3 days (4320 minutes), or weekly, or monthly, or quarterly. 3 minutes to clear the change table makes for a convenient demonstration. Once the cleanup job has completed, we should have completely emptied the table. After these demos have completed, ensure to set the desired cleanup job retention setting as appropriate.
Wait at least 3 minutes.
Then run the clean-up job by right clicking on the job and select start job at step:
Once the job has completed the status window should show “Success” as pictured below.
Execute the script from the beginning of this section to verify the captured changes have been cleared from the change table:
Obviously, 3-minute retention is not a real-world setting, but we are testing clearing the change tables so 3 minutes will keep our demo moving and provides a good visual.
Using this telemetry event, we can also monitor partial transactions:
SQL Server transaction commands (see more on transaction control language):
repl_logscan_session will report scan phase 3 for any partial rollbacks.
To generate a partial rollback, the following script will:
The script below will generate a transaction partially committed using a save point. The save point ensures that rollbacks will not affect transactions we intend to commit later in the process. That is, the save point maintains our contract with the ACID properties.
(Note: Adjust insert values as needed, the T1 table during this exercise had 819 rows.)
--Check last row and beyond select *from T1 where C1 > 819; select * from cdc.dbo_T1_CT; ------- BEGIN TRAN; INSERT INTO T1 VALUES (820, 'A'); PRINT 'A'; SAVE TRANSACTION FirstInsert select @@TRANCOUNT INSERT INTO T1 VALUES (821, 'B'); PRINT 'B'; INSERT INTO T1 VALUES (822, 'C'); PRINT 'C'; GO --Roll them back ROLLBACK TRANSACTION FirstInsert select @@TRANCOUNT BEGIN TRAN; --2 INSERT INTO T1 VALUES (823, 'D'); PRINT 'D'; COMMIT --2 select @@TRANCOUNT PRINT 'Commit 2'; INSERT INTO T1 VALUES (824, 'E'); PRINT 'E'; PRINT 'Commit E'; --Note this would also work if we just executed another --commit, but the demo just looks better ending with --a command. select @@TRANCOUNT select *from T1 where C1 > 819; select * from cdc.dbo_T1_CT;
In SQL Server Management Studio (SSMS) check the Event’s Live Data tab:
Taking a closer look at the event columns.
CDC enters scan phase 3 - repl_logscan_session shows scan phase 3 indicating a Partial rollback. The save point issued after the rollback command indicates 1 log record but no commands executed.
Validate that the save-point transaction was committed into the source table by running the following script.
select *from T1 where C1 > 819;
Validate that the save-point transaction was committed into the change table by running the following script.
select * from cdc.dbo_T1_CT;
In this example, one change capture article contains 2 updates one with 2 commands and one with just 1 command. That is, A was saved as a partial update for the next commit, while B and C were rolled back. Finally, A and D were committed and finally E was committed closing out the open transaction at the beginning of the script.
Our partially committed A update made its way through the CDC process into the copy table.
Execute the following script to generate an event session for cdc_clean_up_job:
Update filename and path as needed.
CREATE EVENT SESSION [cdc_clean__up_job_test] ON SERVER ADD EVENT sqlserver.cdc_cleanup_job_status ADD TARGET package0.event_file(SET filename=N'C:\output\xevents\cdc_clean__up_job_test_event.xel') WITH (STARTUP_STATE=OFF) GO
Using SSMS, expand object explorer as described earlier, right-click and select start session:
Note: This event will start showing data if and only if the change data capture cleanup job is running.
If a cleanup job runs for more than 5 minutes this event provides a way to validate that the process is not encountering any hindrances and the error column will display any errors encountered.
Execute the following script a new table to perform a cleanup job test:
CREATE TABLE T3( C1 BIGINT PRIMARY KEY, C2 NVARCHAR(100) NULL);
Execute the following script to enable CDC on the new table T3:
EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'T3', @role_name = NULL, @supports_net_changes = 1 GO
Generate enough transactions to require the cleanup job to run for at least 10 minutes by running the following script:
USE myDB; --Start with the next row count and load table with ascii character rows --Note: arithmetic overflow at 1000000000 for int conversion --so don't go that far. DECLARE @cnt INT, @cnt1 INT select @cnt= (select count(*) from T3)+1; select @cnt1= (select count(*) from T3)+5500000; WHILE @cnt < @cnt1 BEGIN INSERT INTO T3 VALUES (@cnt, '500000'); SET @cnt = @cnt + 1; END;
Execute the following script to get a starting point before running the cleanup job:
select * from T3; select * from cdc.dbo_T3_CT;
Using SSMS, Initiate the cleanup job by selecting “Start Job at Step”.
Return to the Watch live data tab in SSMS and in 5 minutes the cdc_cleanup_job_status job will add a row to the viewer. Right click on the details fields to show in the table view.
After 5 minutes we get a row for this job run with nothing in the cleanup errors column.
The job ran for just over 20 minutes on the test system.
(Keep in mind, the retention level set was 3 minutes and we waited 3 minutes before running cleanup, therefore the capture table should be empty.)
Verify the cleanup job by running the script:
select count(*) from T3; select count(*) from cdc.dbo_T3_CT;
The cleanup job successfully cleared all rows from the capture table entered more than 3 minutes ago.
This document reviewed how to monitor the CDC processes using new extended events.
Capture Job - Captures changes into capture table.
Clean-up Job - Removes older data from capture table.
We demonstrated that repl_logscan_session details capture job progress:
We watched 5-minute updates using the cdc_cleanup_job_status to check progress as the cleanup job ran over a 20-minute period.
What is change data capture(CDC)?
Enable and Disable change data capture – SQL Server
Change Data Capture Tables (Transact-SQL) – SQL Server
Work with Change Data – SQL Server
A more real-world example of CDC enabling change data to be consumed easily and systematically is pictured below and is covered here.
If you have feedback or suggestions for improving this CDC asset, please contact the Azure Databases SQL Customer Success Engineering Team.
Thanks for your support!
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.