changetracking
2 TopicsPart 2: Safely Cleaning Orphaned Records in Change Tracking Side Tables
Applies to: Azure SQL Database (Change Tracking enabled) Recap (Part 1) In Part 1, we covered how to detect “orphaned” records in Change Tracking (CT) side tables — rows whose sys_change_xdes_id no longer has a matching transaction entry in the commit table (sys.syscommittab). This situation often leads to unexpected CT growth and “stuck cleanup” symptoms because the mapping data required for normal cleanup is missing. Part 1 link: Identifying Orphaned Records in Change Tracking Side Tables (Read‑Only Health Check) Why Part 2 is needed A common “root pattern” we see in the field is: Side-table cleanup attempts to delete expired metadata Some side-table deletions fail (locks/timeouts/errors) Commit-table cleanup proceeds anyway (or a custom workflow deletes from commit table without validating side-table deletes) Remaining side-table rows now reference xdes_id values that no longer exist in sys.syscommittab → orphans Microsoft Learn also emphasizes that syscommittab cleanup depends on side-table cleanup — commit-table cleanup should happen only after side tables are cleaned. This Part 2 script focuses on removing orphaned rows from side tables (and does not touch sys.syscommittab), so cleanup logic can stabilize again. Important prerequisites & constraints (read this first) 1) Internal table access on Azure SQL Database In Azure SQL Database, customers may not be able to access certain internal CT artifacts directly (even when attempting DAC-style workflows). In the related case discussions, internal testing noted that self‑service cleanup against internal tables can be infeasible. 2) CHECKPOINT note (why it’s in the script) sys.dm_tran_commit_table exposes commit-table data and is backed by sys.syscommittab. Microsoft Learn notes that read-only users may not see live changes until a CHECKPOINT occurs. That’s why your script includes the optional CHECKPOINT comment before reading commit-table state. 3) Supported guidance vs. custom remediation Microsoft Learn provides official troubleshooting/mitigation guidance for CT cleanup issues (including checking dbo.MSChange_tracking_history, assessing stale rows, and using sp_flush_commit_table_on_demand for commit-table cleanup). Your script is a targeted remediation pattern for a specific failure mode (orphaned side-table rows). Use it carefully, test first, and follow organizational approval processes. What this script does (high-level) T-SQL script is essentially Part 1 detection + optional targeted delete generation: Compute the “safe cleanup point” from CT retention (wall-clock → CSN) using sp_changetracking_time_to_csn — the same concept used in Part 1. Enumerate CT side tables via sys.internal_tables where internal_type = 209 (CT side tables). For each side table, identify candidate orphaned transaction IDs (sys_change_xdes_id) that: are older than a computed boundary (@minXdesId derived from sys.dm_tran_commit_table), and have no matching xdes_id in sys.syscommittab at/before the cleanup point Print orphan counts per side table using RAISERROR … WITH NOWAIT (operator-friendly, streaming output). Safety cross-check: abort if any “orphan” unexpectedly exists in sys.syscommittab (defensive sanity gate) Generate a DELETE statement for the current side table (execution is commented out) The script (Part 2) — “detect + generate delete” Below is T-SQL script. I kept the delete step disabled by default, so it remains safe to share. (You can enable execution only after approvals/testing.) -- use <[DBName]> -- switch to the right database -- run checkpoint first to ensure all in-memory commit table data is persisted to disk (syscommittab) -- checkpoint SET NOCOUNT ON -- find the invalid clean version based on configured retention DECLARE time DATETIME, @csn BIGINT = 0, @minCleanupPoint BIGINT = 0 DECLARE @retention_period INT, @retention_period_units NVARCHAR(10) SELECT @retention_period = retention_period, @retention_period_units = retention_period_units FROM sys.change_tracking_databases where database_id = DB_ID() SELECT time = CASE WHEN @retention_period_units = 1 then DATEADD(minute, (-1 * @retention_period), GETUTCDATE()) WHEN @retention_period_units = 2 then DATEADD(hour, (-1 * @retention_period), GETUTCDATE()) ELSE DATEADD(day, (-1 * @retention_period), GETUTCDATE()) END EXEC sp_changetracking_time_to_csn time = time, @csn = @csn OUTPUT SELECT @minCleanupPoint = @csn SELECT @minCleanupPoint as minCsn -- 688118 -- iterate over all the change tracking side tables DECLARE @sideTable SYSNAME; DECLARE ct_cursor CURSOR FAST_FORWARD FOR SELECT name FROM sys.internal_tables WHERE internal_type = 209; -- internal_type = 209 is for change tracking side tables OPEN ct_cursor; FETCH NEXT FROM ct_cursor INTO @sideTable; WHILE @@FETCH_STATUS = 0 BEGIN -- find the minimum expired xdes id declare @minXdesId BIGINT SELECT @minXdesId = min(xdes_id) FROM sys.dm_tran_commit_table where commit_ts <= @minCleanupPoint -- SELECT @minXdesId as minXdes -- create temp table for storing orphaned xdes id DROP TABLE IF EXISTS #OrphanedXdes; CREATE TABLE #OrphanedXdes ( sys_change_xdes_id BIGINT NOT NULL ); DECLARE @sql NVARCHAR(MAX); SET @sql = N' INSERT INTO #OrphanedXdes(sys_change_xdes_id) SELECT ct.sys_change_xdes_id FROM sys.' + QUOTENAME(@sideTable) + N' AS ct WHERE ct.sys_change_xdes_id < @minXdesId AND NOT EXISTS ( SELECT 1 FROM sys.syscommittab AS s WHERE s.xdes_id = ct.sys_change_xdes_id AND s.commit_ts <= @minCleanupPoint );'; EXEC sys.sp_executesql @sql, N'@minXdesId BIGINT, @minCleanupPoint BIGINT', @minXdesId = @minXdesId, @minCleanupPoint = @minCleanupPoint; DECLARE @orphanedIdsCount BIGINT; SET @sql = N' SELECT @cnt = COUNT_BIG(sys_change_xdes_id) FROM #OrphanedXdes; '; EXEC sys.sp_executesql @sql, N'@cnt BIGINT OUTPUT', @cnt = @orphanedIdsCount OUTPUT; -- Raise error if any orphaned xdes exists IF (@orphanedIdsCount > 0) BEGIN DECLARE @msg NVARCHAR(4000) = @sideTable + N' : ' + CONVERT(NVARCHAR(30), @orphanedIdsCount); RAISERROR (@msg, 16, 1) WITH NOWAIT; DECLARE @newLine NVARCHAR(10) = CHAR(13) + CHAR(10) PRINT (@newLine) END -- Cross-check that no xdes should exist in syscommittab -- !!!IMPORTANT!!! RAISE an error and stop the cleanup if it does SET @sql = N' DECLARE @nonMatchingXdesCount BIGINT; SELECT @nonMatchingXdesCount = COUNT_BIG(*) FROM #OrphanedXdes AS ct WHERE EXISTS ( SELECT 1 FROM sys.syscommittab AS s WHERE s.xdes_id = ct.sys_change_xdes_id ); -- SELECT @nonMatchingXdesCount as nonMatchingXdesCount IF (COALESCE(@nonMatchingXdesCount, 0) > 0) BEGIN TRY DECLARE @msg NVARCHAR(1024); SET @msg = N''Cleanup aborted: orphan cross-check failed for side table [' + @sideTable + N'].''; RAISERROR(@msg, 16, 1) WITH NOWAIT; RETURN; END TRY BEGIN CATCH THROW; END CATCH '; EXEC sys.sp_executesql @sql; IF (@orphanedIdsCount > 0) BEGIN -- Prepare the query to delete the orphaned rows from the side table SET @sql = N'DELETE ct FROM sys.' + @sideTable + N' ct WHERE EXISTS (SELECT 1 FROM #OrphanedXdes AS o WHERE o.sys_change_xdes_id = ct.sys_change_xdes_id);'; SELECT @sql -- validate the delete query is correctly generated -- Sample delete statement: DELETE ct FROM sys.change_tracking_1221579390 ct WHERE EXISTS (SELECT 1 FROM #OrphanedXdes AS o WHERE o.sys_change_xdes_id = ct.sys_change_xdes_id); -- NOTE: Uncomment the below query to execute the delete statement and remove the orphaned records -- EXEC sys.sp_executesql @sql; END DROP TABLE IF EXISTS #OrphanedXdes; FETCH NEXT FROM ct_cursor INTO @sideTable; END CLOSE ct_cursor; DEALLOCATE ct_cursor; SET NOCOUNT OFF Why the “cross-check abort” is a good idea Notice the safety gate: You first define orphans as those not existing in sys.syscommittab (for the cleanup horizon). Then you re-check: “If any of these appear in syscommittab, abort.” This prevents an accidental delete if: the cleanup horizon math is wrong, the environment has unexpected visibility differences, or the temp-table contents are not what you expect. That defensive posture aligns well with the general principle documented in Microsoft Learn: commit-table cleanup should only occur after side-table cleanup, and troubleshooting should be data-driven and cautious. How to interpret the output If you see no RAISERROR lines, the script did not find orphaned rows under the defined criteria. If you see: change_tracking_<id> : <count> that indicates <count> orphaned transaction references in that CT side table. This is the same style used in Part 1 for long-running, streaming progress. Next steps (recommended order) Confirm CT configuration (retention + AUTO_CLEANUP status) using official guidance. Run Part 1 / Part 2 detection to quantify scope (which side tables, how many). If you need to remediate: Prefer supported mitigations where possible (for example, disable/enable CT on a table to purge tracking metadata for that table is listed as a “quickest remedy” in Microsoft Learn for certain cleanup lock conflict scenarios). If table-level disable/enable isn’t acceptable, use an approval-driven approach for targeted cleanup. Closing Orphaned CT side-table records are one of those “silent growth” conditions that can be easy to miss until storage or CHANGETABLE performance becomes painful. Part 1 helps you spot the issue early; Part 2 helps you prepare a safe, targeted cleanup workflow — with explicit safety gates and a delete step that remains disabled by default. References Identifying Orphaned Records in Change Tracking Side Tables (Read‑Only Health Check) Troubleshoot change tracking auto cleanup issues sys.dm_tran_commit_table (Change Tracking)Troubleshooting Change Tracking cleanup growth and orphaned rows in Azure SQL Database
Applies to: Azure SQL Database Scenario: Change Tracking (CT) side tables grow unexpectedly, and “orphaned” rows appear after switching between auto-cleanup and custom/scheduled manual cleanup. The problem (what we observed) In the case tracked, the discussion focused on Change Tracking cleanup behavior—including unexpected growth in CT side tables and orphaned records. The customer also referenced earlier guidance to move away from auto-cleanup due to locking concerns during upgrades, and the team needed to propose safe next steps quickly. A parallel concern was that CT auto-cleanup could block DDL during upgrades (schema lock behavior), which triggered work to deploy a fix and validate it in a lab before broader rollout. Why this is tricky The engagement highlighted that manual cleanup and auto-cleanup can behave differently in real-world, high-scale environments (large number of CT-enabled tables, heavy activity, and operational constraints like access and auditing). Investigation efforts included: validating where orphaned rows exist and how many CT side tables are affected, checking whether auto-cleanup is enabled/disabled, and using auditing / Extended Events to identify who/what is dropping related history objects. Additionally, Snapshot Isolation can prevent cleanup from progressing in some cases. it was noted that long-running snapshot transactions can prevent a safe cleanup point from advancing, which can block removal of expired entries from internal commit tracking tables until those transactions complete. Practical troubleshooting steps (what helped) 1) Confirm CT configuration (retention + auto-cleanup) Use the Change Tracking configuration options to validate retention and whether auto-cleanup is enabled. Microsoft Learn documents enabling CT at the database level (including CHANGE_RETENTION and AUTO_CLEANUP). Enable and Disable Change Tracking - SQL Server | Microsoft Learn 2) Quick backlog signal: commit table “oldest commit_time” During the investigation, the team used a lightweight query to sanity-check backlog in the commit table: SELECT TOP (1) * FROM sys.dm_tran_commit_table ORDER BY commit_time ASC; if the returned commit_time is close to the retention horizon, auto-cleanup is likely keeping up (this query doesn’t require DAC). 3) Detect orphaned rows in CT side tables (read-only script) A key artifact from this case is the following T-SQL script, which calculates a cleanup point based on configured retention and then iterates over CT side tables (sys.internal_tables where internal_type = 209) to identify rows whose sys_change_xdes_id no longer has a matching entry in sys.syscommittab at/below the cleanup point. -- use <[DBName]> -- switch to the right database SET NOCOUNT ON -- find the invalid clean version based on configured retention DECLARE time DATETIME, @csn BIGINT = 0, @minCleanupPoint BIGINT = 0 DECLARE @retention_period INT, @retention_period_units NVARCHAR(10) SELECT @retention_period = retention_period, @retention_period_units = retention_period_units FROM sys.change_tracking_databases where database_id = DB_ID() SELECT time = CASE WHEN @retention_period_units = 1 then DATEADD(minute, (-1 * @retention_period), GETUTCDATE()) WHEN @retention_period_units = 2 then DATEADD(hour, (-1 * @retention_period), GETUTCDATE()) ELSE DATEADD(day, (-1 * @retention_period), GETUTCDATE()) END EXEC sp_changetracking_time_to_csn time = time, @csn = @csn OUTPUT SELECT @minCleanupPoint = @csn SELECT @minCleanupPoint as minCsn -- 688118 -- iterate over all the change tracking side tables DECLARE @sideTable SYSNAME; DECLARE ct_cursor CURSOR FAST_FORWARD FOR SELECT name FROM sys.internal_tables WHERE internal_type = 209; -- internal_type = 209 is for change tracking side tables OPEN ct_cursor; FETCH NEXT FROM ct_cursor INTO @sideTable; WHILE @@FETCH_STATUS = 0 BEGIN -- find the minimum expired xdes id declare @minXdesId BIGINT SELECT @minXdesId = min(xdes_id) FROM sys.dm_tran_commit_table where commit_ts <= @minCleanupPoint -- SELECT @minXdesId as minXdes -- create temp table for storing orphaned xdes id DROP TABLE IF EXISTS #OrphanedXdes; CREATE TABLE #OrphanedXdes ( sys_change_xdes_id BIGINT NOT NULL ); DECLARE @sql NVARCHAR(MAX); SET @sql = N' INSERT INTO #OrphanedXdes(sys_change_xdes_id) SELECT ct.sys_change_xdes_id FROM sys.' + QUOTENAME(@sideTable) + N' AS ct WHERE ct.sys_change_xdes_id < @minXdesId AND NOT EXISTS ( SELECT 1 FROM sys.syscommittab AS s WHERE s.xdes_id = ct.sys_change_xdes_id AND s.commit_ts <= @minCleanupPoint );'; EXEC sys.sp_executesql @sql, N'@minXdesId BIGINT, @minCleanupPoint BIGINT', @minXdesId = @minXdesId, @minCleanupPoint = @minCleanupPoint; DECLARE @orphanedIdsCount BIGINT; SET @sql = N' SELECT @cnt = COUNT_BIG(sys_change_xdes_id) FROM #OrphanedXdes; '; EXEC sys.sp_executesql @sql, N'@cnt BIGINT OUTPUT', @cnt = @orphanedIdsCount OUTPUT; -- Raise error if any orphaned xdes exists IF (@orphanedIdsCount > 0) BEGIN DECLARE @msg NVARCHAR(4000) = @sideTable + N' : ' + CONVERT(NVARCHAR(30), @orphanedIdsCount); RAISERROR (@msg, 16, 1) WITH NOWAIT; DECLARE @newLine NVARCHAR(10) = CHAR(13) + CHAR(10) PRINT (@newLine) END DROP TABLE IF EXISTS #OrphanedXdes; FETCH NEXT FROM ct_cursor INTO @sideTable; END CLOSE ct_cursor; DEALLOCATE ct_cursor; SET NOCOUNT OFF Here’s the high-level logic (excerpted/annotated from the script): Read retention settings from sys.change_tracking_databases Convert “retention window” to a cleanup CSN using sp_changetracking_time_to_csn For each CT side table (sys.internal_tables internal_type = 209): compare side-table sys_change_xdes_id vs. sys.syscommittab and count “orphaned” xdes ids emit a message when orphaned counts are present Tip: This is read-only diagnostic logic. In your environment, validate permissions and impact before running in production. 4) If auto-cleanup is unexpectedly disabled, re-enable and monitor In the email thread, the team observed auto-cleanup was disabled in at least one environment and recommended re-enabling it, then monitoring the CT history table to confirm cleanup activity resumes. 5) Use auditing / Extended Events to identify unexpected object drops When investigating why a “history table” disappeared, the team reviewed extended event data and noted evidence of a specific application context associated with the drop (shared in the meeting discussion). This is a key lesson: without auditing, it can be difficult to determine who/what disabled auto-cleanup or dropped relevant objects; the email thread explicitly called this out. Mitigation options discussed (often safest): disable & re-enable Change Tracking on affected tables As an alternative to running manual deletion scripts, the troubleshooting recommended disabling and re-enabling Change Tracking on the set of tables containing orphaned rows—described as a well-established and safer cleanup method that avoids needing elevated access to run cleanup scripts directly. Trade-off: disabling CT on a table removes existing change data from the corresponding side tables for that table. About auto-cleanup performance improvements (why “stay on auto” may be preferred) The troubleshooting included discussion that auto-cleanup is the area that continues to receive improvements and that performance enhancements exist (for example, improved adaptive behavior in newer SQL Server versions). Microsoft Learn describes that SQL Server 2025 introduces an “adaptive shallow cleanup approach” for large side tables, enabled by default, and explains how cleanup behavior changes compared to prior versions. While Azure SQL Database implementation details differ from boxed SQL Server, the key operational takeaway from the discussion was: if possible, prefer auto-cleanup and avoid manual cleanup unless you have a strong reason, because manual cleanup may lack telemetry and can be harder to reason about at scale. Key takeaways Start with read-only validation: use a backlog signal query and an orphan-detection script to quantify scope before making changes. Auditability matters: without auditing/trace evidence, identifying who disabled auto-cleanup or dropped related objects is difficult. Snapshot Isolation can block cleanup progress: long-running snapshot transactions may prevent safe cleanup from advancing. Keep snapshot transactions short where possible. A safe mitigation exists disable/re-enable Change Tracking on affected tables (with awareness of change data loss) can be safer than running deletion scripts. References Microsoft Learn: Change Tracking management overview (permissions, internal tables, behavior considerations). Manage Change Tracking - SQL Server | Microsoft Learn Microsoft Learn: Enable and Disable Change Tracking (retention + auto-cleanup configuration). Enable and Disable Change Tracking - SQL Server | Microsoft Learn