Azure SQL Data Sync is commonly used to synchronize data across Azure SQL Databases and on‑premises SQL Server environments. While the service works well in many scenarios, customers may occasionally encounter a situation where a Sync Group remains stuck in a “Progressing” state and cannot be started, stopped, or refreshed.
This blog walks through a real-world troubleshooting scenario, highlights the root cause, and outlines practical remediation steps drawn from an actual support investigation.
Problem Overview
In this scenario, the customer reported that:
- The Sync Group was stuck in “Progressing” for multiple days
- Sync operations could not be started or stopped
- Tables could not be refreshed or reconfigured
- Azure Activity Logs showed operations as Succeeded, yet sync never progressed
- Our backend telemetry showed the Sync Group as Active, while hub and member databases were in Reprovisioning state
The last successful sync occurred on XX day, after which the sync pipeline stopped making progress.
Initial Investigation Findings
During the investigation, several key observations were made:
1. High DATA IO Utilization
Telemetry and backend checks revealed that DATA IO utilization was pegged at 100% on one of the sync member databases starting XX day.
Despite no noticeable change in application workload, the database was under sustained IO pressure, which directly impacted Data Sync operations.
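The IO pressure described here can be confirmed directly from the affected database. As a minimal sketch (assuming the SqlServer PowerShell module is installed; the server name, database name, and credentials below are placeholders), the `sys.dm_db_resource_stats` view exposes recent resource utilization at roughly 15-second granularity:

```powershell
# Sketch: inspect recent DATA IO utilization on the sync member database.
# Requires the SqlServer module (Install-Module SqlServer);
# server, database, and credential values are placeholders.
$query = @"
SELECT TOP (40)
    end_time,
    avg_cpu_percent,
    avg_data_io_percent,
    avg_log_write_percent
FROM sys.dm_db_resource_stats   -- one row per ~15 seconds, retained ~1 hour
ORDER BY end_time DESC;
"@

Invoke-Sqlcmd -ServerInstance "server01.database.windows.net" `
    -Database "Database01" `
    -Credential (Get-Credential) `
    -Query $query |
    Format-Table -AutoSize
```

Sustained `avg_data_io_percent` at or near 100 is the signature observed in this case.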
2. Deadlocks During Sync Processing
Our backend telemetry showed repeated deadlock errors:
Transaction was deadlocked on lock resources with another process and has been chosen as the deadlock victim.
These deadlocks were observed for multiple Sync Member IDs starting the same day IO saturation began.
This aligned with the hypothesis that resource contention, not a Data Sync service failure, was the underlying issue.
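Customers can look for these deadlocks from their own side as well. Azure SQL Database records aggregated deadlock events in the `sys.event_log` view, queried from the logical `master` database. A minimal sketch (placeholder server name and credentials; requires the SqlServer module):

```powershell
# Sketch: list recent deadlock events recorded for the logical server.
# sys.event_log must be queried from the master database;
# the server name and credentials are placeholders.
$query = @"
SELECT start_time, database_name, event_type, event_count
FROM sys.event_log
WHERE event_type = 'deadlock'
ORDER BY start_time DESC;
"@

Invoke-Sqlcmd -ServerInstance "server01.database.windows.net" `
    -Database "master" `
    -Credential (Get-Credential) `
    -Query $query
```

A cluster of deadlock events beginning on the same day as the IO saturation supports the resource-contention hypothesis.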
3. Metadata Database Was Healthy
The Sync metadata database was running on a serverless Azure SQL Database (1 vCore) and showed healthy resource usage, ruling it out as a bottleneck.
Recommended Troubleshooting Steps
Based on the findings, the following steps were recommended and validated:
✅ Step 1: Address Database Resource Constraints First
Before attempting to recreate or reset the Sync Group, the focus was placed on resolving DATA IO saturation on the affected database.
Actions included:
- Scaling up the database (DTUs / vCores)
- Monitoring IO utilization after scaling
- Ensuring sufficient headroom for sync operations
This was identified as the primary remediation step.
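The scale-up itself can be scripted with the same Azure PowerShell module used later in this post. A sketch, assuming the AzureRM.Sql module is installed; the resource names and the "S3" service objective are placeholders, so pick a tier with enough IO headroom for your workload:

```powershell
# Sketch: scale the affected member database to a larger service objective.
# Resource names and the "S3" objective are placeholders.
Set-AzureRmSqlDatabase `
    -ResourceGroupName "ResourceGroup01" `
    -ServerName "Server01" `
    -DatabaseName "Database01" `
    -RequestedServiceObjectiveName "S3"

# After scaling, monitor DATA IO (portal metrics or
# sys.dm_db_resource_stats) before retrying sync operations.
```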
✅ Step 2: Use the Azure SQL Data Sync Health Checker
The Azure SQL Data Sync Health Checker was recommended to validate:
- Sync metadata integrity
- Table-level configuration issues
- Agent and connectivity status
GitHub tool: AzureSQLDataSyncHealthChecker
✅ Step 3: Validate Sync Group and Agent State via PowerShell
PowerShell was used to confirm:
- Sync Group state
- Last successful sync time
- On‑premises Sync Agent connectivity
Example commands used:
```powershell
# Check the Sync Group state and last successful sync time
Get-AzureRmSqlSyncGroup `
    -ResourceGroupName "ResourceGroup01" `
    -ServerName "Server01" `
    -DatabaseName "Database01" |
    Select-Object SyncGroupName, SyncState, LastSyncTime

# Check the on-premises Sync Agent connectivity
Get-AzureRmSqlSyncAgent `
    -ResourceGroupName "ResourceGroup01" `
    -ServerName "Server01" |
    Select-Object SyncAgentName, State, LastAliveTime
```

Note that `SyncState` and `LastSyncTime` are properties of the Sync Group, while the Sync Agent reports its own `State`. The AzureRM module has since been retired; the equivalent Az cmdlets are `Get-AzSqlSyncGroup` and `Get-AzSqlSyncAgent`.
Resolution
After the customer scaled up the database, DATA IO utilization dropped, sync operations resumed normally, and the customer confirmed that the issue was resolved.