First published on MSDN on Jul 25, 2017
The Azure SQL DataSync workflow is carried by different tasks at different stages. The first task is to provision destination databases (Initial Synchronization). Then data changes (update/insert/delete) are propagated to the destination by sync tasks. If any changes are made in dataset or schema, re-provisioning occurs. Deprovisioning is engaged when a database is removed from the DataSync topology. Below are some “best practice” recommendations for each of the tasks:
The provisioning task creates customer table schemas defined in the dataset. For each table, a tracking table and a set of stored procedures and triggers are also created; then the tables are populated. After provisioning, the source and destination database will have the same data defined by the dataset.
Please do not remove those system objects, it could break the sync and leave the database in an unpredictable condition.
For the best performance, whenever possible, start the initial synchronization with data in only the source databases. If you have data in both the source and destination tables, the provisioning task treats each row as a conflict and pursues conflict resolution, causing the provisioning to be much slower.
Avoid Synchronization loops. Synchronization loops are formed if the databases are added in two or more sync groups such that a change in a database in one sync group is re-written to the same database by another sync group. There are two scenarios that lead to a synchronization loop:
A circular reference involving two or more sync groups.
A single database registered with different agents and added to a sync group under two or more agents.
Schedule the sync task carefully, based upon the amount of data to sync. If the current scheduled sync is not completed in the scheduled interval, the sync won’t start at the next scheduled time. This leads to skipping one schedule and cumulating more data at the next. The result could be a much longer delay than expected.
The sync task will check the cancellation flag after each batch is applied. If necessary, you can safely stop the sync while in progress. The next batch will be applied when you start the sync again. Please be aware that when you cancel the sync, the sync group is left in a partially synced state. Resume the sync as soon as possible.
During the on-going course of the data sync, if any of the operations below are performed, the database will go through re-provisioning:
Add or remove tables
Add or remove columns
Change the width of a column
Change the data type of a column
Unlike the sync task, the re-provision task does not check the cancelation flag. After the task starts, it will continue until complete. Any “stop sync” issued during the process will be ignored.
Since re-provision tasks are resource intensive, you may consider scaling the database to higher performance level while re-provisioning to ensure the task completes as soon as possible. You may also consider grouping the re-prevision changes in smaller units where possible and implement one unit at a time.
Removing a database from a Sync Group will start de-provisioning, it will remove all the tracking tables, stored procedures, and triggers created in the user database as well as the metadata information on the service side.
If the de-provisioning fails for any reason, an error status on the database is displayed. You can manually perform the task by using DeprovisioningUtil.exe that is installed with the local agent.
Default path: C:\Program Files (x86)\Microsoft SQL Data Sync 2.0\bin
A sync group could be set to “out-of-date” when any change within the sync group fails to propagate to the rest of the sync group for 45 days or more. When this occurs, you need to delete the sync group, go through de-provisioning and recreate it, starting from initial synchronization.