In this post we address a common scenario we encounter when working with customers: the need to migrate databases between the various Azure SQL offerings, as well as to reverse migrate databases back to on-premises SQL Servers running in the customer's environment. There can be several reasons for undertaking such migrations, such as moving from an existing Azure SQL Database implementation to Azure SQL Managed Instance because certain native functionality is required, or having to prove reverse migration capability to bring data back on-premises for compliance or regulatory reasons. In this post we cover the steps you can take to accomplish these patterns using an open-source tool called SmartBulkCopy and a Microsoft tool called SqlPackage.
All the scripts mentioned in this blog post can be downloaded from here as a single .zip file; a description and explanation of each individual script is included below.
All the prerequisites below must be met to successfully migrate the source database(s) to or from Azure SQL DB, Azure SQL MI, an Azure IaaS VM, or SQL Server on-premises.
To successfully execute the included scripts the following tools need to be installed on a machine where the migration will be initiated:
The sections below describe the logical steps involved in migrating, or reverse migrating, a database between any SQL Server source and target. The migration is logical, meaning it includes both schema and data migration; it is used where native backup/restore cannot be leveraged, either because it is not supported (Azure SQL DB) or because the backups are incompatible with the destination (e.g., SQL MI backups cannot be restored to SQL Server on a VM or on-premises).
Migrations within the same Azure SQL service, i.e., between Azure SQL DB logical servers or between Azure SQL Managed Instances, can leverage the native backup/restore feature that service supports and are therefore not in scope for this blog.
The first step is schema migration, which is required before the data copy activity is initiated. The schema migration can be completed using SqlPackage, mssql-scripter, or any similar tool. Usually, the most efficient and simplest way to perform schema migration is with SqlPackage (DacPac). However, mssql-scripter can be very useful for schema generation and deployment at a larger scale, where multiple databases are being moved.
SqlPackage must be preinstalled on the machine where the schema extraction and deployment are going to be initiated.
To extract the schema run the below command. To find out more about the SqlPackage extract parameters visit the site here.
| Parameter name | Description |
| --- | --- |
| Action:Extract | Extracts the database schema. |
| ssn | Source SQL Server name. |
| sdn | Source SQL Server database name. |
| tf | Target file name (.dacpac). |
| su | Source user. The login used to connect to the source SQL database. |
| sp | Source user's password. |

sqlpackage.exe /Action:Extract /MaxParallelism:8 /ssn:source_sql.database.windows.net /sdn:source_db /su:XXXXX /sp:XXXXX /tf:"C:\TEMP\source_db.dacpac"
Before the schema is deployed, an empty database must first be created on the destination server. To deploy the schema, run the command below. To find out more about the SqlPackage publish parameters, visit the site here.
| Parameter name | Description |
| --- | --- |
| Action:Publish | Publishes the database schema. |
| tsn | Target SQL Server name. |
| tdn | Target SQL Server database name. |
| sf | Source file name (.dacpac). |
| tu | Target user. The login used to connect to the destination SQL database. |
| tp | Target user's password. |

SqlPackage.exe /Action:Publish /MaxParallelism:8 /tsn:target_sql.database.windows.net /tdn:target_db /tu:XXXX /tp:XXXXXX /sf:"C:\TEMP\source_db.dacpac" /p:AllowIncompatiblePlatform=true
Mssql-scripter is useful if large numbers of databases must be scripted out; it can be a good alternative for automating schema generation and deployment in such scenarios.
The following parameters are used in the script (schema_generation.ps1); the script for this step is available here.
| Parameter name | Description |
| --- | --- |
| $sourceSQL | The source SQL Server. |
| $sourceUser | The SQL user account used to connect to the source SQL Server. |
| $sourcePasswordPlain | The password for the $sourceUser account. |
| $databasename | An array/list of databases to be scripted out. |
| $fileFolder | Temporary file folder where all deployment scripts are generated. |
| $file | PowerShell file generated by the script; it stores the mssql-scripter commands. Do not change the name of the file. |
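The generation loop that schema_generation.ps1 automates can be illustrated with a short shell sketch. The server name, login, and database list below are placeholders (not values from the attached scripts), and the mssql-scripter commands are only printed into a file, not executed:

```shell
#!/bin/sh
# Sketch only: the server, login and database list are placeholders,
# not values taken from the attached scripts.
SOURCE_SQL="source_sql.contoso.local"
SOURCE_USER="migration_user"
FILE_FOLDER="${TMPDIR:-/tmp}/schema_scripts"
CMD_FILE="$FILE_FOLDER/generate_schemas.sh"

mkdir -p "$FILE_FOLDER"
: > "$CMD_FILE"

# One mssql-scripter call per database; --schema-only skips the table data.
for DB in db1 db2 db3; do
  echo "mssql-scripter -S $SOURCE_SQL -d $DB -U $SOURCE_USER --schema-only -f $FILE_FOLDER/$DB.sql" >> "$CMD_FILE"
done

cat "$CMD_FILE"
```

Writing the commands to a file first makes it easy to review what will run before actually scripting out dozens of databases.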
The following parameters are used in the script (schema_deployment.ps1) and the script is available here
| Parameter name | Description |
| --- | --- |
| $targetSQLFqdn | The target SQL Server name. For Azure SQL DB or MI, use the FQDN. |
| $targetLogin | The Azure SQL DB user used to connect to Azure SQL DB and deploy the database schema. |
| $targetPasswordPlain | The password for the $targetLogin user. |
| $databasename | An array/list of databases to be deployed. |
| $fileFolder | Temporary file folder where all deployment scripts are stored. |
| $sqlcmdFolder | The location of the sqlcmd utility, e.g., C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\170\Tools\Binn |
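The deployment side, automated by schema_deployment.ps1, can be sketched the same way: replay each generated schema file against the pre-created empty target database with sqlcmd. The names below are placeholders and the commands are only printed, not executed:

```shell
#!/bin/sh
# Sketch only: placeholder names; the deployment commands are printed, not run.
TARGET_SQL="target_sql.database.windows.net"
TARGET_LOGIN="deploy_user"
FILE_FOLDER="${TMPDIR:-/tmp}/schema_scripts"
PLAN_FILE="$FILE_FOLDER/deploy_plan.txt"

mkdir -p "$FILE_FOLDER"
: > "$PLAN_FILE"

# sqlcmd -i replays the scripted schema file against the empty target database.
for DB in db1 db2 db3; do
  echo "sqlcmd -S $TARGET_SQL -d $DB -U $TARGET_LOGIN -i $FILE_FOLDER/$DB.sql" >> "$PLAN_FILE"
done

cat "$PLAN_FILE"
```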
The target database must be initialized first, i.e., an empty database created, and then the schema deployed. The data copy procedure requires that all foreign key constraints are dropped before the copy activity is initiated. Additionally, it is recommended to drop all secondary indexes and load the data into heap or clustered index structures. The attached SQL scripts simplify this clean-up process and, as a final step, recreate the dropped objects after the data migration has completed.
PLEASE NOTE:
The scripts to create the objects mentioned below can be found here
The following stored procedures must be deployed in the target database.
Do not deploy the following stored procedures in the source database.
Once all 4 procedures are deployed/created in the target database execute the following command in the target database if the secondary indexes should be dropped:
exec dbo.SmartBulkCopyMigrationMaintenance @execType = N'PREMIG', @secondaryIndexes=1
If the secondary (non-clustered) indexes should be left untouched, execute the following command in the target database instead. Secondary indexes may slow down the copy activity, but for very large tables keeping them in place may still reduce the overall migration time.
exec dbo.SmartBulkCopyMigrationMaintenance @execType = N'PREMIG', @secondaryIndexes=0
Please note the following tables are created by the SmartBulkCopyMigrationMaintenance stored procedure:
Before proceeding, verify that there are no rows in the dbo.SmartBulkCopyErrorCmds table. If any foreign key or secondary index could not be removed, the corresponding command is recorded in that table.
You may rerun the procedure (exec dbo.SmartBulkCopyMigrationMaintenance @execType = N'PREMIG'); only the errored commands will be re-executed, while successfully executed commands are skipped in subsequent executions.
As the migration is logical, i.e., it involves a data copy activity, the downtime starts when the data copy is initiated. It is highly recommended to set the source database to read-only during the copy activity to avoid any data discrepancies during the data movement.
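Setting the source database to read-only is a single T-SQL statement. The sketch below prints the sqlcmd calls that would freeze, and later unfreeze, the source; the server and database names are placeholders:

```shell
#!/bin/sh
# Sketch only: placeholder server/database names; commands are printed, not run.
SRC="source_sql.contoso.local"
DB="source_db"

# WITH ROLLBACK IMMEDIATE disconnects sessions holding open transactions.
FREEZE="ALTER DATABASE [$DB] SET READ_ONLY WITH ROLLBACK IMMEDIATE;"
THAW="ALTER DATABASE [$DB] SET READ_WRITE;"

echo "sqlcmd -S $SRC -d master -Q \"$FREEZE\""
echo "sqlcmd -S $SRC -d master -Q \"$THAW\""
```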
This step leverages the SmartBulkCopy program to automate the migration. Follow these steps to initiate the data migration:
An example of the config file can be seen here:
{
  "source": {
    "connection-string": "Server=source_server.XXXX; Initial Catalog=your_database; Connection Timeout=90; User ID=XXXXX; Password=XXXXXXX"
  },
  "destination": {
    "connection-string": "Server=destination_server.XXXX; Initial Catalog=your_database; Connection Timeout=90; User ID=XXXXX; Password=XXXXX"
  },
  "tables": {
    "include": [ "*" ],
    "exclude": []
  },
  "options": {
    "tasks": 24,
    "logical-partitions": "auto",
    "batch-size": 1000000,
    "truncate-tables": false,
    "safe-check": "none",
    "stop-if": {
      "secondary-indexes": false,
      "temporal-table": true
    },
    "retry-connection": {
      "delay-increment": 10,
      "max-attempt": 5
    }
  }
}
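Since a malformed config file is a common cause of failed runs, it can help to check that the file parses as JSON before launching SmartBulkCopy. The sketch below assumes python3 is available for the validity check; the file path and connection strings are placeholders:

```shell
#!/bin/sh
# Sketch only: placeholder path and connection strings; assumes python3 is
# available to check that the file parses as JSON.
CONFIG="${TMPDIR:-/tmp}/smartbulkcopy.config"

cat > "$CONFIG" <<'EOF'
{
  "source":      { "connection-string": "Server=src; Initial Catalog=db; User ID=u; Password=p" },
  "destination": { "connection-string": "Server=dst; Initial Catalog=db; User ID=u; Password=p" },
  "tables":  { "include": [ "*" ], "exclude": [] },
  "options": { "tasks": 24, "batch-size": 1000000, "truncate-tables": false }
}
EOF

if python3 -m json.tool "$CONFIG" > /dev/null 2>&1; then
  RESULT="config OK"
  # From a cloned SmartBulkCopy repository one would then launch the copy,
  # e.g.: dotnet run -- "$CONFIG"
else
  RESULT="config is not valid JSON"
fi
echo "$RESULT"
```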
After the data copy activity is completed, the last remaining part is to recreate all objects (secondary indexes and foreign keys) dropped earlier. To do so execute the following command in the target database:
exec dbo.SmartBulkCopyMigrationMaintenance @execType = N'POSTMIG'
Depending on the data size, number of indexes and target database performance tier the procedure may take some time to complete.
Once the post-migration procedure has finished, it is recommended to verify that all commands completed successfully and that no error command was recorded in the dbo.SmartBulkCopyErrorCmds table.
If anything is recorded in dbo.SmartBulkCopyErrorCmds, rerun the stored procedure (exec dbo.SmartBulkCopyMigrationMaintenance @execType = N'POSTMIG'); it will try to (re)create only the errored objects.
The post-migration validation step ensures that all the data has been migrated and that the source and target schemas are the same. The basic validation relies purely on row and index counts, which is sufficient if access to the source database is restricted during the migration and it is guaranteed that no row has been updated.
The thorough validation does more advanced validation but is also more time consuming.
This PowerShell script (available here) does a basic comparison between the source and target database. The following properties are compared:
The script highlights a table in red if any of its properties (row count, index count, unique index count, or foreign key count) does not match between the source and target.
The following parameters are used in the script (BasicValidation.ps1) and it can be downloaded here.
| Parameter name | Description |
| --- | --- |
| $sourceSqlServer | The source SQL Server. |
| $sourceDatabaseName | The source SQL Server database. |
| $sourceUser | Source user. The login used to connect to the source SQL database. |
| $sourcePw | The password for the $sourceUser account. |
| $targetSqlServer | The target SQL Server. |
| $targetDatabaseName | The target SQL Server database. |
| $targetUser | Target user. The login used to connect to the target SQL database. |
| $targetPw | The password for the $targetUser account. |
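The kind of comparison BasicValidation.ps1 performs can be illustrated with plain text tooling: given per-table row counts exported from the source and target (e.g., via sqlcmd), any table whose counts differ shows up immediately. The counts below are fabricated for illustration only:

```shell
#!/bin/sh
# Sketch only: fabricated per-table row counts standing in for values that
# would be exported from the source and target databases.
SRC_COUNTS="${TMPDIR:-/tmp}/source_counts.txt"
TGT_COUNTS="${TMPDIR:-/tmp}/target_counts.txt"

printf 'dbo.Orders 120000\ndbo.Customers 5000\n' > "$SRC_COUNTS"
printf 'dbo.Orders 120000\ndbo.Customers 4999\n' > "$TGT_COUNTS"

# Tables whose counts differ between source and target:
MISMATCHES=$(diff "$SRC_COUNTS" "$TGT_COUNTS" | grep '^<' | cut -d' ' -f2)
echo "$MISMATCHES"
```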
The Database Compare application is a Windows console application, written in C#, that allows comparison of SQL Server databases to SQL Server, or of Oracle, Db2, MySQL, PostgreSQL, Teradata, or Netezza databases to SQL Server. For each table processed, the application builds SQL select statements for both source and target platforms that combine the columns in a row of a table into a single string, which the query then hashes using an MD5 hash function. The MD5 hash algorithm was chosen because it is faster than newer hashes but still has an excellent ability to detect differences in strings.
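The row-hashing idea can be demonstrated in isolation: concatenate the column values of a row into a single string and hash it; a source row and a target row match exactly when their hashes match. This is only a sketch of the principle with fabricated values, not the Database Compare tool itself:

```shell
#!/bin/sh
# Sketch only: fabricated row values. Columns are joined with a separator so
# that ("ab","c") and ("a","bc") do not produce the same concatenated string.
ROW_SOURCE="42|Contoso|2023-01-15"
ROW_TARGET="42|Contoso|2023-01-15"

HASH_SOURCE=$(printf '%s' "$ROW_SOURCE" | md5sum | cut -d' ' -f1)
HASH_TARGET=$(printf '%s' "$ROW_TARGET" | md5sum | cut -d' ' -f1)

if [ "$HASH_SOURCE" = "$HASH_TARGET" ]; then
  echo "row matches"
else
  echo "row differs"
fi
```

Comparing one short hash per row instead of every column value is what makes this approach practical across different database platforms.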
For detailed information on the tool and use cases as well as download link please see here
The time needed to complete the migration depends on the source and target SQL Server/database resources as well as the network throughput. It is recommended to test the migration against a copy of the database to estimate the time needed and to ensure the migration can be completed without any problems.
The following generic recommendations can be used to ensure optimal migration performance:
SELECT * FROM sys.tables
WHERE lock_escalation_desc <> N'TABLE' AND lock_escalation_desc <> N'AUTO'
We hope that this post has helped you and described scenarios and options for using the solution to migrate your databases successfully. If you have feedback or suggestions for improving this data migration asset, please contact the Azure Databases SQL Customer Success Engineering Team. Thanks for your support!