Introduction
In today’s data-driven landscape, database migration is a crucial task that requires meticulous planning and execution. Our recent project migrating data from Sybase ASE to MSSQL set out to illuminate this process, using tools like Microsoft SQL Server Migration Assistant (SSMA), Microsoft SQL Server Integration Services (SSIS) packages, and Azure Data Factory (ADF).
We carefully designed our tests to cover a range of database sizes: 1GB, 10GB, 100GB, and 500GB. Each size brought its own set of challenges and valuable insights, guiding our migration strategies for a smooth transition.
For the smaller databases (1GB and 10GB), we utilized SSMA, which demonstrated strong efficiency and reliability in handling straightforward migrations. SSMA was particularly effective in converting database schemas and moving data with minimal complication. As we scaled up to the larger datasets (100GB and 500GB), we incorporated SSIS packages alongside Azure Data Factory to address the increased complexity and ensure robust performance throughout the migration process.
Our exercises yielded important findings related to performance metrics such as data throughput, error rates during transfer, and overall execution times for each migration approach. These insights helped us refine our methodologies and underscored the necessity of selecting the right tools for each migration scenario when transitioning from Sybase to MSSQL.
Ultimately, our experience highlighted that thorough analysis is essential for identifying potential bottlenecks and optimizing workflows, enabling successful and efficient migrations across databases of all sizes. The results reassure stakeholders that, with a well-considered approach and comprehensive testing, migrations can be executed seamlessly while maintaining the integrity of all datasets involved.
The solution has two parts. Part 1 covers SSMA and SSIS. Part 2 covers Azure Data Factory (ADF), test results, performance improvement guidelines, and the conclusion.
Azure Data Factory (ADF)
Migration steps:
- Set up an Azure Data Factory instance and Integration Runtime (IR) for on-premises connectivity.
- Create pipelines with Copy data activities, mapping Sybase ASE tables to SQL Server/Azure SQL (a sample Copy activity definition is shown after this list).
- Configure parallel copy settings and staging storage (Azure Blob Storage) for large datasets (if required).
- Monitor pipeline execution and retry failed activities.
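For reference, a Copy data activity authored in the ADF UI corresponds roughly to JSON like the sketch below. This is a minimal illustration, not the exact configuration used in our tests: the dataset and linked service names (SybaseSourceDataset, AzureSqlSinkDataset, AzureBlobStaging) are hypothetical placeholders.

```json
{
  "name": "CopySybaseToAzureSql",
  "type": "Copy",
  "inputs": [ { "referenceName": "SybaseSourceDataset", "type": "DatasetReference" } ],
  "outputs": [ { "referenceName": "AzureSqlSinkDataset", "type": "DatasetReference" } ],
  "typeProperties": {
    "source": { "type": "SybaseSource" },
    "sink": { "type": "AzureSqlSink" },
    "enableStaging": true,
    "stagingSettings": {
      "linkedServiceName": { "referenceName": "AzureBlobStaging", "type": "LinkedServiceReference" },
      "path": "staging"
    }
  }
}
```

Staging (enableStaging/stagingSettings) is only needed for the larger datasets, as noted above; for the smaller runs a direct copy is typically sufficient.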
Test Results:
| Test Scenario | Data Size | Time Taken | Throughput | Threads | Notes |
|---|---|---|---|---|---|
| Single Copy | 1 GB | 40 seconds | ~25 MB/s | 1 | Single Threaded |
| Single Copy | 10 GB | 4 minutes 22 seconds | ~40 MB/s | 1 | Single Threaded |
| Parallel Copy | 100 GB | 39 minutes | ~44 MB/s | 16 | Parallel Threads |
| Scalability Test | 500 GB | 3 hours 10 minutes | ~45 MB/s | 16 | Parallel Threads; ADF scaled better due to cloud elasticity |
Key learnings:
- Cloud-native, scalable: handled 500 GB with ease
- Parallel copy and batch tuning make it faster as volume increases
- Auto-scaling prevents many typical OOM (Out of Memory) errors seen with SSMA
- Recommended for cloud targets (Azure SQL MI, Azure SQL DB)
Performance Tuning Insights:
In our setup, the pipeline consisted of a Lookup activity, a ForEach activity, and a single Copy data activity.
Lookup: In the Settings tab, select the SQL dataset as the source. Instead of reading the entire table, set “Use query” to Query and provide a query like the one shown below. The query uses a recursive Common Table Expression (CTE) to dynamically generate partition ranges across a large integer sequence (representing the total number of rows).
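The exact query used in our pipeline is not reproduced here, so the following is a minimal sketch of such a recursive partition-range CTE. The total row count (26,214,400, matching the 100 GB test table) and the range size of 2,000,000 rows are assumptions to adjust for your own table; also leave “First row only” unchecked on the Lookup so all ranges are returned.

```sql
-- Illustrative partition-range generator (assumed totals; adjust for your table).
WITH PartitionRanges AS
(
    SELECT CAST(1 AS BIGINT)       AS StartRow,
           CAST(2000000 AS BIGINT) AS EndRow
    UNION ALL
    SELECT EndRow + 1,
           CASE WHEN EndRow + 2000000 > 26214400
                THEN CAST(26214400 AS BIGINT)
                ELSE EndRow + 2000000 END
    FROM PartitionRanges
    WHERE EndRow < 26214400
)
SELECT StartRow, EndRow
FROM PartitionRanges
OPTION (MAXRECURSION 0);  -- allow more than 100 ranges if the table is split finer
```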
The query returns one row per partition range, each containing the start and end row numbers for that range.
ForEach: In the pipeline expression builder, set the Items property to an expression that references the Lookup activity's output, so that the ForEach iterates over the partition ranges returned above.
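As a sketch, assuming the Lookup activity is named Lookup1, the Items expression would be:

```
@activity('Lookup1').output.value
```

With “Sequential” unchecked and “Batch count” set to 16, the ForEach runs up to 16 copy iterations in parallel, matching the thread counts used in the test results above.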
Copy data: At the source, use a query like the one below, which lets you dynamically read different partitions of a large table (such as Table_1_100GB) in parallel or sequentially.
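A sketch of such a source query is shown below. It assumes the table has a contiguous integer key column (here called Id, a hypothetical name) and that the StartRow/EndRow values come from the current ForEach item; ADF's @{...} string interpolation substitutes them at run time.

```sql
SELECT *
FROM dbo.Table_1_100GB
WHERE Id BETWEEN @{item().StartRow} AND @{item().EndRow}
```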
On the Sink, adjust “Write batch size” and “Max concurrent connections” according to the row size and the memory assigned. Optimizing these values can improve performance and reduce overhead during data transfers. For small rows, increase writeBatchSize to reduce per-batch overhead and improve throughput; for large rows, use a smaller value to avoid memory pressure or database overload. If the source data exceeds the specified batch size, ADF automatically processes the data in multiple batches.
Additionally, adjust “Maximum data integration unit” and “Degree of copy parallelism”. A Data Integration Unit (DIU) is a measure of the power of a single unit in Azure Data Factory and Synapse pipelines, where power is a combination of CPU, memory, and network resource allocation. DIU applies only to the Azure integration runtime, not to the self-hosted integration runtime. Degree of copy parallelism determines how many parallel copies run simultaneously within the Copy activity, further optimizing the data transfer process.
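In the underlying Copy activity JSON, these knobs map to properties like the fragment below (the values shown are placeholders for illustration, not tuned recommendations):

```json
"typeProperties": {
    "source": { "type": "SybaseSource", "maxConcurrentConnections": 4 },
    "sink":   { "type": "AzureSqlSink", "writeBatchSize": 10000, "maxConcurrentConnections": 8 },
    "dataIntegrationUnits": 8,
    "parallelCopies": 16
}
```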
Cumulative Test Results
Disclaimer: All test results published herein are provided solely for reference purposes and reflect performance under ideal conditions within our controlled environment. Actual performance in the user's environment may vary significantly due to factors including, but not limited to, network speed, system bottlenecks, hardware limitations, CPU cores, memory, disk I/O, firewall configurations, and other environmental variables. In addition, multiple performance optimizations and configuration adjustments were implemented on the source and target databases to enhance migration efficiency. We strongly recommend that users conduct their own testing to determine performance under their specific conditions.
| Data Size | Best Performer | Observation |
|---|---|---|
| 1 GB | All tools performed efficiently | Minimal overhead; no significant performance difference. |
| 10 GB | SSIS | Optimized batch processing led to better performance. |
| 100 GB | SSIS and ADF | Both benefited from parallelism. |
| 500 GB | ADF and SSIS | ADF's scalability and retry mechanisms proved valuable. SSIS was equivalent with tuned data flow components. |

| Data Volume | Row Count | SSMA Time / Speed | SSIS Time / Speed | ADF Time / Speed |
|---|---|---|---|---|
| 1 GB | 262,144 | 28 sec / ⚡36 MB/s | 31 sec / ⚡33 MB/s | 40 sec / ⚡25 MB/s |
| 10 GB | 2,621,440 | 4 min 36 sec / ⚡37 MB/s | 4 min 16 sec / ⚡40 MB/s | 4 min 22 sec / ⚡40 MB/s |
| 100 GB | 26,214,400 | 44 min / ⚡38 MB/s | 38 min / ⚡44 MB/s | 39 min / ⚡44 MB/s |
| 500 GB | 26,214,400 x 5 | 3 hr 44 min / ⚡38 MB/s | 3 hr 12 min / ⚡44 MB/s | 3 hr 10 min / ⚡45 MB/s |
Performance Improvement Guidelines
- Use SSMA for Schema Conversion: SSMA is the primary tool for automating schema conversion. Customize data type mappings (e.g., Sybase DATETIME to SQL Server DATETIME2) and handle case-sensitive databases carefully.
- Leverage SSIS for Complex Data Transformations: Use SSIS for tables with non-compatible data types or when business logic requires transformation. Optimize performance with parallel tasks and appropriate buffer settings.
- Use ADF for Large-Scale or Hybrid Migrations: ADF is ideal for large datasets or migrations to Azure SQL Database. Use staging storage and parallel copy to maximize throughput. Ensure stable network connectivity for on-premises to cloud transfers.
Tips to improve ADF performance:
- Use staging areas (e.g., Azure Blob Storage) to offload source systems and speed up data transfers.
- Enable parallel copy in Copy Activity to increase throughput.
- Use the Integration Runtime closest to your data source to reduce network latency.
- Enable data partitioning on large tables to parallelize read/write operations.
- Adjust degree of parallelism to match your compute capacity.
- Use Self-Hosted IR or Azure IR with higher compute for large or complex migrations.
- Enable Auto-Scaling where supported to handle spikes efficiently.
- Monitor IR utilization to avoid under-provisioning or over-provisioning.
Please refer to the links below for more details related to ADF:
Sybase ASE to Azure SQL full and incremental data copy using ASE Transfer Table Tool and ADF
Db2 to Azure SQL fast data copy using ADF
- Migration Readiness Testing: Conduct performance testing on production-scale environments prior to the actual migration to obtain an accurate baseline of system behavior and identify potential bottlenecks under real workload conditions.
- Hybrid Approach: Combine SSMA for schema conversion, SSIS for complex data migrations, and ADF for orchestration in large-scale or cloud-based scenarios. For example, use SSMA to convert schemas, SSIS to migrate problematic tables, and ADF to orchestrate the overall pipeline.
- Validation: Post-migration, validate data integrity using checksums or row counts and test stored procedures for functional equivalence. Use SQL Server Management Studio (SSMS) for debugging. Finally, the Microsoft Database Compare Utility can be used to compare multiple source and target databases (a simple checksum query is sketched below).
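As a simple sketch of the row-count/checksum approach on the SQL Server side (the table name is a placeholder; Sybase ASE exposes different checksum functions, so source-side checksums may need to be computed differently):

```sql
-- Quick post-migration comparison: total rows plus an aggregate checksum.
-- CHECKSUM(*) can collide, so treat a match as a sanity check, not proof of equality.
SELECT COUNT_BIG(*)              AS TotalRows,
       CHECKSUM_AGG(CHECKSUM(*)) AS TableChecksum
FROM dbo.Table_1_100GB;
```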
Challenges and Mitigations
- Sybase-Specific Syntax: SSMA may fail to convert complex stored procedures with Sybase-specific T-SQL. Manually rewrite these using SQL Server T-SQL.
- LOB Data: Large Object (LOB) data types (e.g., TEXT, IMAGE) may cause truncation errors. Map to NVARCHAR(MAX) or VARBINARY(MAX) and validate data post-migration.
- Network Latency in ADF: For on-premises to Azure migrations, ensure high-bandwidth connectivity or use Azure ExpressRoute to minimize latency.
- Case Sensitivity: Sybase ASE databases may be case-sensitive, while SQL Server defaults to case-insensitive collations. Configure SQL Server collations (e.g., SQL_Latin1_General_CP1_CS_AS) to match the source behavior, as shown in the example below.
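For example, a target database can be created (or altered) with a case-sensitive collation so that string comparisons behave like the Sybase source; the database name below is a placeholder:

```sql
-- Create the target database with a case- and accent-sensitive collation
CREATE DATABASE MigrationTarget
    COLLATE SQL_Latin1_General_CP1_CS_AS;

-- Or change the default collation of an existing database
ALTER DATABASE MigrationTarget
    COLLATE SQL_Latin1_General_CP1_CS_AS;
```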
Conclusion
SSMA, SSIS, and ADF each offer unique strengths for migrating Sybase ASE to SQL Server, Azure SQL Database or Azure SQL Managed Instance. SSMA excels in schema conversion, SSIS in complex data transformations, and ADF in scalability and cloud integration. A hybrid approach, leveraging SSMA for schema conversion, SSIS for problematic data, and ADF for orchestration, often yields the best results. Evaluation shows ADF’s superior scalability for large datasets, while SSIS provides flexibility for complex migrations.
Proper planning, including schema assessment, data type mapping, and performance tuning, is critical for a successful migration. For further details refer to Microsoft’s official documentation:
SSMA for Sybase: SQL Server Migration Assistant for Sybase (SybaseToSQL) - SQL Server | Microsoft Learn
SSMA Project Settings: Project Settings (Migration) (SybaseToSQL) - SQL Server | Microsoft Learn
SSIS: SQL Server Integration Services - SQL Server Integration Services (SSIS) | Microsoft Learn
ADF: Azure Data Factory - Data Integration Service | Microsoft Azure
Feedback and suggestions
If you have feedback or suggestions for improving this data migration asset, please contact the Databases SQL Customer Success Engineering (Ninja) Team (datasqlninja@microsoft.com). Thanks for your support!
Note: For additional information about migrating various source databases to Azure, see the Azure Database Migration Guide.