Blog Post

Modernization Best Practices and Reusable Assets Blog
6 MIN READ

Resolving ADF Copy Upsert Failures in SQL DB for XML Columns

skhandelwal's avatar
skhandelwal
Icon for Microsoft rankMicrosoft
Nov 12, 2024

How to copy incremental Db2 database table data with XML columns to Azure SQL Database table.

Introduction

Azure SQL (SQL DB / SQL MI / SQL VM) provides performance, high reliability, inbuilt disaster recovery and availability as well as advanced features such as intelligent query processing, automatic performance tuningserverless compute, and copilot for query generation. It supports different deployment choices, service levels, language SDKs, data types, and more.

Azure Data Factory (ADF) is a powerful cloud-based data integration service that facilitates seamless data migration from Db2 / Oracle / Sybase and lot of other data sources to Azure SQL. By leveraging ADF's built-in connectors, users can easily create pipelines to extract data from Db2 / Oracle / Sybase and other sources, transform it as needed, and load it into Azure SQL. The process is automated, scalable, and efficient, allowing for secure data transfer across different environments. ADF also supports incremental loading and error handling, ensuring data integrity and minimizing downtime during the migration process. This solution enables enterprises to modernize mainframe/  midrange by moving critical Db2 workloads to the Azure cloud with ease.

In many data migration or integration scenarios, the customer first uses a full data load approach to transfer the entire dataset from the source (e.g. Db2) system to the destination Azure SQL, ensuring that the target system is fully synchronized.

Once the full load is completed, the process transitions to an incremental load phase, where only the changes (new, updated, or deleted records) from the source are regularly synced to the target. This approach minimizes data transfer time and resource usage, ensuring that the target system remains up to date without the need to reload the entire dataset repeatedly. The incremental load can be performed into Azure SQL Database using Azure Data Factory's (ADF) Copy activity with the UPSERT option. This ensures that only the modified data from the source system (such as new or updated records) is transferred and integrated into the destination table. The UPSERT operation effectively updates existing records if they already exist or inserts new records if they don’t.

ADF Copy activity Upsert process does not work when the target database table has XML columns. Data copy fails with the error:

"Operation on target Db2ToSQLDB Upsert Resolution failed: ErrorCode=SqlOperationFailed,'Type=Microsoft.Data.SqlClient.SqlException, Message= The data type xml cannot be used as an operand to the UNION, INTERSECT or EXCEPT operators because it is not comparable.,Source=Framework Microsoft SqlClient Data Provider,"

 

In this technical blog, we have explained how to copy incremental Db2 database table data with XML columns to Azure SQL Database table. This process is same even when you want to perform XML data copy from Oracle / Sybase and other sources to target Azure SQL.

Test schema / data setup on Mainframe Db2 and Azure SQL

Test Schema / Data Setup on Db2:

Create table on Db2 on Mainframe with XML data type as one of the columns as shown below:

CREATE TABLE DEMO.EmployeeDetails (
    EmployeeID INT NOT NULL PRIMARY KEY,
    EmployeeName VARCHAR(100),
    Department VARCHAR(50),
    ProfileDetails XML
);

Insert test Data into this Db2 table:

INSERT INTO DEMO.EmployeeDetails (EmployeeID, EmployeeName, Department, ProfileDetails)
VALUES (1, 'John Barnes', 'IT', '<EmployeeProfile><Address>456 Oak Avenue, Riverside, CA 92507</Address><Phone>123-456-7890</Phone></EmployeeProfile>');

INSERT INTO DEMO.EmployeeDetails (EmployeeID, EmployeeName, Department, ProfileDetails)
VALUES (2, 'Mark Johnson', 'HR', '<EmployeeProfile><Address>Frederiskerg, Denmark</Address><Phone>111-222-3333</Phone></EmployeeProfile>');

INSERT INTO DEMO.EmployeeDetails (EmployeeID, EmployeeName, Department, ProfileDetails)
VALUES (3, 'Sophia Carter', 'IT', '<EmployeeProfile><Address>123 Maple Street, Springfield, IL 62704</Address><Phone>123-222-3333</Phone></EmployeeProfile>');

Test Schema on Azure SQL DB:

Create table on Azure SQL DB where data from Mainframe Db2 table created above will be copied.

CREATE SCHEMA DEMO;
CREATE TABLE DEMO.EmployeeDetails (
    EmployeeID INT NOT NULL PRIMARY KEY,
    EmployeeName VARCHAR(100),
    Department VARCHAR(50),
    ProfileDetails XML
);

Note: Db2 table schema can be converted to Azure SQL DB schema using free Microsoft SQL Server Migration Assistant (SSMA) Software.  More information about SSMA is present location : link.

Copy Data from Db2 to Azure SQL DB using ADF and Sink option Upsert enabled as mentioned below:

  1. Copy activity with Source as Db2 zOS and Sink as Azure SQL Database
  2. Source Db2 dataset using Linked Service to connect to Db2
  3. Custom SELECT query to perform selective data transfer from Db2. Here we have shown sample query to copy all data. For incremental load you will use specific WHERE clause to copy data only after specific date / time or specific id value.

Below are the Sink settings for this copy activity:

  1. Sink Azure SQL Database dataset using Linked Service to connect to Azure SQL DB.
  2. Enable Upsert to Insert rows if not present in target or Update rows if present in target for a given Key column.
  3. Key column to be used to identify if Row from Db2 is matching row from Azure SQL DB. If row is matching Update entry in Azure SQL DB otherwise Insert entry into Azure SQL DB. 

Run or Debug the pipeline

Run or debug the pipeline from ADF User Interface. As shown below (1) Copy activity is failing with error: The data type xml cannot be used as an operand to UNION, INTERSECT or EXCEPT as internally ADF is using one of these operands to perform UPSERT.

Solution to UPSERT for XML column issue:

Instead of using Write behavior Upsert on Sink side we will use Stored Procedure as Write behavior. So, while writing to Sink, Stored procedure will be called to perform write on target. Source table rows will be sent as table valued parameter to Stored procedure. In this Stored procedure we will write MERGE sql statement to perform Insert to target SQL DB if row not present or Update if row already present. This is explained in detail below:

  1. Write behavior set as Stored procedure to call Stored procedure for inserting data in Sink database.
  2. Name of Stored procedure
  3. Table Value Parameter Type to be passed to the Stored Procedure
  4. Input parameter name of the Stored procedure which will receive rows of data

Definition of each of the components mentioned above is shown below:

Table Valued Parameter

 

CREATE TYPE DEMO.EmployeeDetailsType AS TABLE (
    EmployeeID INT NOT NULL PRIMARY KEY,
    EmployeeName VARCHAR(100),
    Department VARCHAR(50),
    ProfileDetails XML
);

Stored Procedure Definition

CREATE PROCEDURE DEMO.UpsertEmployeeDetails
    @EmployeeDetailsTVP EmployeeDetailsType READONLY
AS
BEGIN
    -- Perform Upsert (Insert or Update)
    MERGE INTO DEMO.EmployeeDetails AS Target
    USING @EmployeeDetailsTVP AS Source
    ON Target.EmployeeID = Source.EmployeeID
    
    -- When match is found, update the existing record
    WHEN MATCHED THEN
        UPDATE SET
            Target.EmployeeName = Source.EmployeeName,
            Target.Department = Source.Department,
            Target.ProfileDetails = Source.ProfileDetails
    
    -- When no match is found, insert the new record
    WHEN NOT MATCHED THEN
        INSERT (EmployeeID, EmployeeName, Department, ProfileDetails)
        VALUES (Source.EmployeeID, Source.EmployeeName, Source.Department, Source.ProfileDetails);

END

The UpsertEmployeeDetails stored procedure takes a table-valued param(@EmployeeDetailsTVP) containing employee details and performs an upsert operation on the EmployeeDetails table in the DEMO schema. Using the MERGE statement, it is matching records between the source (incoming data) and the target table (existing data). If a matching record is found based on EmployeeID, it updates the existing employee details; if no match is found, it inserts a new record with the provided employee information. This allows for efficient handling of both updates and inserts in a single operation.

Execute copy activity with Write behavior as Stored procedure as explained above:

As shown above (1), now we can update / insert (i.e. Upsert) entries into target Azure SQL Database even for table with XML column.

Check number of entries added to target Azure SQL Database:

As shown above, the number of entries copied from Db2 to Azure SQL DB are 3. All these entries are Insert operations on target as these records were not present in Azure SQL DB.

Let’s now update ProfileDetails for EmployeeID 1 on source Db2.

UPDATE DEMO.EmployeeDetails SET ProfileDetails = '<EmployeeProfile><Address>204 Birch Lane, Orlando, FL 32801</Address><Phone>456-222-3333</Phone></EmployeeProfile>'
WHERE EmployeeID = 1;

Rerun the pipeline and check if EmployeeID =1 is properly updated in Azure SQL DB

Check entries in Azure SQL Database:

As shown above EmployeeID 1 ProfileDetails are updated with updated Address in Db2 i.e. 204 Birch Lane, Orlando, Fl 32801.

Summary

Customers modernizing their Db2 / Oracle / Sybase and other sources data to Azure SQL often use Azure Data Factory (ADF) for data transfer. For large datasets, an initial copy is made a few days prior, followed by incremental data copies during cutover. The ADF Copy activity's Upsert method is common for these incremental transfers, but this method fails for tables which contain XML columns. This blog explains using Stored Procedure as the Sink behavior in Copy activity to perform Upsert into Azure SQL Database. Same method can be used even when target is Azure SQL MI or Azure SQL VM.

Feedback and suggestions

If you have feedback or suggestions for improving this data migration asset, please send an email to Database Platform Engineering Team.

Updated Nov 12, 2024
Version 1.0
No CommentsBe the first to comment