Copying Azure Table Storage using Azure Data Factory (ADF)
Published Jan 24 2022

Background:

Many times we have a requirement to migrate Azure Storage tables from one storage account to another. For a small number of tables, simpler options such as copying them manually with a tool like Azure Storage Explorer may suffice.

 

For storage accounts containing a large number of tables, we can instead use Azure Data Factory (ADF). This blog demonstrates how we can use ADF to copy storage tables between two storage accounts.

 

1. Storage Accounts:

In this blog, we will be moving storage tables from a source storage account to a destination storage account. The source storage account currently has 10 tables:

[Screenshot: the 10 tables in the source storage account]

 

These tables are populated with some sample data:

[Screenshot: sample data in the source tables]

 

In the destination storage account, there are no tables yet:

[Screenshot: the empty destination storage account]

 

2. SAS Tokens

As we need to make REST API calls to both the storage accounts, we will be using SAS tokens to authorize the calls. 

 

We will create an account SAS token for each storage account (an account SAS rather than a service SAS, since listing and creating tables are account-level operations, whereas a service SAS is scoped to a single table). You can fine-tune the permissions if required. I will be creating the SAS tokens with the following permissions for both accounts:

[Screenshot: SAS token permissions]
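For reference, the generated token is a query string that gets appended to the table endpoint. It looks roughly like the following (the values here are illustrative, not real):

  ?sv=2021-06-08&ss=t&srt=sco&sp=rwdlacu&se=2022-02-01T00:00:00Z&sig=REDACTED

Note the leading '?': the pipeline expressions used later concatenate the token directly onto the URL, so keep the '?' when saving the token as a parameter value.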

 

3. Creating ADF

We will now create an Azure Data Factory (ADF) resource. If you are new to ADF, please refer to this guide on how to create one: https://docs.microsoft.com/en-us/azure/data-factory/quickstart-create-data-factory-portal#create-a-d...

 

Once created, we can access the Azure Data Factory UI using the link on the Overview page, as shown below:

[Screenshot: the Azure Data Factory Overview page with the link to the ADF UI]

 

4. Authoring the pipelines

We will now create the pipeline that we will use to set up the copy process.

[Screenshot: creating a new pipeline]

 

We will add two parameters to this pipeline to hold our SAS tokens. The activities below reference these parameters, so once the tokens expire, we only need to update the parameter values.

[Screenshot: the two SAS token parameters on the pipeline]
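As a minimal sketch, the two parameters appear in the pipeline JSON as shown below. The name sourceSAS matches the expression used in section 5.1; destinationSAS is the name assumed here for the destination account's token:

  "parameters": {
      "sourceSAS": { "type": "String" },
      "destinationSAS": { "type": "String" }
  }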

 

5. Creating the activities

Now that the pipeline is created, we will add the activities for the copy process. The flow for this process is as follows:

  1. Get the list of all tables from the source storage account.
  2. Loop through the tables and, for each table:
    a. Create the table in the destination storage account.
    b. Copy the data from source to destination.

 

5.1 List tables from source storage account

We will use a Web activity for this operation. Using this activity, we will make a REST API call to the storage service, authorized with the SAS token, to fetch the list of tables. The API being used is Query Tables.

 

In the URL, we provide the source storage endpoint as per the API specification, concatenated with the sourceSAS parameter we saved earlier:

@concat('https://SOURCE.table.core.windows.net/Tables', pipeline().parameters.sourceSAS)

To learn more about expressions and functions in ADF, please refer to this documentation: https://docs.microsoft.com/en-us/azure/data-factory/control-flow-expression-language-functions
 
As we need the response in JSON, we pass the 'Accept' header with the value 'application/json'.

[Screenshot: Web activity settings for the Query Tables call]
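With that header set, the Query Tables response body has the following shape (the table names here are hypothetical; depending on the OData metadata level requested, an odata.metadata property may also be present):

  {
      "value": [
          { "TableName": "table01" },
          { "TableName": "table02" }
      ]
  }

The 'value' array is what we will iterate over in the next step.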

 

Once configured, we can publish the changes.
[Screenshot: publishing the pipeline changes]

 

5.2 Reading the output and looping through tables

Add a ForEach activity, which will be used to loop through the list of tables received from the REST API.

[Screenshot: the ForEach activity added to the pipeline]

 

In the Items setting, we will use the output of the previous activity for this loop. We append '.value', as this is the property that contains the array of tables returned by the REST API.
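Assuming the Web activity above kept its default name 'Web1', the Items expression looks like this:

  @activity('Web1').output.value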

 

5.3 Creating Tables in destination account

The rest of the activities will be created inside the ForEach activity.

[Screenshot: the activities inside the ForEach activity]

 

We will use another Web activity to call the Create Table REST API on the destination storage account. Configure the Web activity according to the Create Table REST API specification.

[Screenshot: Web activity settings for the Create Table call]

 

We are passing the table name in the Body dynamically, using the 'item' object from the ForEach activity.
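As a sketch, the settings for this Web activity look like the following; destinationSAS is the assumed pipeline parameter holding the destination token, and item().TableName comes from the Query Tables response we are looping over:

  URL:     @concat('https://DESTINATION.table.core.windows.net/Tables', pipeline().parameters.destinationSAS)
  Method:  POST
  Headers: Accept: application/json;odata=nometadata
           Content-Type: application/json
  Body:    @concat('{"TableName": "', item().TableName, '"}')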

 

5.4 Copy Data

The next item is the 'Copy data' activity, which will perform the data copy between the two storage accounts. The Copy data activity requires a dataset and a linked service to connect to each storage account. For the datasets, click 'New' for both source and sink (destination), select 'Azure Table Storage' to continue, and provide a relevant name.

 

[Screenshot: creating the Azure Table Storage datasets]

 

You can follow the steps shared in this document to create the linked services for both of your storage accounts: https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-table-storage?tabs=data-factory#...

 

Once you have created the datasets and linked services for both the source and sink accounts, we need to configure the datasets to receive the table names dynamically. Use the 'item' object of the ForEach activity to get the table name for both the source and destination (sink) datasets, as shown below:

Create a parameter in both datasets:

[Screenshot: creating a parameter on the dataset]

Use the parameter as the table name in the dataset:

[Screenshot: the dataset table name set from the parameter]

 

Pass the current item to the dataset from the Copy data activity:

[Screenshot: passing the current item to the dataset from the Copy data activity]
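Putting those three steps together, and assuming the dataset parameter is named TableName, the dynamic content is:

  Dataset parameter:                    TableName (type String)
  Table name field in the dataset:      @dataset().TableName
  Dataset property in Copy data:        @item().TableName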

 

Tip: You can further optimize your ADF pipeline to use storage account names, SAS tokens, etc. dynamically using parameters and variables. To learn more, refer to this documentation: https://azure.microsoft.com/en-in/resources/azure-data-factory-passing-parameters/

 

6. Run the pipeline

We can now run the pipeline to start the copy process. How long the run takes to complete depends on the number of tables and the amount of data in the source storage account.

[Screenshot: the pipeline run in progress]

 

We can see that the tables have now been created in the destination storage account:

[Screenshot: the tables created in the destination storage account]

 

And the data is copied too.

[Screenshot: the copied data in the destination tables]

 

Tip: You can also add a Delete Table API request after the Copy data activity to delete each table from the source storage account, as sketched below.
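A sketch of that call, using the Delete Table REST API and the same sourceSAS parameter (note the doubled single quotes, which is how ADF escapes a quote inside a string literal):

  URL:    @concat('https://SOURCE.table.core.windows.net/Tables(''', item().TableName, ''')', pipeline().parameters.sourceSAS)
  Method: DELETE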

 

Update:

To copy the source tables' PartitionKey and RowKey columns, use the below configuration in the Copy data activity:

 

[Screenshot: Copy data activity configuration for PartitionKey and RowKey]
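In pipeline JSON terms, the screenshot above corresponds roughly to the following Azure Table sink settings (these property names come from the Azure Table Storage connector documentation), combined with an explicit mapping of the PartitionKey and RowKey columns from source to sink:

  "sink": {
      "type": "AzureTableSink",
      "azureTablePartitionKeyName": "PartitionKey",
      "azureTableRowKeyName": "RowKey"
  }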

 

 

 
