Transform complex JSON structures from CosmosDB to SQL DB with Azure Data Factory
Published Mar 10 2020 03:54 PM 6,686 Views
Microsoft

 

  1. Start with this JSON collection in ADF based on this Orders dataset. This data has orders from Northwind with order header and order details embedded in a single document per order.
  2. You can load it into CosmosDB as the video above explains, or start with a JSON file in Blob or ADLS as your source in ADF data flow.
  3. In ADF Data Flow, add a Flatten transformation following your source and choose "Details" as the array to unroll. Leave unroll root as empty/default.
  4. This will produce a new structure called "Details". Use a Derived Column next to add a new property to that struct called "totalAmount".
  5. In the Derived Column, choose the root level "Details" field. Click on Expression Builder.
  6. In the middle Functions pane, select the icon next to "Details" to add the expression structure automatically.
  7. in the left-side "Input Schema" pane, click the plus sign on the bottom of the structure and add a new column. Call it "totalAmount"
    h2h.png
  8. The formula for totalAmount is 
    toString(round(toInteger(details.quantity)*toInteger(details.unitPrice),2),'###.##')​
  9. Now you're ready to sink your data to an Azure SQL DB table. In the Sink, set the dataset to Azure SQL DB.
  10.  Set the mapping to look like this:
     h2h2.png
  11. You can leave all of the root-level k/v fields set as they are by default. Choose the individual properties from each structure that you wish to map to a database table column.

The full online documentation for this ADF transformation can be found here.

 

You can download this sample pipeline demo using the ADF Pipeline Template here. Install it in your factory by going to New > Pipeline from template > import template.

 

 

2 Comments
Copper Contributor

The current scenarios allow you to convert Arrays or Complex Types (JSON) into single columns (as mentioned in this post) or convert columns into JSON.  However, is there a way to convert the JSON back to the string representation for storage into a SQL Column?

Example:

Source:

{
    "id": 1234,
    "data": {
        "name": "Acme, Inc.",
        "address": {
            "city": "Seattle",
            "state": "WA",
            "zip": "98195"
        }
    },
    "additionalData": {
        "color": "red",
        "shape": "square",
        "number": 4
    }
}

In this case, I don't want to store color, shape, and number as distinct columns, but as text: { "color": "red", "shape": "square", "number": 4 }.

 

Is this possible currently, or something coming in the future?

Copper Contributor

I decided to move this to a conversation, but I couldn't delete my comment.  Sorry for the confusion.

Version history
Last update:
‎Mar 10 2020 03:54 PM
Updated by: