ADF Adds Hierarchical & JSON Data Transformations to Mapping Data Flows

Microsoft

Sep 28, 2019

The Azure Data Factory team has released JSON and hierarchical data transformations to Mapping Data Flows. With this new feature, you can now ingest, transform, generate schemas, build hierarchies, and sink complex data types using JSON in data flows.

In the sample data flow above, I take the Movies text file in CSV format, generate a new complex type called "Movies" that contains each of the attributes of the incoming CSV file.

By using the Expression Builder (see above) inside of a Derived Column transform, I can define a new hierarchical structure that includes arrays. Genres is a multi-value field from the source CSV that contains 1:n genre categories for movies via a pipe (|) delimited field. I can use the ADF data flow expression language to split all values from that string and store each value as an array element.

Mapping Data Flows now has the ability to show metadata for complex types including hierarchies and arrays in both metadata Inspect views as well as in Data Preview.

Now that I've transformed my flat file into a hierarchical JSON format, I can sink this as a new JSON file to my Blob or lake using a JSON Sink. To create this new file, I'll map just the new complex types by selecting the top-level "Movies" hierarchy.

JSON datasets can also now become source files for Data Flows. In this case, I'll read in my new JSON dataset from the above example and perform some transformations. I want to upper case each of the genres. To do this, I can apply a single, simple expression in a Derived Column, that will upper case each element in my Genres array that I created previously.

We've posted a helper document on our ADF documentation site to help you work through more examples of working with hierarchies, arrays, and JSON datasets in ADF Mapping Data Flows here.

Published Sep 28, 2019

Version 1.0

Microsoft

Joined August 14, 2018

View Profile

Azure Data Factory Blog

Follow this blog board to get notified when there's new activity

5 Comments

kamal600
Microsoft
Aug 20, 2021
I am looking to transform json blob data to relational format using ADF as follows :
what is the way to do in ADF?
"reseller": {

                "en-ca": [

                    {

                        "MarketPlaceResellerId": "1",

                        "ResellerPrice": "",

                        "ResellerFormattedPrice": "",

                        "InventoryStatus": 3

                    }

                ],

                "en-us": [

                    {

                        "MarketPlaceResellerId": "2",

                        "ResellerPrice": "",

                        "ResellerFormattedPrice": "",

                        "InventoryStatus": 4

                    }

                ],

                "fr-ca": [

                    {

                        "MarketPlaceResellerId": "1",

                        "ResellerPrice": "",

                        "ResellerFormattedPrice": "",

                        "InventoryStatus": 3

                    }

Output :



Market

MarketPlaceResellerId

ResellerPrice

ResellerFormattedPrice

InventoryStatus

En-ca

1

3

En-us

2

4

RahulSangwan

Copper Contributor

Nov 25, 2019

Hi Mark, I have nested JSON File which contains data in array format also. While mapping I am getting error JSON Path inavalid at $[ResourceManagementProcess\\'sRunId] due to escape character. I am using copy activity. Is there any other way to convert JSON to CSV format as data flow is not option to choose.

{
    "id": "ed0e4960-d9c5-11e6-85dc-d7996816aad3",
    "context": {
        "device": {
            "type": "PC"
        },
        "custom": {
            "dimensions": [
                {
                    "TargetResourceType": "Microsoft.Compute/virtualMachines"
                },
                {
                    "ResourceManagementProcess'sRunId": "827f8aaa-ab72-437c-ba48-d8917a7336a3"
                },
                {
                    "OccurrenceTime": "1/13/2017 11:24:37 AM"
                }
            ]
        }
    }
}

Mark Kromer
Microsoft
Nov 25, 2019
In the Copy Activity, you can map these structures to targeted column names in your CSV flat file in the Sink mapping.

In Data Flows, you will create a Derived Column and build a flat structure of columns that map to the different structures.

Data Flows are enhancing the H2R capabilities in the tool, but for now, this is the mechanism to use.
Mukesh_Dutta
Copper Contributor
Nov 23, 2019
Can you help us transform the following sample JSON to csv using ADF and/or Data Flows.
{
"latest": "2019-10-10T11:09:09+00:00",
"employees": {
"5487": {
"id": "5487",
"action": "Updated",
"lastChanged": "2019-10-10T11:08:53+00:00"
},
"5491": {
"id": "5491",
"action": "Updated",
"lastChanged": "2019-10-10T11:09:01+00:00"
},
"5525": {
"id": "5525",
"action": "Updated",
"lastChanged": "2019-10-10T11:09:09+00:00"
}
}
}
MartinZurita
Copper Contributor
Sep 30, 2019
Super handy feature to avoid doing extra steps to deal with this kind of situations!! Thanks for adding this one.

Market	MarketPlaceResellerId	ResellerPrice	ResellerFormattedPrice	InventoryStatus
En-ca	1			3
En-us	2			4

Blog Post

ADF Adds Hierarchical & JSON Data Transformations to Mapping Data Flows