SOLVED

Power Query Editor unpack JSON

Copper Contributor

Hello.  I have a spreadsheet with 5 columns.  One of those columns contains a json array (the array also has nested arrays).  Is it possible to use Power Query Editor to parse the json array in each row?  So that  each row has the four original columns plus the new columns for each value in the Json array and sub arrays?  While the data in each JSON array may be  unique, the structure is always the same, though the number of elements/sub arrays fluctuates.   I have uploaded an example spread sheet.

6 Replies

@DreamDelerium Yes you can. Connect to the table with PQ, Select the column with the JSON code. Tra,nsform, Parse, JSON. Expande fields to new rows. Expend records, OK. Repeat expanding fields and records until all columns contain data only. No M-coding required, just pushing buttons in the correct order.

 

See attached.

@Riny_van_EekelenThank you!  Your steps have gotten me part way there.  But, I can't seem to follow your steps on the sub arrays.  These are the steps I have taken:

1.  Open Excel and Go To Data Tab

2.  Select Get Data->From File->From Workbook (and select the workbook with the raw data)

3.  In the Navigator window, select the Excel tab with the data and click the "Transform Data" button.  This opens the Power Query Editor

4.  Select the column with JSON data and right click, selecting Transform->Json.  This converts each record as a "list"

5.  Click the double arrow icon on the top right corner of the column header and select "Expand to new rows".  This converts the fields from "list" to "Record"

6.  Click the double arrow icon again, which opens a dialog with all the fields in the json array, and make sure all columns are selected an click OK.

7.  This expands all the json data into new columns.  Find the column with the second layer of json data and select it. 

8.  Each item in this column says "List".  But, when I right click, there is no "Transform" option. 

 

So, how do I do this for the second layer of JSON data?

 

 

I think I found the issue. On occasion, the sub array will have sublist:false instead of sublist:[]. Is there a way to skip the json formatting process for the sub array if it is "False"? Or, is this something I would need to do before bringing into Excel PQ?

@Riny_van_Eekelen  To make things a little easier to reproduce the error, I have attached a more concise example of the issue.  In this example, the field "sublist" can be:

 

sublist:[]
sublist:[{"option_1":1, "option_2: "aaa", option_3:""}]
sublist:false

In this example, if the sublist is "false" then it does not appear that I can parse the entire column.  Capture_ExcelEx.PNG

best response confirmed by allyreckerman (Microsoft)
Solution

I found this worked.  I first created a custom column.  Then added this as the custom forumla

 

 

=if [response_json.sublist] <> false then
Table.ToRecords(Table.FromRecords([response_json.sublist]))
else
Table.ToRecords(
Table.FromRecords({
[option_1 = "null", option_2= "null", option_3= "null"]
})
)

 

@DreamDelerium Thanks for taking the time to post your steps here, it's helpful. There doesn't seem to be much out there on parsing messy json data!

1 best response

Accepted Solutions
best response confirmed by allyreckerman (Microsoft)
Solution

I found this worked.  I first created a custom column.  Then added this as the custom forumla

 

 

=if [response_json.sublist] <> false then
Table.ToRecords(Table.FromRecords([response_json.sublist]))
else
Table.ToRecords(
Table.FromRecords({
[option_1 = "null", option_2= "null", option_3= "null"]
})
)

 

View solution in original post