Forum Discussion
Understand Data Factory Data Flow REST source behaviour when the response is json array
{
"page": 2,
"per_page": 6,
"total": 12,
"total_pages": 2,
"data": [
{
"id": 7,
"email": "email address removed for privacy reasons",
"first_name": "Michael",
"last_name": "Lawson",
"avatar": "https://reqres.in/img/faces/7-image.jpg"
},
{
"id": 8,
"email": "email address removed for privacy reasons",
"first_name": "Lindsay",
"last_name": "Ferguson",
"avatar": "https://reqres.in/img/faces/8-image.jpg"
},
],
"support": {
"url": "https://reqres.in/#support-heading",
"text": "To keep ReqRes free, contributions towards server costs are appreciated!"
}
}
As you can see, the response payload is a Json Object with a data array containing the actual data content.
I need to unroll the data field and save it in a csv file using a Data Flow. That's pretty straightforward:
- Source Projection is defining a field named data of type complex array inside the body column:
- Let's now unroll by data field and use an explicit mapping:
- Just make data flow into the Sink with the auto-map feature:
Now, let's change our source, with a slightly different response (https://jsonplaceholder.typicode.com/posts)
[
{
"userId": 1,
"id": 1,
"title": "sunt aut facere repellat",
"body": "quia et suscipit suscipit recusandae consequuntur expedita"
},
{
"userId": 1,
"id": 2,
"title": "qui est esse",
"body": "est rerum tempore vitae"
},
{
"userId": 1,
"id": 3,
"title": "ea molestias",
"body": "et iusto sed quo"
},
{
"userId": 1,
"id": 4,
"title": "eum et est occaecati",
"body": "ullam et saepe reiciendis voluptatem adipisci"
}
]
In this case, the response body is just a Json Array. This particular detail, makes it impossible to reach the same goal as before.
The projection shows the body which is defined as a complex object and not as a complex array as I would expect.
In fact, unrolling data will be not allowed (the option is grayed out)
Also, trying to directly write the body field into a csv sink will result in a validation error:
Eventually, I found a workaround where you should use the Aggregate task combined with the collect() function. But, it is just a lot of overhead.
I still think I'm missing something, but it seems that Data Factory Data Flows don't support Json Array object as a response payload from a REST source.