Forum Discussion

Fabio Carello's avatar
Fabio Carello
Copper Contributor
Oct 19, 2023

Understand Data Factory Data Flow REST source behaviour when the response is json array

I’m trying to understand how the REST source works in a data factory Data Flow. I will start with a simple case. This is a sample response of my API (https://reqres.in/api/users)
 

 

{
    "page": 2,
    "per_page": 6,
    "total": 12,
    "total_pages": 2,
    "data": [
        {
            "id": 7,
            "email": "email address removed for privacy reasons",
            "first_name": "Michael",
            "last_name": "Lawson",
            "avatar": "https://reqres.in/img/faces/7-image.jpg"
        },
        {
            "id": 8,
            "email": "email address removed for privacy reasons",
            "first_name": "Lindsay",
            "last_name": "Ferguson",
            "avatar": "https://reqres.in/img/faces/8-image.jpg"
        },
    ],
    "support": {
        "url": "https://reqres.in/#support-heading",
        "text": "To keep ReqRes free, contributions towards server costs are appreciated!"
    }
}

 

 

As you can see, the response payload is a Json Object with a data array containing the actual data content.

I need to unroll the data field and save it in a csv file using a Data Flow. That's pretty straightforward:

  • Source Projection is defining a field named data of type complex array inside the body column: 
     

     
  • Let's now unroll by data field and use an explicit mapping: 
     

  • Just make data flow into the Sink with the auto-map feature: 
     

 Now, let's change our source, with a slightly different response (https://jsonplaceholder.typicode.com/posts)

 

 

[
  {
    "userId": 1,
    "id": 1,
    "title": "sunt aut facere repellat",
    "body": "quia et suscipit suscipit recusandae consequuntur expedita"
  },
  {
    "userId": 1,
    "id": 2,
    "title": "qui est esse",
    "body": "est rerum tempore vitae"
  },
  {
    "userId": 1,
    "id": 3,
    "title": "ea molestias",
    "body": "et iusto sed quo"
  },
  {
    "userId": 1,
    "id": 4,
    "title": "eum et est occaecati",
    "body": "ullam et saepe reiciendis voluptatem adipisci"
  }
]

 

 

In this case, the response body is just a Json Array. This particular detail, makes it impossible to reach the same goal as before.

  • The projection shows the body which is defined as a complex object and not as a complex array as I would expect.

 

  • In fact, unrolling data will be not allowed (the option is grayed out) 

     

  • Also, trying to directly write the body field into a csv sink will result in a validation error: 

     

Eventually, I found a workaround where you should use the Aggregate task combined with the collect() function. But, it is just a lot of overhead.

I still think I'm missing something, but it seems that Data Factory Data Flows don't support Json Array object as a response payload from a REST source.

No RepliesBe the first to reply

Resources