Forum Discussion

bones_clarke
Copper Contributor
Apr 01, 2024

Specific Use Case: REST API Pagination in Data Factory

Hello, I have a specific use case: I am ingesting data from a REST API endpoint and am struggling to use pagination rules in the source instead of an Until activity. I got the Until activity to work and it cycles through my pages, but it creates a new document per page, whereas I want all the information consolidated into one file/blob.

 

For my REST API endpoint, I have a base URL that doesn't change and a relative URL that takes a start page and a count. The start page is the page the call starts on, and the count is the number of records each call returns. I have set these up as parameters in the source with start page = 1 and count = 400. For this particular call, the Until activity produces 19 separate pages of 400 records by adding 1 to the start page on each call until a boolean field called hasMoreResults in the response equals false. Below is the JSON response from the API endpoint, where you can see that "hasMoreResults" is true and that the "results" array holds all the returned records:

 

 

{
    "totalResults": 7847,
    "hasMoreResults": true,
    "startIndex": 1,
    "itemsPerPage": 10,
    "results": [],
    "facets": []
}

 

The startIndex value in the response corresponds to the start page parameter.

 

With this, I am looking for advice on how to run this query using pagination rules so that all 7847 results end up in one file. I have tried many different things and feel like I need two pagination rules: an AbsoluteUrl rule that adds 1 to the start page on every call so it cycles through the pages, and an EndCondition rule that stops when hasMoreResults = false. Any help with this would be greatly appreciated!
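To make the desired behavior concrete, here is a minimal sketch in Python of the loop I would like the pagination rules to reproduce: start at page 1, add 1 per call, stop when hasMoreResults is false, and keep every page's "results" in one collection. This is not Data Factory configuration. The names fetch_all and make_fake_api are hypothetical, and the fake API only imitates the response shape shown above so the sketch can run without a real endpoint.

```python
from typing import Callable


def fetch_all(fetch_page: Callable[[int, int], dict],
              start_page: int = 1, count: int = 400) -> list:
    """Collect the 'results' of every page into a single list."""
    merged = []
    page = start_page
    while True:
        response = fetch_page(page, count)
        merged.extend(response["results"])
        # End condition: stop as soon as the API reports no further pages.
        if not response["hasMoreResults"]:
            break
        page += 1  # the "add 1 to the start page" rule
    return merged


def make_fake_api(total: int) -> Callable[[int, int], dict]:
    """Hypothetical stand-in for the real endpoint (assumption, for illustration)."""
    records = list(range(total))

    def fetch_page(start_page: int, count: int) -> dict:
        begin = (start_page - 1) * count
        return {
            "totalResults": total,
            "hasMoreResults": begin + count < total,
            "startIndex": start_page,
            "itemsPerPage": count,
            "results": records[begin:begin + count],
        }

    return fetch_page


if __name__ == "__main__":
    all_records = fetch_all(make_fake_api(25), start_page=1, count=10)
    print(len(all_records))
```

The point of the sketch is that the page counter and the end condition work together in one request loop, and the output is written once at the end rather than once per page.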

One more detail: to make the Until activity work, I store the "hasMoreResults" boolean value in a cached variable. The statement below is the expression I use in the Until activity, but I can't seem to get the same thing working as a pagination end condition:

 

 

"value": "@not(activity('Org Data flow').output.runStatus.output.sinkHasMoreResults.value[0].hasMoreResults)"

 

These are the current pagination rules that don't seem to work:

 

 
