Forum Discussion

Black1500
Copper Contributor
Jul 03, 2024

Fetch all IDs from API

I am currently working on fetching data from an API endpoint structured as follows: http://api.com/api/placements/{placementsid}. This endpoint allows me to retrieve data for a specific placement by using a unique placementsid.

The challenge I am facing is:

When I specify a single placementsid in the URL, the API returns data for only that specific placement. However, my objective is to retrieve data for all placements available in the system.

Questions:

How can I modify my approach to fetch all placement IDs from this API endpoint?

Is there a specific method or endpoint that allows me to get a list of all placementsid values? Once I have all the placement IDs, how do I structure my pipeline to iterate over each ID and fetch the corresponding data?

What should I specify in the "Relative URL" field of my data source configuration to ensure I can dynamically fetch data for each placement ID? What are the best practices for aggregating or storing the fetched data in Azure Data Factory (ADF)?

Should I use a particular activity or sequence of activities in ADF to handle this process efficiently?

  • zack-sponaugle
    Copper Contributor

    Hello Black1500,

     

    How can I modify my approach to fetch all placement IDs from this API endpoint?

    Is there a specific method or endpoint that allows me to get a list of all placementsid values? Once I have all the placement IDs, how do I structure my pipeline to iterate over each ID and fetch the corresponding data?

     

    Every API is different, but I think what you are looking for is pagination. Many APIs return data in batches (pages) rather than all at once. ADF has built-in support for this, which you can read about here:

    https://learn.microsoft.com/en-us/azure/data-factory/connector-rest?tabs=data-factory#pagination-support

     

    Does the API documentation mention anything about "limit", "offset", or "lastpage" request parameters? These are often used with pagination.
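
If the API does accept such parameters, the loop that pagination performs can be sketched roughly as follows. This is a minimal Python illustration, not ADF configuration: the `limit` and `offset` parameter names, the `fetch_page` callable, and the in-memory stand-in data are all assumptions, since the thread does not show what the real API supports.

```python
from typing import Callable, Iterator, List

# Hypothetical page fetcher: takes (limit, offset) and returns a list of records.
# In practice this would wrap an HTTP GET against the API's list endpoint.
FetchPage = Callable[[int, int], List[dict]]


def fetch_all(fetch_page: FetchPage, limit: int = 100) -> Iterator[dict]:
    """Yield every record, advancing the offset until a short or empty page."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:  # a short page means there is nothing left
            break
        offset += limit


# Stand-in for the real API so the sketch runs without network access.
_FAKE_PLACEMENTS = [{"placementsid": i} for i in range(1, 8)]


def fake_fetch_page(limit: int, offset: int) -> List[dict]:
    return _FAKE_PLACEMENTS[offset:offset + limit]


all_placements = list(fetch_all(fake_fetch_page, limit=3))
print(len(all_placements))
```

Stopping on a short page is only one possible end condition. Some APIs instead return a "next" link or a total count, so the check would need to match whatever the documentation describes.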

     

    What should I specify in the "Relative URL" field of my data source configuration to ensure I can dynamically fetch data for each placement ID? What are the best practices for aggregating or storing the fetched data in Azure Data Factory (ADF)?

     

    Should I use a particular activity or sequence of activities in ADF to handle this process efficiently?

     

    Without knowing the API I cannot say for sure. But if it's a fairly standard API, hopefully you will be able to leverage the built-in pagination support, which handles most of the "heavy lifting" behind the scenes. Otherwise, you can get creative using an Until loop together with Lookup and Set Variable activities.
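
Once a list of IDs is available, the per-ID step amounts to building one relative URL per placement and fetching it. Here is a minimal Python sketch of that logic, assuming the base URL is `http://api.com/api/` (so the relative part is `placements/{placementsid}`). The `get_json` callable and the sample IDs are hypothetical stand-ins. In ADF, the equivalent would typically be a loop over the ID list with a parameterized relative URL.

```python
from typing import Callable, Dict, Iterable
from urllib.parse import quote


def relative_url(placement_id) -> str:
    """Build the relative URL for one placement, escaping unsafe characters."""
    return f"placements/{quote(str(placement_id), safe='')}"


def fetch_each(ids: Iterable, get_json: Callable[[str], dict]) -> Dict:
    """Fetch every placement by ID; get_json is a hypothetical HTTP helper."""
    return {pid: get_json(relative_url(pid)) for pid in ids}


# Stand-in for a real HTTP call so the sketch runs offline.
def fake_get_json(url: str) -> dict:
    return {"url": url}


results = fetch_each([1, 2, 3], fake_get_json)
print(results[2])
```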

     

    For aggregation and storage, it will depend on the destination of your data, but a common pattern is to temporarily store the data in Azure Storage (Blob Storage or Data Lake Storage Gen2). After you have finished retrieving all the data from the API, you can copy it from Azure Storage into a database of your choosing.
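
The stage-then-load pattern described above can be illustrated with a small Python sketch. A local temporary directory stands in for Azure Storage here, and the file naming scheme and helper names are assumptions made for illustration only.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def stage_page(staging_dir: Path, page_number: int, records: List[dict]) -> Path:
    """Write one page of API results to its own file in the staging area."""
    path = staging_dir / f"placements_page_{page_number:05d}.json"
    path.write_text(json.dumps(records), encoding="utf-8")
    return path


def load_staged(staging_dir: Path) -> List[dict]:
    """Read all staged pages back in order, ready for a single bulk load."""
    records: List[dict] = []
    for path in sorted(staging_dir.glob("placements_page_*.json")):
        records.extend(json.loads(path.read_text(encoding="utf-8")))
    return records


with tempfile.TemporaryDirectory() as tmp:
    staging = Path(tmp)
    stage_page(staging, 0, [{"placementsid": 1}, {"placementsid": 2}])
    stage_page(staging, 1, [{"placementsid": 3}])
    combined = load_staged(staging)

print(len(combined))
```

Writing one file per page keeps each API call independent, so a failed call can be retried without re-fetching pages that were already staged.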

     

    Kind Regards,

    Zack

     

     
