Latest Discussions
Specific Use Case: REST API Pagination in Data Factory
Hello, I have a specific use case around ingesting data from a REST API endpoint, and I am struggling with how to use pagination within the source instead of an Until loop. I got the Until approach to work and it cycles through my pages, but it creates a new document per page, when I want all the information consolidated into one file/blob. For my REST API endpoint, I have a base URL that doesn't change and a relative URL that takes a start page and a count. The start page is the page the call starts on, and the count is the number of records it returns. I have set these up as parameters in the source, with start page = 1 and count = 400. For this particular call, the Until loop produces 19 separate pages of 400 by adding 1 to the start page on each call until a field called hasMoreResults (bool) in the response equals false. Below is the JSON response from the API endpoint, where "hasMoreResults" is true and the "results" section contains the returned records:

{ "totalResults": 7847, "hasMoreResults": true, "startIndex": 1, "itemsPerPage": 10, "results": [], "facets": [] }

The startIndex equals the start page. With this, I am looking for any advice on how to run this query using pagination rules so that all 7,847 results end up in one file. I have tried many different things and feel like I need two pagination rules: an AbsoluteUrl rule that adds 1 to the start page on every call so it cycles through the pages, and an EndCondition rule that stops when hasMoreResults = false. Any help with this would be greatly appreciated! One more thing: to make the Until approach work, I stored the "hasMoreResults" bool value in a cached variable, and this is the expression I use in the Until condition, but I can't get the equivalent working as a pagination end condition:

"value": "@not(activity('Org Data flow').output.runStatus.output.sinkHasMoreResults.value[0].hasMoreResults)"

The pagination rules I currently have set up don't seem to work.

bones_clarke · Mar 19, 2025 · Copper Contributor · 554 Views · 0 likes · 1 Comment
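A minimal sketch of what those two pagination rules might look like on the REST copy source, assuming the relative URL takes the start page as a query parameter named startPage; the parameter name, the RANGE step, and the Const:false end condition are assumptions to check against the REST connector's pagination documentation:

```json
{
    "typeProperties": {
        "requestMethod": "GET",
        "paginationRules": {
            "QueryParameters.startPage": "RANGE:1::1",
            "EndCondition:$.hasMoreResults": "Const:false"
        }
    }
}
```

The first rule keeps incrementing the startPage query parameter by 1 starting from 1, and the second stops the requests once hasMoreResults in the response body is false, so a single Copy activity writes all pages in one run.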
Issue with Auto Setting for Copy Parallelism in ADF Copy Activity

Hello everyone, I've been utilizing Azure Data Factory (ADF) and noticed the option to set the degree of copy parallelism in a copy activity, which can significantly enhance performance when copying data, such as blob content to an SQL table. However, despite setting this option to "Auto", the degree of parallelism remains fixed at 1. This occurs even when copying hundreds of millions of rows, resulting in a process that takes over 2 hours. My Azure SQL database is scaled to 24 vCores, which should theoretically support higher parallelism. Am I missing something, or is the "Auto" setting for copy parallelism not functioning as expected? Any insights or suggestions would be greatly appreciated! Thank you.

kzng · Mar 17, 2025 · Copper Contributor · 27 Views · 0 likes · 1 Comment
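When "Auto" stays at 1, one thing worth trying is setting the value explicitly on the copy activity rather than leaving it to the service. A minimal sketch of the relevant copy activity properties, where the property names come from the standard copy activity JSON and the numbers are only illustrative starting points, not a recommendation:

```json
{
    "name": "CopyBlobToSql",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "sink": {
            "type": "AzureSqlSink",
            "writeBatchSize": 10000
        },
        "parallelCopies": 8,
        "dataIntegrationUnits": 16
    }
}
```

Whether parallelism above 1 actually takes effect also depends on how the source data is laid out (for example, how many files or partitions the blob source exposes), so the explicit setting is a diagnostic step rather than a guaranteed fix.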
Migrating Data Factory pipelines between tenants

Hi everybody, I need your help please. I'm trying to migrate several Data Factory pipelines between two different Fabric tenants. I'm using Azure DevOps to move all the workspaces. I created the connections with the same names, but when I try to restore the Data Factory pipelines I get an error saying the pipelines can't be created because the connections can't be found. I tried to update the connection IDs, but I can't find them in the JSON files. How can I migrate these pipelines and reconnect them to the new connections?

nannyhg · Mar 12, 2025 · Copper Contributor · 29 Views · 0 likes · 0 Comments
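In pipeline definitions exported through Fabric's Git integration, connection references usually appear as GUIDs on the individual activities rather than in one central block. A sketch of the kind of fragment to search for, where the exact property names and the placeholder GUID are assumptions to verify against your own exported JSON:

```json
{
    "name": "Copy from source",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "JsonSource" }
    },
    "externalReferences": {
        "connection": "00000000-0000-0000-0000-000000000000"
    }
}
```

If that is the pattern in your files, the remapping step is to replace the old connection GUIDs with the IDs of the connections recreated in the target tenant; matching connection names alone is not enough, because the reference is by ID.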
Azure Devops and Data Factory

I have started a new job and taken over ADF. I know how to use DevOps to integrate and deploy when everything is up and running. The problem is, it's all out of sync. I need to learn ADO/ADF as they work together so I can fix this. Any recommendations on where to start? Everything on YouTube starts with a fresh environment, which I'd be fine with. I'm not new to ADO, but I've never been the setup guy before; I'm comfortable using ADO, just not administering it. Here are some of the problems I have:
- A lot of work has been done directly in the DEV branch rather than creating feature branches.
- Setting up a pull request from DEV to PROD wants to pull everything, even in-progress or abandoned code changes.
- Some changes were made in the PROD branch directly, so I'll need to pull those changes back to DEV.
- We have valid changes in both DEV and PROD. I'm having trouble cherry-picking: it only lets me select one commit, then says I need to use the command line, and it doesn't tell me the error. I don't know what tool to use for the command line.
- I've tried using Visual Studio, and I can pull in the Data Factory code, but I have all the same problems there.
I'm not looking for an answer to each question, but how to find the answers. Is this a Data Factory issue, or should I be looking at DevOps? I'm having no trouble managing the database code or Power BI in DevOps, but I created those fresh. Thanks for any help!

Solved · bcarlson_f · Mar 11, 2025 · Copper Contributor · 102 Views · 0 likes · 3 Comments
Azure Data Factory Mapping Dataflow Key Pair Authentication Snowflake

Dear Microsoft, since Snowflake has announced that it will remove basic authentication (username + password) in September 2025, I wanted to change my authentication method in a mapping data flow in Azure Data Factory. I got an error message and found out that only basic authentication is allowed in the mapping data flow (see Copy and transform data in Snowflake V2 - Azure Data Factory & Azure Synapse | Microsoft Learn). Is this going to be fixed in ADF in the near future, or will my process be broken in September?

marius1106 · Mar 10, 2025 · Copper Contributor · 25 Views · 0 likes · 0 Comments
Linux Support for Self-Hosted Integration Runtimes (SHIR)

Hi. Azure Support asked me to request this here. We would very much like to run self-hosted integration runtimes (SHIRs) on Linux instead of Windows. Currently we run them in ACI and they take almost 10 minutes to start. They are also a bit clunky and difficult to manage on ACI; we would much rather run them in our AKS cluster alongside all our other Linux containers. Is Linux container support for SHIRs on the roadmap, and if not, can it be? Regards, Tim.

tgolly · Mar 10, 2025 · Copper Contributor · 22 Views · 0 likes · 0 Comments
Alter Row Ignoring its Conditions

Hello. I have an ADF Dataflow which has two sources, a blob container with JSON files and an Azure SQL table. The sink is the same SQL table as the SQL source; the idea is to conditionally insert new rows, update rows that have a later modified date in the JSON source, or do nothing if the ID already exists in the SQL table with the same modified date. In the Dataflow I join the rows on id, which is unique in both sources, and then use an Alter row action to insert if the id column from the SQL source is null, update if it's not null but the last-updated timestamp in the JSON source is newer, or delete if the last-updated timestamp in the JSON source is the same or older (delete is not permitted in the sink settings, so those rows should be ignored). The problem I'm having is that I get a primary key violation error when running the Dataflow, because it tries to insert rows that already exist; my run history shows failures for IDs that are already present (160806 is the minimum value for ID in the SQL database). For troubleshooting I put a filter directly after each source on a single ticket ID, so when debugging I only see that one row. The Alter row action is configured to insert only if the SQLTickets id column is null, yet in the data preview from that same Alter row action the row is marked as an insert, despite the id column from both sources clearly having a value. However, when I do a data preview in the expression builder itself, the condition correctly evaluates to false. I'm so confused. I've used this technique in other Dataflows without any issues, so I really have no idea what's going on here, and I've been troubleshooting it for days without any result. I've even tried putting a filter after the Alter row action to explicitly filter out rows where the SQL id column is not null and the timestamps are the same; the data preview shows them filtered out, yet it still tries to insert the rows it should be ignoring or updating when I do a test run. What am I doing wrong here?

williampage · Mar 08, 2025 · Copper Contributor · 40 Views · 0 likes · 0 Comments
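For reference, the intended logic expressed in data flow script for the Alter Row transformation; the stream names (JoinedTickets, SQLTickets, JSONTickets), the column prefixes, and the lastModified column name are assumptions standing in for the real names in the flow:

```
JoinedTickets alterRow(
    insertIf(isNull(SQLTickets@id)),
    updateIf(!isNull(SQLTickets@id) && JSONTickets@lastModified > SQLTickets@lastModified),
    deleteIf(!isNull(SQLTickets@id) && JSONTickets@lastModified <= SQLTickets@lastModified)
) ~> AlterRowPolicy
```

These conditions only behave as described when the join preserves JSON rows that have no SQL match (a left outer join from the JSON side), since that is what makes SQLTickets@id null for genuinely new rows.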
Dynamically executing a child pipeline using a single Execute Pipeline activity with a variable

Goal: create a master pipeline that:
- Retrieves metadata using a Lookup.
- Calculates a value (caseValue) from the lookup result.
- Maps the value (caseValue) to a pipeline name using a JSON string (pipelineMappingJson).
- Sets the pipeline name (pipelineName) dynamically.
- Runs the correct child pipeline using the pipelineName variable.
Question: can the Execute Pipeline activity be updated to handle dynamic child pipeline names?

NithyanandSulegai2 · Mar 08, 2025 · Copper Contributor · 69 Views · 0 likes · 3 Comments
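A sketch of how the mapping steps (3 and 4) could be wired up with a single Set Variable activity, assuming pipelineMappingJson holds an object such as {"caseA": "ChildPipelineA", "caseB": "ChildPipelineB"}; the variable names come from the post, while the mapping content and activity name are illustrative assumptions:

```json
{
    "name": "Set pipelineName",
    "type": "SetVariable",
    "typeProperties": {
        "variableName": "pipelineName",
        "value": {
            "value": "@json(variables('pipelineMappingJson'))[variables('caseValue')]",
            "type": "Expression"
        }
    }
}
```

Whether the final step can then be one Execute Pipeline activity depends on whether the pipeline reference accepts dynamic content in your environment; if it does not, a Switch activity over caseValue with one Execute Pipeline activity per branch is a common fallback.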
ADF Google Ads Linked Service using Service Authentication as the Authentication Type

We are trying to access Google Ads data using a Google Ads service account and the ADF Google Ads linked service. We have set the linked service "Authentication type" to "Service authentication". We generated a private key for this service account in Google, and we have used the key as the value in the "Private key" field of the linked service. We have populated the other required linked service fields (Name, Client customer ID, Developer token, Email), and also the optional "Login customer ID" field. We have also designated the linked service to use a self-hosted integration runtime instead of the AutoResolveIntegrationRuntime. When testing the connection, we receive this error message:

Test connection operation failed. Failed to open the database connection. Fail to read from Google Ads. Parameter was empty Parameter name: pkcs8PrivateKey

Does anyone in the Tech Community use this new version of the Google Ads linked service with the Authentication Type set to "Service authentication" instead of "User authentication"? Does anyone have any insight about the error message we are receiving?

TerrySShelton · Feb 27, 2025 · Copper Contributor · 35 Views · 0 likes · 0 Comments
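The pkcs8PrivateKey parameter in the error suggests the connector expects the raw PKCS#8 PEM text of the key, rather than, for example, the entire JSON key file downloaded from Google. Purely as an illustration of that shape, and not the connector's documented schema (the privateKey property name and truncated key body are placeholders):

```json
{
    "privateKey": "-----BEGIN PRIVATE KEY-----\nMIIEvQIBADANBgkqhkiG9w0BAQ...\n-----END PRIVATE KEY-----\n"
}
```

In the service account key file from Google, this value is the private_key field, including the BEGIN/END lines and the embedded newline escapes.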
Tags
- Azure Data Factory (155 Topics)
- Azure ETL (39 Topics)
- Copy Activity (35 Topics)
- Azure Data Integration (33 Topics)
- Mapping Data Flows (26 Topics)
- Azure Integration Runtime (22 Topics)
- Data Flows (3 Topics)
- azure data factory v2 (3 Topics)
- ADF (3 Topics)
- REST (2 Topics)