Copy Activity
33 Topics

Failure of Azure Data Factory integration runtime with VNet enabled
I had been using Data Factory's integration runtime with VNet successfully, but it recently stopped connecting to Cosmos DB with the MongoDB API (which is also within a VNet). After setting up a new integration runtime with VNet enabled and selecting 'Auto Resolve' as the region, the pipeline ran successfully with this new runtime. Could you help me understand why the previous integration runtime, configured with VNet enabled and the region set to match that of Azure Data Factory, worked for over a month but then suddenly failed? The new integration runtime with VNet and the 'Auto Resolve' region worked, but I'm uncertain whether the 'Auto Resolve' region contributed to the success or something else allowed it to connect.

Error:
Failure happened on 'Source' side. ErrorCode=MongoDbConnectionTimeout,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=>Connection to MongoDB server is timeout.,Source=Microsoft.DataTransfer.Runtime.MongoDbAtlasConnector,''Type=System.TimeoutException,Message=A timeout occured after 30000ms selecting a server using CompositeServerSelector{ Selectors = MongoDB.Driver.MongoClient+AreSessionsSupportedServerSelector, LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } }. Client view of cluster state is { ClusterId : "1", ConnectionMode : "ReplicaSet", Type : "ReplicaSet", State : "Disconnected", Servers : [{ ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/cosmontiv01u.mongo.cosmos.azure.com:10255" }", EndPoint:
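For anyone hitting a similar timeout, one way to tell a network/DNS problem apart from an integration runtime problem is a minimal connectivity check against the same Cosmos DB MongoDB endpoint from a VM in the same VNet. The sketch below is only an illustration and assumes placeholder values for the account name and key:

```python
# Minimal connectivity check against a Cosmos DB (MongoDB API) endpoint.
# Placeholders: <account> and <key> must be replaced with real values.
from pymongo import MongoClient
from pymongo.errors import ServerSelectionTimeoutError

conn_str = (
    "mongodb://<account>:<key>@<account>.mongo.cosmos.azure.com:10255/"
    "?ssl=true&replicaSet=globaldb&retrywrites=false"
)

# Use the same 30-second server selection timeout the ADF connector reports.
client = MongoClient(conn_str, serverSelectionTimeoutMS=30000)

try:
    client.admin.command("ping")  # Forces server selection and a round trip.
    print("Connected: the endpoint is reachable from this network.")
except ServerSelectionTimeoutError as exc:
    print(f"Timed out selecting a server (likely a network/DNS issue): {exc}")
```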
Incremental Load from ServiceNow kb_knowledge table

Hi, I have been trying to copy only new KB data from the kb_knowledge table in ServiceNow to blob storage. I tried to use the query builder, but it copies all of the KB data. Is there another way to do this? Thanks in advance!
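A common pattern for loading only new or changed records is to filter on sys_updated_on against a stored watermark, either in the connector's source query or by calling the ServiceNow Table API directly. The sketch below illustrates the watermark idea with a direct Table API call; the instance URL, credentials, and watermark value are assumptions, and the exact encoded-query date syntax may need adjusting for your instance:

```python
# Sketch of a watermark-based incremental pull from the ServiceNow Table API.
# Placeholders: <instance>, USER, PASSWORD, and the watermark value are assumptions.
import requests

INSTANCE = "https://<instance>.service-now.com"
TABLE = "kb_knowledge"
# Last successful load time, e.g. persisted in blob storage or a pipeline variable.
watermark = "2024-01-01 00:00:00"

params = {
    # Encoded query: only rows updated after the watermark, oldest first.
    "sysparm_query": f"sys_updated_on>{watermark}^ORDERBYsys_updated_on",
    "sysparm_limit": 1000,
    "sysparm_offset": 0,
}

rows = []
while True:
    resp = requests.get(
        f"{INSTANCE}/api/now/table/{TABLE}",
        params=params,
        auth=("USER", "PASSWORD"),
        headers={"Accept": "application/json"},
        timeout=60,
    )
    resp.raise_for_status()
    batch = resp.json()["result"]
    rows.extend(batch)
    if len(batch) < params["sysparm_limit"]:
        break
    params["sysparm_offset"] += params["sysparm_limit"]

print(f"Fetched {len(rows)} new/updated records since {watermark}")
```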
'Cannot connect to SQL Database' error - please help

Hi, our organisation is new to Azure Data Factory (ADF) and we're facing an intermittent error with our first pipeline. Being intermittent adds that little bit more complexity to resolving the error. The pipeline has two activities:

1) A Script activity which deletes the contents of the target Azure SQL Server database table located within our Azure cloud instance.
2) A Copy data activity which simply copies the entire contents of the external (outside of our domain) third-party source SQL view and loads it into our target Azure SQL Server database table.

With the source being external to our domain, we have used a self-hosted integration runtime. The pipeline executes once per 24 hours, at 3am each morning. I have been informed that this timing shouldn't affect, or be affected by, any other Azure processes we have. For the first nine days, the pipeline completed its executions successfully. Then for the next nine days it only completed successfully four times. Now it seems to fail every other time. The same error message is received on each failure (I've replaced our sensitive internal names with Xs):

Operation on target scr__Delete stg__XXXXXXXXXX contents failed: Failed to execute script. Exception: ''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot connect to SQL Database. Please contact SQL server team for further support. Server: 'XX-azure-sql-server.database.windows.net', Database: 'XX_XXXXXXXXXX_XXXXXXXXXX', User: ''. Check the linked service configuration is correct, and make sure the SQL Database firewall allows the integration runtime to access.,Source=Microsoft.DataTransfer.Connectors.MSSQL,''Type=Microsoft.Data.SqlClient.SqlException,Message=Server provided routing information, but timeout already expired.,Source=Framework Microsoft SqlClient Data Provider,''

To me, if this pipeline were incorrectly configured then it would never have completed successfully, not even once. The fact that it is intermittent, but becoming more frequent, suggests it's being caused by something other than its configuration, but I could be wrong - hence requesting help from you. Please can someone advise on what is causing the error and what I can do to verify/resolve it? Thanks.
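The inner exception ('Server provided routing information, but timeout already expired') typically means the Azure SQL gateway redirected the connection but the login timeout had already been used up, which fits an intermittent failure pattern. One way to investigate is to probe the connection from the self-hosted integration runtime machine with a longer login timeout and a couple of retries; the sketch below assumes pyodbc is installed and uses placeholder server, database, and credential values (on the linked service side, the equivalent knobs are connection-string settings such as Connect Timeout, ConnectRetryCount, and ConnectRetryInterval):

```python
# Quick connectivity probe from the self-hosted IR machine, with a longer
# login timeout than the default. Server, database, and credentials are placeholders.
import time
import pyodbc

conn_str = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=XX-azure-sql-server.database.windows.net;"
    "DATABASE=XX_database;"
    "UID=<user>;PWD=<password>;"
    "Encrypt=yes;TrustServerCertificate=no;"
)

for attempt in range(1, 4):
    start = time.monotonic()
    try:
        # timeout= sets the login timeout in seconds (the default is much shorter).
        with pyodbc.connect(conn_str, timeout=60) as conn:
            cursor = conn.cursor()
            cursor.execute("SELECT 1")
            print(f"Attempt {attempt}: connected in {time.monotonic() - start:.1f}s")
        break
    except pyodbc.Error as exc:
        print(f"Attempt {attempt} failed after {time.monotonic() - start:.1f}s: {exc}")
        time.sleep(10)
```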
Flattening nested JSON values in a dataflow with varying keys

We are using Azure DevOps REST API calls to return JSON files and storing them in blob. Then we perform a dataflow to transform the data. The issue is that a portion of the JSON stored in blob has varying keys. When we specify the columns to map in a Select action, we are selecting specifically one of the varying keys from a list of options, but we need to map ALL of them. We cannot manually specify these because the data source is so large, and we cannot implement a standard name for this section of the JSON. A wildcard for { } would work ideally but is not supported. We do not care what the keys are, just the contents (id, name).

Select action source columns:
resources.pipelines.{src-release}.pipeline.id
resources.pipelines.{src-release}.pipeline.name
resources.pipelines.{build }.pipeline.id
resources.pipelines.{build }.pipeline.name

Mapping name as: 'pipelineID', 'pipelineName'

Below is a JSON snippet which highlights the key from the source JSON, and an example of the Select action mapping where each key shows as its own dropdown (attachments not included here).
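Outside of the data flow, the shape described here (arbitrary keys under resources.pipelines, each containing a pipeline object with id and name) can be flattened generically by iterating over the dictionary instead of naming each key. The sketch below uses a made-up payload purely to illustrate that approach in Python; it is not an ADF Select-transformation expression:

```python
# Generic flattening of the "varying keys" pattern described above.
# The sample payload is a made-up illustration of the shape, not real data.
sample = {
    "resources": {
        "pipelines": {
            "src-release": {"pipeline": {"id": 101, "name": "Release pipeline"}},
            "build": {"pipeline": {"id": 202, "name": "Build pipeline"}},
        }
    }
}

def flatten_pipelines(doc: dict) -> list[dict]:
    """Return one row per entry under resources.pipelines, ignoring the key names."""
    rows = []
    for _key, value in doc.get("resources", {}).get("pipelines", {}).items():
        pipeline = value.get("pipeline", {})
        rows.append({"pipelineID": pipeline.get("id"), "pipelineName": pipeline.get("name")})
    return rows

print(flatten_pipelines(sample))
# [{'pipelineID': 101, 'pipelineName': 'Release pipeline'},
#  {'pipelineID': 202, 'pipelineName': 'Build pipeline'}]
```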
Copy Activity from blob CSV to C4C OData service fails on CSRF token

Hi there,

1) When trying to get data from C4C to blob using ADF, we were able to extract data without any issues.
2) When trying to insert the downloaded file back into C4C (sap/c4c/odata/v1/c4codataapi/) using a Copy activity in ADF, we are confronting an issue: the CSRF token is not supported for the OData endpoint.

Can you please advise how to resolve this conflict? NOTE: the user has sufficient permissions to insert data.

Error log:
"errors": [ { "Code": 23208, "Message": "ErrorCode=ODataCsrfTokenNotSupported,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Csrf token not supported for the odata endpoint.,Source=Microsoft.DataTransfer.Runtime.ODataConnector,'", "EventType": 0, "Category": 5, "Data": {}, "MsgId": null, "ExceptionType": null, "Source": null, "StackTrace": null, "InnerEventInfos": [] }
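For context, SAP OData services protect modifying requests with a CSRF handshake: a GET carrying the header x-csrf-token: fetch returns a token (and session cookies) that must be echoed on the subsequent POST. The sketch below shows that handshake outside of ADF using Python requests, with placeholder host, entity set, payload, and credentials; it illustrates the protocol the ODataCsrfTokenNotSupported error refers to rather than an ADF setting:

```python
# CSRF token handshake against an SAP C4C OData endpoint, illustrated with requests.
# Host, entity set, payload, and credentials are placeholders.
import requests

BASE = "https://<tenant>.crm.ondemand.com/sap/c4c/odata/v1/c4codataapi"

session = requests.Session()
session.auth = ("<user>", "<password>")

# Step 1: fetch a CSRF token (the session also captures the required cookies).
fetch = session.get(
    f"{BASE}/",
    headers={"x-csrf-token": "fetch", "Accept": "application/json"},
    timeout=60,
)
fetch.raise_for_status()
token = fetch.headers.get("x-csrf-token")

# Step 2: echo the token back on the modifying request.
resp = session.post(
    f"{BASE}/<EntitySet>",
    headers={"x-csrf-token": token, "Content-Type": "application/json"},
    json={"ExampleField": "ExampleValue"},  # placeholder payload
    timeout=60,
)
print(resp.status_code, resp.text[:200])
```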
Azure Data Factory Copy Data Activity changes data while copying Parquet data to Dedicated SQL Pool

Hi all, we have a Parquet file in an ADLS Gen2 storage container that has over 7 million rows of data. We created a Copy Data activity in Azure Data Factory to move this data to a table in a Dedicated SQL Pool. All the data from the Parquet file goes into the database table accurately, except for one row, where a decimal value of 78.6 in the Parquet file lands in the SQL table as 78.5. Here's more context on the steps we took so far to trace the root cause of this issue:

1) We tried changing the Parquet file name and pushing it to the table again; the data still arrives in the SQL table as 78.5 (while the Parquet file has 78.6).
2) We tried creating a version 2 table in the SQL DB and pushing the data into this V2 table with the Copy Data activity; the data still arrives as 78.5.
3) We checked the compression type used to create the Parquet file in our Python code (it is GZIP) and the compression type configured on the Data Factory dataset connection; it was previously Snappy, we changed it to GZIP and re-ran the Copy Data activity, and the data still arrives as 78.5.
4) We checked the decimal data type precision and scale, as well as the data type mapping from source to sink; if this were off, the whole column should have issues, but it is only this one row that goes into the SQL table incorrectly.

Has anyone encountered this issue before? If so, how did you solve it? Any suggestions are welcome. Thank you!
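One way to narrow this down is to read the suspect row straight out of the Parquet file and check both the declared column type and the exact stored value: 78.6 cannot be represented exactly in binary floating point, so if the column is physically a float/double and the stored value is slightly below 78.6, a conversion that truncates rather than rounds to one decimal place would produce 78.5. The sketch below uses pyarrow with a placeholder file path, key column, and value column:

```python
# Inspect the physical type and exact value of the suspect row in the Parquet file.
# File path, key column, key value, and value column are placeholders.
from decimal import Decimal

import pyarrow.compute as pc
import pyarrow.parquet as pq

table = pq.read_table("suspect_file.parquet")
print(table.schema)  # Shows whether the column is decimal(p, s), double, float, etc.

# Narrow down to the suspect row (replace with the real key column and value).
mask = pc.equal(table["row_key"], "the-problem-row-id")
row = table.filter(mask)

value = row["amount"][0].as_py()
print(type(value), value)

# If the column is a float/double, show the exact binary value that was stored;
# a value slightly below 78.6 would explain truncation to 78.5 at the sink.
if isinstance(value, float):
    print(Decimal(value))
```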
How to handle Azure Data Factory Lookup activity with more than 5000 records

Hello experts,

The Data Flow activity successfully copies data from an Azure Blob Storage .csv file to Dataverse table storage. However, an error occurs when performing a Lookup on the Dataverse due to excessive data. This issue is in line with the documentation, which states that the Lookup activity has a limit of 5,000 rows and a maximum size of 4 MB. A workaround is also mentioned (Microsoft documentation): design a two-level pipeline where the outer pipeline iterates over an inner pipeline, which retrieves data that doesn't exceed the maximum rows or size. How can I do this? Is there a way to define an offset (e.g. only read 1,000 rows)?

Thanks,
-Sri
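The documented two-level workaround is essentially paging: the outer pipeline computes a series of offsets and the inner pipeline fetches at most a page-sized batch per iteration. The sketch below illustrates that control flow in Python with an in-memory stand-in for the real source; in ADF, the inner pipeline would run a page-size-limited query (for example a SQL OFFSET ... FETCH or an OData page) driven by offset and page-size parameters:

```python
# Control-flow sketch of the two-level "paged lookup" workaround.
# SOURCE is an in-memory stand-in for the real Dataverse/SQL source.
PAGE_SIZE = 1000  # Keep each lookup well under the 5,000-row / 4 MB limits.

SOURCE = [{"id": i} for i in range(3500)]  # pretend source with 3,500 rows

def fetch_page(offset: int, limit: int) -> list[dict]:
    """Inner level: return at most `limit` rows starting at `offset`."""
    return SOURCE[offset:offset + limit]

def iterate_all_rows():
    """Outer level: keep requesting pages until a short (or empty) page comes back."""
    offset = 0
    while True:
        page = fetch_page(offset, PAGE_SIZE)
        yield from page
        if len(page) < PAGE_SIZE:
            break
        offset += PAGE_SIZE

total = sum(1 for _ in iterate_all_rows())
print(f"Processed {total} rows in pages of {PAGE_SIZE}")  # Processed 3500 rows ...

# In ADF terms: an outer pipeline (Until/ForEach) passes `offset` and the page size
# as parameters to an inner pipeline whose lookup/copy reads just that slice.
```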
Call unbound custom action (Dynamics) from ADF

Hello experts,

I have a global action in Dynamics and need to initiate a call from Azure Data Factory to pass input parameters and retrieve output parameters. Are there any methods available to invoke the unbound custom action from ADF? Kindly provide recommendations.

Thanks,
-Sri
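In the Dataverse Web API, an unbound (global) action is invoked by POSTing to the action name at the service root, with input parameters in the JSON body and output parameters returned in the response; from ADF this could be done with a Web activity against that endpoint. The sketch below shows the underlying HTTP call in Python, with a placeholder organization URL, action name, parameter name, and access token:

```python
# Invoking an unbound (global) Dataverse custom action via the Web API.
# Organization URL, action name, parameter names, and the access token are placeholders.
import requests

ORG_URL = "https://<org>.crm.dynamics.com"
ACTION = "new_MyGlobalAction"          # hypothetical unbound action name
ACCESS_TOKEN = "<bearer-token>"        # e.g. acquired via MSAL or a managed identity

resp = requests.post(
    f"{ORG_URL}/api/data/v9.2/{ACTION}",
    headers={
        "Authorization": f"Bearer {ACCESS_TOKEN}",
        "Content-Type": "application/json",
        "OData-MaxVersion": "4.0",
        "OData-Version": "4.0",
        "Accept": "application/json",
    },
    json={"InputParameter1": "some value"},  # hypothetical input parameter
    timeout=60,
)
resp.raise_for_status()
# Output parameters (if the action defines any) come back in the JSON response body.
print(resp.json())
```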
Tuning infrastructure performance with Managed VNet Integration Runtime

Hi, I set up jobs to run in a managed VNet IR as we are using VNets to secure databases. I have a lot of jobs and see that they generate a lot of queue time for the pipeline copy activities. Does anyone have any guidance on tuning performance to reduce queue time? This appears to add up to a lot of billed time for the cluster. Most tables are small, but there are many of them: roughly 900, each carried out 5 times from the same application source databases. Any ideas?

Peter
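As a rough illustration of why per-activity queue time matters at this scale, a back-of-the-envelope calculation with an assumed average queue time per copy activity shows how quickly it accumulates across roughly 900 tables run 5 times; the average used below is an assumption to be replaced with what the activity run details actually show:

```python
# Back-of-the-envelope estimate of how per-activity queue time adds up,
# using the table counts from the post and an assumed average queue time.
tables = 900
runs_per_table = 5
avg_queue_seconds = 120  # assumption: replace with the observed average queue time

total_hours = tables * runs_per_table * avg_queue_seconds / 3600
print(f"{total_hours:.0f} hours of queue time per full cycle")  # 150 hours at 120 s each
```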