Latest Discussions
Oracle 2.0 Upgrade Woes with Self-Hosted Integration Runtime
This past weekend my ADF instance finally got the prompt to upgrade linked services that use the Oracle 1.0 connector, so I thought, "no problem!" and got to work upgrading my self-hosted integration runtime to 5.50.9171.1. Most of my connections use service_name during authentication, so according to the docs I should be able to connect using the Easy Connect (Plus) naming convention. When I do, I encounter this error:

Test connection operation failed. Failed to open the Oracle database connection. ORA-50201: Oracle Communication: Failed to connect to server or failed to parse connect string ORA-12650: No common encryption or data integrity algorithm
https://docs.oracle.com/error-help/db/ora-12650/

I did some digging on this error code, and the troubleshooting doc suggests that I reach out to my Oracle DBA to update the Oracle server settings. Which I did, but I have zero confidence the DBA will take any action.
https://learn.microsoft.com/en-us/azure/data-factory/connector-troubleshoot-oracle

Then I happened across this documentation about the upgraded connector:
https://learn.microsoft.com/en-us/azure/data-factory/connector-oracle?tabs=data-factory#upgrade-the-oracle-connector

Is this for real? ADF won't be able to connect to old versions of Oracle? If so I'm effed, because my company is so, so legacy and all of our Oracle servers are at 11g. I also tried adding additional connection properties in my linked service connection like this, but I honestly have no idea what I'm doing:

Encryption client: accepted
Encryption types client: AES128, AES192, AES256, 3DES112, 3DES168
Crypto checksum client: accepted
Crypto checksum types client: SHA1, SHA256, SHA384, SHA512

But no matter what, the issue persists. :( Am I missing something stupid? Are there ways to handle the encryption type mismatch client-side from the VM that runs the self-hosted integration runtime? I would hate to be in the business of managing an Oracle environment and tnsnames.ora files, but I also don't want to re-engineer almost 100 pipelines because of a connector incompatibility.

adaardor · May 15, 2025 · Copper Contributor · 93 Views · 2 likes · 3 Comments
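One client-side angle worth sketching: in the 2.0 connector the Oracle network encryption settings live on the linked service itself, so they can at least be pinned from the ADF side while waiting on the DBA. The JSON below is only a minimal sketch, not a confirmed fix for ORA-12650 against 11g. The camelCase property names (encryptionClient, encryptionTypesClient, cryptoChecksumClient, cryptoChecksumTypesClient) are assumed equivalents of the UI labels quoted above, and the server, credential, and integration runtime values are placeholders, so verify everything against the connector documentation before relying on it.

{
    "name": "LS_Oracle_v2",
    "properties": {
        "type": "Oracle",
        "version": "2.0",
        "connectVia": {
            "referenceName": "<self-hosted IR name>",
            "type": "IntegrationRuntimeReference"
        },
        "typeProperties": {
            "server": "<host>:<port>/<service_name>",
            "authenticationType": "Basic",
            "username": "<username>",
            "password": { "type": "SecureString", "value": "<password>" },
            "encryptionClient": "accepted",
            "encryptionTypesClient": "(AES256, AES192, AES128)",
            "cryptoChecksumClient": "accepted",
            "cryptoChecksumTypesClient": "(SHA256, SHA1)"
        }
    }
}

If the 11g server only offers algorithms the new driver no longer supports, no client-side value will produce a common algorithm, which is why the troubleshooting doc pushes the change toward the server-side sqlnet.ora settings.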
Error in copy activity with Oracle 2.0
I am trying to migrate our copy activities to Oracle connector version 2.0. The destination is Parquet in an Azure Storage account, which works with the Oracle 1.0 connector. Just switching to 2.0 on the linked service and adjusting the connection string (server) is straightforward, and a "test connection" is successful. But in a pipeline with a copy activity using the linked service, I get the following error message on some tables:

ErrorCode=ParquetJavaInvocationException,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred when invoking java, message: java.lang.ArrayIndexOutOfBoundsException:255 total entry:1 com.microsoft.datatransfer.bridge.parquet.ParquetWriterBuilderBridge.addDecimalColumn(ParquetWriterBuilderBridge.java:107) .,Source=Microsoft.DataTransfer.Richfile.ParquetTransferPlugin,''Type=Microsoft.DataTransfer.Richfile.JniExt.JavaBridgeException,Message=,Source=Microsoft.DataTransfer.Richfile.HiveOrcBridge,'

As the error suggests, it is unable to convert a decimal value from Oracle to Parquet. To me it looks like a bug in the new connector. Has anybody seen this before and found a solution? The 1.0 connector is apparently being deprecated in the coming weeks. Here is the code for the copy activity:

{
    "name": "Copy",
    "type": "Copy",
    "dependsOn": [],
    "policy": {
        "timeout": "1.00:00:00",
        "retry": 2,
        "retryIntervalInSeconds": 60,
        "secureOutput": false,
        "secureInput": false
    },
    "userProperties": [
        {
            "name": "Source",
            "value": "@{pipeline().parameters.schema}.@{pipeline().parameters.table}"
        },
        {
            "name": "Destination",
            "value": "raw/@{concat(pipeline().parameters.source, '/', pipeline().parameters.schema, '/', pipeline().parameters.table, '/', formatDateTime(pipeline().TriggerTime, 'yyyy/MM/dd'))}/"
        }
    ],
    "typeProperties": {
        "source": {
            "type": "OracleSource",
            "oracleReaderQuery": {
                "value": "SELECT @{coalesce(pipeline().parameters.columns, '*')}\nFROM \"@{pipeline().parameters.schema}\".\"@{pipeline().parameters.table}\"\n@{if(variables('incremental'), variables('where_clause'), '')}\n@{if(equals(pipeline().globalParameters.ENV, 'dev'),\n'FETCH FIRST 1000 ROWS ONLY'\n,''\n)}",
                "type": "Expression"
            },
            "partitionOption": "None",
            "convertDecimalToInteger": true,
            "queryTimeout": "02:00:00"
        },
        "sink": {
            "type": "ParquetSink",
            "storeSettings": {
                "type": "AzureBlobFSWriteSettings"
            },
            "formatSettings": {
                "type": "ParquetWriteSettings",
                "maxRowsPerFile": 1000000,
                "fileNamePrefix": {
                    "value": "@variables('file_name_prefix')",
                    "type": "Expression"
                }
            }
        },
        "enableStaging": false,
        "translator": {
            "type": "TabularTranslator",
            "typeConversion": true,
            "typeConversionSettings": {
                "allowDataTruncation": true,
                "treatBooleanAsNumber": false
            }
        }
    },
    "inputs": [
        {
            "referenceName": "Oracle",
            "type": "DatasetReference",
            "parameters": {
                "host": { "value": "@pipeline().parameters.host", "type": "Expression" },
                "port": { "value": "@pipeline().parameters.port", "type": "Expression" },
                "service_name": { "value": "@pipeline().parameters.service_name", "type": "Expression" },
                "username": { "value": "@pipeline().parameters.username", "type": "Expression" },
                "password_secret_name": { "value": "@pipeline().parameters.password_secret_name", "type": "Expression" },
                "schema": { "value": "@pipeline().parameters.schema", "type": "Expression" },
                "table": { "value": "@pipeline().parameters.table", "type": "Expression" }
            }
        }
    ],
    "outputs": [
        {
            "referenceName": "Lake_PARQUET_folder",
            "type": "DatasetReference",
            "parameters": {
                "source": { "value": "@pipeline().parameters.source", "type": "Expression" },
                "namespace": { "value": "@pipeline().parameters.schema", "type": "Expression" },
                "entity": { "value": "@variables('sink_table_name')", "type": "Expression" },
                "partition": { "value": "@formatDateTime(pipeline().TriggerTime, 'yyyy/MM/dd')", "type": "Expression" },
                "container": { "value": "@variables('container')", "type": "Expression" }
            }
        }
    ]
}

martin_larsson_ellevio · May 14, 2025 · Copper Contributor · 40 Views · 0 likes · 2 Comments
ADF dataflow data Preview Error
Hi all, I have a data flow as seen below. All linked services and datasets are working fine and I can see the data preview, but when I use the same linked service and dataset in the data flow it throws the error shown below. I am using a managed private endpoint to connect to the blob storage; it is working for all pipelines. The ADF and the MI have the Storage Account Contributor role assigned.

Error: at Source 'sourcedata': This request is not authorized to perform this operation. When using Managed Identity(MI)/Service Principal(SP) authentication
1. For source: In Storage Explorer, grant the MI/SP at least Execute permission for ALL upstream folders and the file system, along with Read permission for the files to copy. Alternatively, in Access control (IAM), grant the MI/SP at least the Storage Blob Data Reader role.
2. For sink: In Storage Explorer, grant the MI/SP at least Execute permission for ALL upstream folders and the file system, along with Write permission for the sink folder. Alternatively, in Access control (IAM), grant the MI/SP at least the Storage Blob Data Contributor role.
Also please ensure that the network firewall settings in the storage account are configured correctly, as turning on firewall rules for your storage account blocks incoming requests for data by default, unless the requests originate from a service operating within an Azure Virtual Network (VNet) or from allowed public IP addresses.

Any kind of help is highly appreciated.

zenbabasha55 · May 11, 2025 · Copper Contributor · 41 Views · 0 likes · 1 Comment
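For what it's worth, Storage Account Contributor is a management-plane role, and the error text above is asking for a data-plane role on the blob service (Storage Blob Data Reader for sources, Storage Blob Data Contributor for sinks) granted to the factory's managed identity, plus an approved managed private endpoint. Below is a minimal ARM-template-style sketch of that role assignment; the parameter names are placeholders and the GUID is the built-in Storage Blob Data Contributor role definition, so adjust the scope and role to your setup.

{
    "type": "Microsoft.Authorization/roleAssignments",
    "apiVersion": "2022-04-01",
    "name": "[guid(parameters('storageAccountName'), parameters('dataFactoryPrincipalId'), 'blob-data-contributor')]",
    "scope": "[format('Microsoft.Storage/storageAccounts/{0}', parameters('storageAccountName'))]",
    "properties": {
        "roleDefinitionId": "[subscriptionResourceId('Microsoft.Authorization/roleDefinitions', 'ba92f5b4-2d11-453d-a403-e96b0029c9fe')]",
        "principalId": "[parameters('dataFactoryPrincipalId')]",
        "principalType": "ServicePrincipal"
    }
}

The managed private endpoint also needs to be in the Approved state on the storage account; data flows run on a separate Spark runtime, so connectivity that works for pipelines is not automatically proof that data flows can reach the account.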
copy data fails - best practice?
Hi everyone, we need to start copying (some) data from various in-house SQL database tables to an Azure cloud database. So far, to keep update times to a minimum, I have created static and live tables where the tables are large: the static tables hold data up to the start of the year and are loaded by a one-off pipeline, while the live tables are upserted by a daily pipeline. After a lot of tweaking I got the live pipelines working, generally fail-free. However, with the large tables I am really struggling to copy the data over (usually failing after 5 hours of progress). The tables generally have 4-5 million rows and maybe 50 columns. I've played around with various settings but am now wondering whether this is the best method of copying large amounts of data from on-prem to cloud databases.

Roop_s610 · May 11, 2025 · Copper Contributor · 18 Views · 0 likes · 1 Comment
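For initial loads of that size, the usual lever is the copy activity's source partitioning, so the read is split into parallel ranges instead of one long stream. Below is a rough sketch, assuming a SQL Server source with a numeric or date key column; the dataset names, partition column, and bounds are placeholders and the right values depend on your table:

{
    "name": "Copy large table (initial load)",
    "type": "Copy",
    "typeProperties": {
        "source": {
            "type": "SqlServerSource",
            "partitionOption": "DynamicRange",
            "partitionSettings": {
                "partitionColumnName": "<numeric or date key>",
                "partitionLowerBound": "<min key value>",
                "partitionUpperBound": "<max key value>"
            }
        },
        "sink": {
            "type": "AzureSqlSink",
            "writeBatchSize": 100000
        },
        "parallelCopies": 8,
        "enableStaging": false
    },
    "inputs": [ { "referenceName": "<on-prem table dataset>", "type": "DatasetReference" } ],
    "outputs": [ { "referenceName": "<Azure SQL table dataset>", "type": "DatasetReference" } ]
}

If the table has physical partitions, partitionOption can instead be set to PhysicalPartitionsOfTable and the bounds dropped. Either way, keeping the one-off historical load separate from the daily upsert pipeline, as you are already doing, is a sound pattern.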
Dataflow snowflake connection issue
I'm trying to set up a sink to Snowflake in a data flow, but when I test the connection it doesn't work; it just returns a JDBC driver communication error. I tried searching online and looking at the documentation link, but couldn't find anything about this issue. The same dataset works fine outside of the data flow, and I can preview the data in it, so there seems to be an issue with the data flow itself. Even when I execute the data flow through a pipeline, the same error message comes up. Does anyone know how to solve this problem with data flows?

Also, in the sink settings, if I select "recreate table", will it create the table in Snowflake if it doesn't already exist? I'm trying to find an easy way to copy a lot of tables into Snowflake without explicitly having to create each table first, especially when the metadata is only known at runtime. The pipeline copy job doesn't work for this because the table has to exist before it can insert data into it, but data flows seem promising if the connection actually works.

MangoMagic · May 09, 2025 · Copper Contributor · 110 Views · 0 likes · 2 Comments
ADF Data Flow Fails with "Path does not resolve to any file" — Dynamic Parameters via Trigger
Hi guys, I'm running into an issue with my Azure Data Factory pipeline triggered by a Blob event. The trigger passes dynamic folderPath and fileName values into a parameterized dataset and mapping data flow. Everything works perfectly when I debug the pipeline manually, or when I trigger the pipeline manually with the trigger and pass in the values for folderPath and fileName directly. However, when the pipeline is triggered automatically via the blob event, the data flow fails with the following error:

Error Message: Job failed due to reason: at Source 'CSVsource': Path /financials/V02/Forecast/ForecastSampleV02.csv does not resolve to any file(s). Please make sure the file/folder exists and is not hidden. At the same time, please ensure special character is not included in file/folder name, for example, name starting with _

I've verified the blob file exists. The trigger fires correctly and passes parameters. The path looks valid. The dataset is parameterized correctly with @dataset().folderPath and @dataset().fileName.

I've attached screenshots of:
🔵 00-Pipeline Trigger Configuration On Blob creation
🔵 01-Trigger Parameters
🔵 02-Pipeline Parameters
🔵 03-Data flow Parameters
🔵 04-Data flow Parameters without default value
🔵 05-Data flow CSVsource parameters
🔵 06-Data flow Source Dataset
🔵 07-Data flow Source dataset Parameters
🔵 08-Data flow Source Parameters
🔵 09-Parameters passed to the pipeline from the trigger
🔵 10-Data flow error message
Here are all the images

What could be causing the data flow to fail on file path resolution only when triggered, even though the exact same parameters succeed during manual debug runs? Could this be related to:
- Extra slashes or encoding in trigger output?
- Misuse of @dataset().folderPath and fileName in the dataset?
- Limitations in how blob trigger outputs are parsed?
Any insights would be appreciated! Thank you.

Solved · JohnG_PIT · May 06, 2025 · Copper Contributor · 31 Views · 0 likes · 1 Comment
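One thing worth double-checking, sketched below as a hedged example rather than a diagnosis: a Storage event trigger exposes @triggerBody().folderPath and @triggerBody().fileName, and folderPath comes back with the container as its first segment. If the dataset also has the container (or file system) fixed, the concatenated path gains a duplicate or misplaced segment, and that only shows up on real trigger runs because manual debug runs use hand-typed values. The trigger, pipeline, and resource names here are placeholders:

{
    "name": "tr_on_blob_created",
    "properties": {
        "type": "BlobEventsTrigger",
        "typeProperties": {
            "scope": "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>",
            "events": [ "Microsoft.Storage.BlobCreated" ],
            "blobPathBeginsWith": "/financials/blobs/V02/Forecast/",
            "ignoreEmptyBlobs": true
        },
        "pipelines": [
            {
                "pipelineReference": { "referenceName": "<pipeline name>", "type": "PipelineReference" },
                "parameters": {
                    "folderPath": "@triggerBody().folderPath",
                    "fileName": "@triggerBody().fileName"
                }
            }
        ]
    }
}

Comparing the folderPath value recorded in the trigger run output (screenshot 09) against the path the source actually expects, segment by segment, usually shows where the extra container segment or a stray slash lands; if the dataset already fixes the container, trimming it in the trigger parameter with an expression along the lines of @replace(triggerBody().folderPath, 'financials/', '') is one way to reconcile the two.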
What Are the Ways to Dynamically Invoke Pipelines in ADF from Another Pipeline?
I am exploring different approaches to dynamically invoke ADF pipelines from within another pipeline as part of a modular and scalable orchestration strategy. My use case involves having multiple reusable pipelines that can be called conditionally or in sequence, based on configuration stored externally (such as in a SQL Managed Instance or another Azure-native source). I am aware of a few patterns like using the Execute Pipeline activity within a ForEach loop, but I would like to understand the full range of available and supported options for dynamically invoking pipelines from within ADF. Could you please clarify the possible approaches for achieving this? Specifically, I am interested in:
- Using ForEach with Execute Pipeline activity: how to structure the control flow for calling multiple pipelines in sequence or parallel, and how to pass pipeline names dynamically.
- Dynamic pipeline name resolution: is it possible to pass the pipeline name as a parameter to the Execute Pipeline activity, and how to handle validation when the pipeline name is dynamic?
- Parameterized execution: best practices for passing dynamic parameters to each pipeline when calling them in a loop or based on external config.
- Calling ADF pipelines via REST API or Web Activity: when would this be preferred over the native Execute Pipeline, and how to handle authentication and response handling?
If there are any recommendations, gotchas, or best practices related to dynamic pipeline orchestration in ADF, I would greatly appreciate your insights. Thanks!

manuj · Apr 29, 2025 · Copper Contributor · 13 Views · 0 likes · 0 Comments
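For the first bullet, here is a minimal sketch of the ForEach + Execute Pipeline shape, assuming a preceding Lookup activity named 'Get config' that returns one row per child run; the activity, pipeline, and column names are placeholders. Note the caveat baked into it: the pipeline reference on Execute Pipeline is a static name, so truly dynamic pipeline names generally mean either a Switch over the known pipeline names or the REST API route raised in the last bullet.

{
    "name": "For each step",
    "type": "ForEach",
    "dependsOn": [
        { "activity": "Get config", "dependencyConditions": [ "Succeeded" ] }
    ],
    "typeProperties": {
        "isSequential": true,
        "items": { "value": "@activity('Get config').output.value", "type": "Expression" },
        "activities": [
            {
                "name": "Run child pipeline",
                "type": "ExecutePipeline",
                "typeProperties": {
                    "pipeline": { "referenceName": "<child pipeline name>", "type": "PipelineReference" },
                    "waitOnCompletion": true,
                    "parameters": {
                        "stepName": { "value": "@item().StepName", "type": "Expression" }
                    }
                }
            }
        ]
    }
}

Setting isSequential to false (together with a batchCount) gives parallel fan-out when the steps are independent of each other.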
How to Orchestrate ADF Pipelines as Selectable Steps in a Configurable Job
I am working on building a dynamic job orchestration mechanism using Azure Data Factory (ADF). I have multiple pipelines in ADF, and each pipeline represents a distinct step in a larger job. I would like to implement a solution where I can dynamically select or deselect individual pipeline steps (i.e., ADF pipelines) as part of a job. The idea is to configure a job by checking/unchecking steps, and then execute only the selected ones in sequence or based on dependencies.
Available resources for this solution:
- Azure Data Factory (ADF)
- Azure SQL Managed Instance (SQL MI)
- Any other relevant Azure-native service (if needed)
Could you please suggest a solution that meets the following requirements:
- Dynamically configure which pipelines (steps) to include in a job.
- Add or remove steps without changing hardcoded logic in ADF.
- Ensure scalability and maintainability of the orchestration logic.
- Keep the solution within the scope of ADF, SQL MI, and potentially other Azure-native services (no external apps or third-party orchestrators).
Any design pattern, architecture recommendations, or examples would be greatly appreciated. Thanks!

manuj · Apr 29, 2025 · Copper Contributor · 19 Views · 0 likes · 0 Comments
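Within the ADF + SQL MI scope, one common pattern for checkbox-style step selection is a control table in SQL MI where each row is a step with an enabled flag and an order, read at runtime by a Lookup activity and fed into a ForEach like the one sketched under the previous post. This is only a sketch: the table, column, dataset, and job names are hypothetical, and the source type should match your SQL MI dataset.

{
    "name": "Get enabled steps",
    "type": "Lookup",
    "typeProperties": {
        "source": {
            "type": "SqlMISource",
            "sqlReaderQuery": "SELECT StepOrder, StepName, ChildPipelineName, ParametersJson FROM etl.JobSteps WHERE JobName = 'NightlyLoad' AND IsEnabled = 1 ORDER BY StepOrder"
        },
        "dataset": { "referenceName": "<SQL MI config dataset>", "type": "DatasetReference" },
        "firstRowOnly": false
    }
}

Selecting or deselecting a step then becomes an UPDATE on the IsEnabled flag rather than an ADF deployment, which covers the "no hardcoded logic" and maintainability requirements.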
How to Configure Authentication for Web Activity Triggering ADF Pipelines via Azure REST API
Hello, I am working on integrating Azure Data Factory (ADF) with external systems using Web Activities. I am specifically using a Web Activity to trigger ADF pipelines via the Azure REST API, as described in the official documentation here:
https://learn.microsoft.com/en-us/rest/api/datafactory/pipelines/create-run?view=rest-datafactory-2018-06-01
I can configure the request method and URL in the Web Activity, but I am unsure about the supported and recommended methods for authentication. Could someone please clarify:
- What are the possible ways to configure authentication in Web Activities when calling Azure REST APIs (such as for creating a pipeline run)?
- Is it possible to use Managed Identity (system-assigned or user-assigned) directly within the Web Activity? If not, what are the alternatives (e.g., service principal with token acquisition)?
- Are there any best practices or security considerations when configuring authentication for this use case?
Thanks in advance for your help!

6 Views · 0 likes · 0 Comments
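On the managed identity question: the Web Activity has a built-in authentication block, and using the factory's system-assigned managed identity against the ARM endpoint is usually the simplest path, with that identity granted a role on the target factory that allows createRun (Data Factory Contributor covers it). A minimal sketch, with the subscription, resource group, factory, pipeline, and parameter names as placeholders:

{
    "name": "Create pipeline run",
    "type": "WebActivity",
    "typeProperties": {
        "method": "POST",
        "url": "https://management.azure.com/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.DataFactory/factories/<factory-name>/pipelines/<pipeline-name>/createRun?api-version=2018-06-01",
        "body": {
            "<target pipeline parameter>": "<value>"
        },
        "authentication": {
            "type": "MSI",
            "resource": "https://management.azure.com/"
        }
    }
}

The request body carries the target pipeline's parameters and the response returns a runId, which an Until loop can poll via the pipeline runs endpoint if a wait-on-completion equivalent is needed, since the createRun call itself returns immediately. A service principal (acquiring a token in a preceding Web Activity) is the usual alternative when the caller sits in a different tenant; in that case keep the client secret in Key Vault rather than in the pipeline JSON.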
Azure Devops and Data Factory
I have started a new job and taken over ADF. I know how to use DevOps to integrate and deploy when everything is up and running. The problem is, it's all out of sync. I need to learn ADO/ADF as they work together so I can fix this. Any recommendations on where to start? Everything on YouTube starts with a fresh environment, which I'd be fine with. I'm not new to ADO, but I've never been the setup guy before; I'm strong on using ADO, just not on managing it. Here are some of the problems I have:
- A lot of work has been done directly in the DEV branch rather than creating feature branches.
- Setting up a pull request from DEV to PROD wants to pull everything, even in-progress or abandoned code changes.
- Some changes were made in the PROD branch directly, so I'll need to pull those changes back to DEV. We have valid changes in both DEV and PROD.
- I'm having trouble cherry-picking. It only lets me select one commit, then says I need to use the command line. It doesn't tell me the error, and I don't know what tool to use for the command line.
- I've tried using Visual Studio, and I can pull in the Data Factory code, but I have all the same problems there.
I'm not looking for an answer to the questions, but how to find the answer to these questions. Is this Data Factory, or should I be looking at DevOps? I'm having no trouble managing the database code or Power BI in DevOps, but I created that fresh. Thanks for any help!

Solved · bcarlson_f · Apr 24, 2025 · Copper Contributor · 150 Views · 0 likes · 4 Comments