Azure ETL
ForEach Activity: Immediate Pipeline Failure on Any Child Activity Failure
Hello Azure community, I'm working with Azure Data Factory and have a pipeline set up with a ForEach activity that runs three activities in parallel:
1. Notebook A
2. Execute Pipeline B
3. Notebook C
My requirement is to ensure that if any one of these activities fails (e.g. child activity A, B or C fails two minutes after the pipeline starts), the entire pipeline fails immediately, regardless of the status of the other activities that are still running. Could you please guide me on how to achieve this behaviour? Thank you for your assistance!
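
A parallel ForEach does not cancel its other iterations when one child fails, and the pipeline only reports failure after every branch has finished. One workaround is to follow each child with a Web activity on the Failed dependency condition that cancels the whole run through the Data Factory REST API. The fragment below is a minimal sketch of that cancel step for Notebook A only; the subscription, resource group and factory names are placeholders, it assumes the child activity is literally named "Notebook A", and it assumes the factory's managed identity has been granted permission to cancel runs on the factory itself.

```json
{
  "name": "Cancel run if Notebook A fails",
  "description": "Runs only when Notebook A fails and cancels the current pipeline run (including child runs) via the ADF REST API",
  "type": "WebActivity",
  "dependsOn": [
    { "activity": "Notebook A", "dependencyConditions": [ "Failed" ] }
  ],
  "typeProperties": {
    "method": "POST",
    "url": "https://management.azure.com/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.DataFactory/factories/<factoryName>/pipelineruns/@{pipeline().RunId}/cancel?isRecursive=true&api-version=2018-06-01",
    "body": { "reason": "A child activity failed" },
    "authentication": { "type": "MSI", "resource": "https://management.azure.com/" }
  }
}
```

Repeating this step for Execute Pipeline B and Notebook C (changing only the dependsOn entry) gives the fail-fast behaviour; if waiting for the remaining branches to finish is acceptable, the simpler option is to rely on the ForEach's default behaviour and let the pipeline fail once all iterations complete.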

'Cannot connect to SQL Database' error - please help

Hi, Our organisation is new to Azure Data Factory (ADF) and we're facing an intermittent error with our first Pipeline. Being intermittent adds that little bit more complexity to resolving the error. The Pipeline has two activities:
1) A Script activity which deletes the contents of the target Azure SQL Server database table located within our Azure cloud instance.
2) A Copy data activity which simply copies the entire contents of the external (outside of our domain) third-party source SQL View and loads it into our target Azure SQL Server database table.
With the source being external to our domain, we have used a Self-Hosted Integration Runtime. The Pipeline executes once per 24 hours, at 3am each morning. I have been informed that this timing shouldn't affect, or be affected by, any other Azure processes we have. For the first nine days the Pipeline completed its executions successfully. Then for the next nine days it only completed successfully four times. Now it seems to fail every other time. The same error message is received on each failure (I've replaced our sensitive internal names with Xs):
Operation on target scr__Delete stg__XXXXXXXXXX contents failed: Failed to execute script. Exception: ''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot connect to SQL Database. Please contact SQL server team for further support. Server: 'XX-azure-sql-server.database.windows.net', Database: 'XX_XXXXXXXXXX_XXXXXXXXXX', User: ''. Check the linked service configuration is correct, and make sure the SQL Database firewall allows the integration runtime to access.,Source=Microsoft.DataTransfer.Connectors.MSSQL,''Type=Microsoft.Data.SqlClient.SqlException,Message=Server provided routing information, but timeout already expired.,Source=Framework Microsoft SqlClient Data Provider,''
To me, if this Pipeline were incorrectly configured it would never have completed successfully, not even once. The fact that the failure is intermittent, but becoming more frequent, suggests it is being caused by something other than the configuration, but I could be wrong - hence requesting help from you. Please can someone advise on what is causing the error and what I can do to verify/resolve it? Thanks.
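
The inner exception ("Server provided routing information, but timeout already expired") usually points to a transient connection/redirect timeout rather than a broken configuration, which would fit the intermittent pattern described above. A common mitigation is to add a retry policy to the failing activity and to raise the Connect Timeout in the linked service connection string. The snippet below is a hedged sketch of the Script activity shell with such a policy; the linked service and table names are placeholders, and the exact typeProperties layout should be checked against the real pipeline.

```json
{
  "name": "scr__Delete stg__XXXXXXXXXX contents",
  "description": "Same delete step, but with retries so a transient connect/redirect timeout does not fail the nightly run",
  "type": "Script",
  "policy": {
    "timeout": "0.01:00:00",
    "retry": 3,
    "retryIntervalInSeconds": 120,
    "secureInput": false,
    "secureOutput": false
  },
  "linkedServiceName": { "referenceName": "AzureSqlTargetLS", "type": "LinkedServiceReference" },
  "typeProperties": {
    "scripts": [
      { "type": "NonQuery", "text": "TRUNCATE TABLE dbo.stg_TargetTable;" }
    ]
  }
}
```

It is also worth checking whether the 3am window coincides with self-hosted integration runtime patching/restarts or with an Azure SQL serverless auto-pause, either of which can produce exactly this kind of intermittent connect failure.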

Can an ADF Pipeline trigger upon source table update?

Hi, Is it possible for an Azure Data Factory Pipeline to be triggered each time the source table changes? Let's say I have a 'copy data' activity in a pipeline. The activity copies data from TableA to TableB. Can the pipeline be configured to execute whenever source TableA is updated (a record deleted, changed, a new record inserted, etc.)? Thanks.
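
Data Factory has no built-in trigger that fires on a SQL row change; its native triggers are schedule, tumbling window, and storage/custom event triggers. The closest low-effort substitute is a frequent tumbling window trigger combined with a watermark or last-modified column, so each run only copies what changed in its window. The trigger definition below is a rough sketch under that assumption; the pipeline name, parameters and interval are placeholders.

```json
{
  "name": "trg_PollTableAChanges",
  "properties": {
    "description": "Runs the copy pipeline every 15 minutes and passes the window boundaries for watermark filtering",
    "type": "TumblingWindowTrigger",
    "typeProperties": {
      "frequency": "Minute",
      "interval": 15,
      "startTime": "2024-01-01T00:00:00Z",
      "delay": "00:00:00",
      "maxConcurrency": 1,
      "retryPolicy": { "count": 2, "intervalInSeconds": 30 }
    },
    "pipeline": {
      "pipelineReference": { "referenceName": "pl_copy_TableA_to_TableB", "type": "PipelineReference" },
      "parameters": {
        "WindowStart": "@trigger().outputs.windowStartTime",
        "WindowEnd": "@trigger().outputs.windowEndTime"
      }
    }
  }
}
```

The copy activity's source query would then filter TableA on a modified-date (or SQL change tracking) column between WindowStart and WindowEnd; true event-driven triggering from the database would need something extra on the SQL side, such as CDC feeding an event the pipeline can react to.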

Workflow Orchestration Manager dependencies between multiple Data Factory accounts

Hello All, I am looking into a use case where:
1. I have a Workflow Orchestration Manager in a Data Factory in subscription A.
2. There is another Workflow Orchestration Manager in a Data Factory in subscription B.
Both data factories are in the same region, East US. I created a pipeline (Airflow) in subscription A, but it should also take a dependency on a pipeline created in B. In other words, pipeline A is dependent on B. Is this possible with Workflow Orchestration Manager in ADF, or is this use case supported? Also, if I have to accomplish the above task with data factories in different regions (one in the US and the other in East Asia), is that also possible? Thank you in advance
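
The Execute Pipeline activity cannot reference a pipeline in a different data factory, so cross-factory (and cross-subscription or cross-region) dependencies are usually wired up through the Data Factory REST API: the caller starts the remote pipeline run and, if needed, polls its status before continuing. The fragment below sketches that call as a Web activity in the subscription-A factory; the subscription, resource group, factory and pipeline names are placeholders, and it assumes factory A's managed identity has been granted a suitable role (for example Data Factory Contributor) on factory B. Within Workflow Orchestration Manager the same REST call can be made from an Airflow task instead.

```json
{
  "name": "Start dependent pipeline in factory B",
  "description": "Calls the ADF REST API to start a pipeline run in the subscription-B factory",
  "type": "WebActivity",
  "typeProperties": {
    "method": "POST",
    "url": "https://management.azure.com/subscriptions/<subscriptionB>/resourceGroups/<resourceGroupB>/providers/Microsoft.DataFactory/factories/<factoryB>/pipelines/<PipelineB>/createRun?api-version=2018-06-01",
    "body": {},
    "authentication": { "type": "MSI", "resource": "https://management.azure.com/" }
  }
}
```

Because this is a management-plane call, it works the same way across regions (East US calling East Asia), subject only to the calling identity having access to the target factory.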

Some questions on ADF and Azure SQL Server

Hi, My company is looking to implement a data integration method. The project has been assigned to me but I'm not a data engineer, so I would like your guidance on the recommendation. I need to ingest several (only twelve at present) third-party data sources into our domain so the data can be reported on. These external data sources are simple RDBMSs (most likely all MS SQL Server) and the volume of data, because the third party is creating a View for me, is only going to be around 20 columns and 20,000 rows per data source. It's all structured data. My intention is to use Azure Data Factory (ADF) as the integration tool. The reason for this is that we are entirely MS cloud-based and I see ADF as the most suitable (simple, robust, cheap) MS cloud-based integration tool available - although you may inform me otherwise. I need to decide on the storage to hold the external data. I've had very brief experience with Synapse Serverless Pool, as it was the recommended substitute for the Data Export Service (DES) (we use Dynamics 365 as our transactional system), and I found it limiting in its SQL command compatibility. Many of the SQL Views I had written on DES weren't compatible in Synapse - I guess due to Synapse being written in Spark. For this reason, I am reluctant to use Synapse as the data storage. For the same reason I am reluctant to use the ADF Storage Account, as I believe it too is written in Spark. Please can you advise on the questions below:
1) Is the ADF Storage Account written in Spark and thus prone to the same incompatibility as Synapse Serverless Pool?
2) What are the benefits of using the ADF Storage Account over Azure SQL Server, and vice versa?
3) I know this question is configuration-specific, but I'll ask anyway: which is cheaper for our basic use case - the ADF Storage Account or Azure SQL Server? I have trouble understanding the online pricing calculators.
4) I understand that to execute activities/pipelines between Azure storage sources (ADF Storage Account, Azure SQL Server, and other Azure products) an Azure integration runtime is needed. I also understand that to extract data from an on-premises SQL Server database a Self-Hosted Integration Runtime is required - is this correct, and where does this Self-Hosted Integration Runtime need to be installed (on the box that is running the on-premises SQL Server)?
I think that's all my questions for now. Thanks for your help.
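
On question 4 specifically: the self-hosted integration runtime does not have to sit on the SQL Server box itself; it is typically installed on any machine (or small VM) inside the on-premises network that can reach the SQL Server, and the linked service then points at it through connectVia. A rough sketch, with all names and the connection string as placeholders, and the password assumed to live in Key Vault:

```json
{
  "name": "OnPremSqlServerLS",
  "properties": {
    "description": "On-premises SQL Server source reached through the self-hosted integration runtime",
    "type": "SqlServer",
    "connectVia": { "referenceName": "SelfHostedIR", "type": "IntegrationRuntimeReference" },
    "typeProperties": {
      "connectionString": "Data Source=ONPREM-SQL01;Initial Catalog=SourceDb;User ID=adf_reader;",
      "password": {
        "type": "AzureKeyVaultSecret",
        "store": { "referenceName": "KeyVaultLS", "type": "LinkedServiceReference" },
        "secretName": "onprem-sql-password"
      }
    }
  }
}
```

The Azure integration runtime is only needed for sources and sinks that are reachable from Azure, such as an Azure SQL Database target.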

Run batch job on remote VM from Azure Data Factory

Hello, I am new to Azure and work in an environment that is a mix of Azure and on-prem. I do not have access to alter firewall rules. I do have managed virtual networks set up for at least one Azure item. I have an on-prem REST application. I have an Azure VM that is able to access the REST application on its vnet. I have an Azure ADF instance on a vnet that can contact the VM but not the REST application. I want to pipe ADF through the VM to obtain the data. I don't even know if this is a thing. Can anyone with experience of this outline what I should be trying to do? Thank you!
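
"Piping ADF through the VM" is a recognised pattern: install a self-hosted integration runtime on that VM and reference it from a REST linked service via connectVia, so Data Factory reaches the on-prem REST application over the VM's network path. The runtime only makes outbound connections to the Data Factory service, which often avoids new inbound firewall rules. The sketch below uses placeholder names and a placeholder URL, and assumes anonymous authentication purely for illustration:

```json
{
  "name": "OnPremRestLS",
  "properties": {
    "description": "On-prem REST application reached through a self-hosted integration runtime installed on the jump VM",
    "type": "RestService",
    "connectVia": { "referenceName": "ShirOnJumpVm", "type": "IntegrationRuntimeReference" },
    "typeProperties": {
      "url": "https://rest-app.internal.example.com/api/",
      "enableServerCertificateValidation": true,
      "authenticationType": "Anonymous"
    }
  }
}
```

A Copy activity using a REST dataset on this linked service can then pull the data and land it wherever the pipeline needs it.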

How to handle Azure Data Factory Lookup activity with more than 5000 records

Hello Experts, The Data Flow activity successfully copies data from an Azure Blob Storage .csv file to Dataverse Table Storage. However, an error occurs when performing a Lookup on the Dataverse due to excessive data. This is in line with the documentation, which states that the Lookup activity has a limit of 5,000 rows and a maximum size of 4 MB. There is also a workaround mentioned in the Microsoft documentation: design a two-level pipeline where the outer pipeline iterates over an inner pipeline, which retrieves data that doesn't exceed the maximum rows or size. How can I do this? Is there a way to define an offset (e.g. only read 1,000 rows)? Thanks, -Sri
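
The two-level workaround usually looks like this: the outer pipeline works out how many pages of data there are, iterates over the page numbers with a ForEach, and passes an offset to an inner pipeline whose Lookup reads at most one page. The fragment below is a hedged sketch of the outer ForEach with 1,000-row pages; the inner pipeline name, the PageCount parameter and the page size are placeholders, and the paging itself still has to be implemented in the inner Lookup's query (for example FetchXML paging for Dataverse, or OFFSET/FETCH for SQL sources).

```json
{
  "name": "ForEachPage",
  "description": "Outer loop: one iteration per 1,000-row page, each handled by the inner lookup pipeline",
  "type": "ForEach",
  "typeProperties": {
    "items": { "value": "@range(0, pipeline().parameters.PageCount)", "type": "Expression" },
    "isSequential": false,
    "batchCount": 10,
    "activities": [
      {
        "name": "Run inner lookup pipeline for one page",
        "type": "ExecutePipeline",
        "typeProperties": {
          "pipeline": { "referenceName": "pl_inner_lookup_page", "type": "PipelineReference" },
          "waitOnCompletion": true,
          "parameters": {
            "Offset": "@mul(item(), 1000)",
            "PageSize": "1000"
          }
        }
      }
    ]
  }
}
```

If the total row count isn't known up front, an Until loop that keeps incrementing the offset until a page comes back with fewer than 1,000 rows achieves the same thing without a PageCount parameter.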

How to load data from on-prem to Snowflake using ADF in a better way

Hi, My use case is as follows: our data source is an on-prem SQL Server, and it serves as our production database. Currently, we are building reports in Power BI and utilizing Snowflake as our data warehouse. I aim to extract 10 to 15 tables into Snowflake for my Power BI reporting, specifically wanting to construct an SCD Type 1 pipeline, without the need to retain historical data. To facilitate the data transfer from on-prem to Snowflake, we are leveraging Azure Data Factory, with Blob Storage set up in Azure. We already have a self-hosted integration runtime in place that connects to Data Factory. Currently, I've employed the ForEach loop activity to copy 15 tables from on-prem to Snowflake. The pipeline is scheduled to run daily, and each time it executes, it truncates all 15 tables before loading the data. However, this process is time-consuming due to the volume of data, especially since our on-prem SQL Server is a legacy database. My questions are as follows:
1. Is there a more efficient approach within Data Factory for this task?
2. What are the best practices recommended for this type of case study?
3. Data flows are not functioning with the self-hosted runtime; how can I activate data flows for an on-prem database?
I would greatly appreciate any advice on the correct solution for this or pointers to relevant documentation or blogs.
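
The usual way to speed this up is to stop truncating and reloading everything and instead copy only rows changed since the last run (using a watermark or change-tracking column on the source), land them in a Snowflake staging table, and MERGE them into the target for SCD Type 1. The Copy activity fragment below is a sketch of the incremental extract under those assumptions; the dataset, linked service, column and parameter names are all placeholders.

```json
{
  "name": "Copy changed rows for one table",
  "description": "Incremental extract: only rows modified since the last watermark are copied, staged via Blob Storage into a Snowflake staging table",
  "type": "Copy",
  "inputs": [ { "referenceName": "ds_onprem_sql_table", "type": "DatasetReference" } ],
  "outputs": [ { "referenceName": "ds_snowflake_staging_table", "type": "DatasetReference" } ],
  "typeProperties": {
    "source": {
      "type": "SqlServerSource",
      "sqlReaderQuery": "SELECT * FROM @{item().SchemaName}.@{item().TableName} WHERE LastModifiedDate > '@{pipeline().parameters.LastWatermark}'"
    },
    "sink": {
      "type": "SnowflakeSink",
      "importSettings": { "type": "SnowflakeImportCopyCommand" }
    },
    "enableStaging": true,
    "stagingSettings": {
      "linkedServiceName": { "referenceName": "AzureBlobStagingLS", "type": "LinkedServiceReference" },
      "path": "adf-staging"
    }
  }
}
```

A Script activity against Snowflake (or a Snowflake task) can then MERGE the staging table into the target table, which gives SCD Type 1 without history. On question 3: mapping data flows run on Spark clusters behind the Azure integration runtime and cannot execute on a self-hosted runtime, so the usual pattern is exactly this one - copy the on-prem data into the cloud first and run any data flow against the staged copy.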