Data Flows
How do I create a flow that adapts to new columns dynamically?
Hello, I have files landing in a blob storage container that I'd like to copy to a SQL database table. The column headers of these files are date markers, so each time a new file is uploaded, a new date appears as a new column. How can I handle this in a pipeline? I think I'll need to accept the schema dynamically and then use an unpivot transformation to normalize the data structure for SQL, but I am unsure how to execute this plan. Thanks!
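Purely to illustrate the unpivot step the post proposes (outside ADF), here is a minimal pandas sketch with hypothetical column names; inside a data flow, the rough equivalent would be allowing schema drift on the source and then applying an Unpivot transformation.

```python
import pandas as pd

# Hypothetical wide layout: one key column plus one column per report date;
# each new file adds another date column on the right.
df = pd.DataFrame({
    "metric": ["sales", "returns"],
    "2024-01-01": [100, 5],
    "2024-01-02": [120, 8],
})

# melt() unpivots every non-key column into (report_date, value) rows, so a
# newly arriving date column is handled without changing the code.
long_df = df.melt(id_vars=["metric"], var_name="report_date", value_name="value")
print(long_df)
```

The long (report_date, value) shape is what loads cleanly into a fixed SQL table schema.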
Clarification on Staging Directory Usage for SAP CDC Connector in Azure Data Factory
Hi! I'm currently working on a project where we are ingesting data from SAP using the SAP CDC connector in Azure Data Factory (data flow). The source is S/4HANA CDS views. We are using a staging directory for the data flow with a checkpoint mechanism, similar to what is described here: https://learn.microsoft.com/en-us/azure/data-factory/connector-sap-change-data-capture
My question is: does the staging directory only act as a temporary storage location during ingestion from SAP? If I understand correctly, it is used for retries, but has no real use once the deltas have been ingested. After the data has been loaded to the destination (in our case, a container inside ADLS), is the staging data still needed for maintaining delta state? Can it be safely deleted from the staging container without impacting subsequent load runs? We were thinking of implementing a 7-day retention policy on the staging container so we can manage storage efficiently. Thank you in advance for any information regarding this.
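If the staging data does prove safe to remove after a successful delta load, a lifecycle/retention policy on the container is the simplest route. As a rough alternative, a scheduled cleanup with the azure-storage-blob SDK could enforce the same 7-day window; the connection string and container name below are placeholders.

```python
from datetime import datetime, timedelta, timezone

from azure.storage.blob import ContainerClient

# Hypothetical storage account connection string and staging container name.
container = ContainerClient.from_connection_string(
    conn_str="<storage-connection-string>",
    container_name="sapcdc-staging",
)

# Keep only the last 7 days of staging files.
cutoff = datetime.now(timezone.utc) - timedelta(days=7)

for blob in container.list_blobs():
    if blob.last_modified < cutoff:
        container.delete_blob(blob.name)
```

This assumes, as the question asks, that the checkpoint/delta state is not kept in the staging container itself; confirm that before enabling any automatic deletion.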
"Fill Down" is an operation common in data prep and data cleansing meant to solve the problem with data sets when you want to replace NULL values with the value from the previous non-NULL value in the sequence. Here is how to implement this in ADF and Synapse data flows.6.3KViews1like2CommentsData flow sink supports user db schema for staging in Azure Synapse and PostgreSQL connectors
Data flow sink supports user db schema for staging in Azure Synapse and PostgreSQL connectors
To achieve the fastest loading speed when moving data into a data warehouse table, load the data into a staging table first. Loading is usually a two-step process: you first load into a staging table and then insert the data into a production data warehouse table. Loading into the staging table takes longer, but the second step of inserting the rows into the production table does not incur data movement across the distributions. The data flow sink transformation supports staging. By default, a temporary table is created under the sink schema for staging. For Azure Synapse Analytics and Azure PostgreSQL, you can alternatively uncheck the Use sink schema option and instead specify a schema name under which Data Factory will create a staging table to load the upstream data and automatically clean it up upon completion. Make sure you have create table permission in the database and alter table permission on the schema. Please follow the links below for more details.
User db schema for staging in Azure Synapse Analytics
User db schema for staging in Azure PostgreSQL
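Purely to illustrate the two-step staging pattern described above (the sink transformation performs these steps for you), here is a sketch using pyodbc; the connection string, schema, and table names are hypothetical.

```python
import pyodbc

# Hypothetical ODBC connection string to the warehouse.
conn = pyodbc.connect("<odbc-connection-string>", autocommit=True)
cur = conn.cursor()

# Step 1: land the incoming rows in a staging table under a user schema.
cur.execute("CREATE TABLE staging.sales_load (id INT, amount DECIMAL(18, 2));")
cur.executemany(
    "INSERT INTO staging.sales_load (id, amount) VALUES (?, ?);",
    [(1, 10.0), (2, 20.0)],
)

# Step 2: insert from staging into the production table.
cur.execute("INSERT INTO dbo.sales SELECT id, amount FROM staging.sales_load;")

# Clean up the staging table once the load completes, as the sink does.
cur.execute("DROP TABLE staging.sales_load;")
```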
Empty File is getting created in ADF
I have an ADF pipeline which has data flows. The data flows read an Excel file and push the records to a SQL DB. The incorrect records are pushed to a Blob Storage sink as a CSV file. When all the records are correct, an empty .csv file is still created and pushed to Blob. How can I avoid the creation of this empty file?
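ADF sink settings aside, the general pattern is to check for rejected rows before writing the error file at all; a minimal pandas sketch with hypothetical names:

```python
import pandas as pd

# Hypothetical: error_rows holds the records that failed validation.
error_rows = pd.DataFrame(columns=["row_id", "error_message"])

# Only create and upload the CSV when at least one bad record exists,
# so no zero-row file lands in blob storage.
if not error_rows.empty:
    error_rows.to_csv("rejected_records.csv", index=False)
```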
Dataflow
Hi, I urgently need help: how can I read 120 GB (3.4 billion rows from a table) at lightning speed from an Azure SQL Server database into Azure Data Lake? I tried two options: a Copy activity with parallelism and the highest DIU, which gives a timeout error after running for hours, and a data flow, which takes 11 hours to read the data. Please suggest.
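One common approach is to partition the read on a numeric key so many ranges are pulled in parallel (both the copy activity and data flow sources expose partition options for this). The sketch below only illustrates the range-partitioning idea outside ADF; the connection string, table, key column, and ranges are hypothetical.

```python
import os

import pandas as pd
import sqlalchemy

# Hypothetical SQLAlchemy URL wrapping an ODBC connection string.
engine = sqlalchemy.create_engine(
    "mssql+pyodbc:///?odbc_connect=<odbc-connection-string>"
)

os.makedirs("export", exist_ok=True)

# Read one ID range at a time and write each slice as Parquet; the resulting
# files can then be landed in the data lake.
partition_size = 50_000_000
for start in range(0, 3_400_000_000, partition_size):
    query = (
        "SELECT * FROM dbo.big_table "
        f"WHERE id >= {start} AND id < {start + partition_size}"
    )
    chunk = pd.read_sql(query, engine)
    chunk.to_parquet(f"export/part_{start}.parquet", index=False)
```

In practice the ranges would run concurrently (and ideally align with an index on the key column) rather than in a single sequential loop.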