Forum Discussion

Sanchet_Dighe's avatar
Sanchet_Dighe
Copper Contributor
Sep 02, 2022

Avoid column-skew if data-file is missing few columns with Copy Data or DataFlow Activity

I tried using Copy Data Activity to load the data from a list of files coming from separate sources to a target table. In my case, the columns present in the target table is a union (superset) of the columns from the customer-sites. E.g. If Source1 has Column_A, Column_B, Column_D whereas Source2 has Column_A, Column_C, Column_D the target schema would have Column_A, Column_B, Column_C, Column_D.
 
Not surprisingly, I encountered a pipeline failure due to the column-skew when data-files from both sources are processed together.
 
I tried to use a DataFlow, thinking that mapping drifted fields (byName()) will be able to handle the difference in schema, but still I see field-values getting mapped incorrectly due to left-shifting of the column-values where the columns are absent in the data-file. E.g. In above example, where Source1 has Column_A, Column_B, Column_D as the header-record and Source2 has Column_A, Column_C, Column_D in the header, target table will get the values:
 
Column_A <- Column_A
Column_B <- Column_C (for records from Source2)
Column_C <- Column_D (for records from both Source1 and Source2)
 
Is this how DataFlow is supposed to work by design? OR, am I missing something when configuring the DataFlow? How do I get the field-values to map to the appropriate columns? Please suggest.
No RepliesBe the first to reply

Resources