Avoid column-skew if data-file is missing few columns with Copy Data or DataFlow Activity

I tried using Copy Data Activity to load the data from a list of files coming from separate sources to a target table. In my case, the columns present in the target table is a union (superset) of the columns from the customer-sites. E.g. If Source1 has Column_A, Column_B, Column_D whereas Source2 has Column_A, Column_C, Column_D the target schema would have Column_A, Column_B, Column_C, Column_D.

Not surprisingly, I encountered a pipeline failure due to the column-skew when data-files from both sources are processed together.

I tried to use a DataFlow, thinking that mapping drifted fields (byName()) will be able to handle the difference in schema, but still I see field-values getting mapped incorrectly due to left-shifting of the column-values where the columns are absent in the data-file. E.g. In above example, where Source1 has Column_A, Column_B, Column_D as the header-record and Source2 has Column_A, Column_C, Column_D in the header, target table will get the values:

Column_A <- Column_A

Column_B <- Column_C (for records from Source2)

Column_C <- Column_D (for records from both Source1 and Source2)

Is this how DataFlow is supposed to work by design? OR, am I missing something when configuring the DataFlow? How do I get the field-values to map to the appropriate columns? Please suggest.

azure data factory

Copy Activity

Mapping Data Flows

Forum Discussion

Avoid column-skew if data-file is missing few columns with Copy Data or DataFlow Activity