Column Mapping Improvements in Denali

Published Mar 25 2019 03:16 PM 144 Views
Not applicable
First published on MSDN on Aug 08, 2011

We made a number of changes in Denali around the use of lineage IDs, and auto-mapping of metadata. Collectively, these improvements were internally called “Flexible Order of Authoring” (because the idea was that you could build a package backwards – from destination to source).

Take the following example (which I walked through in my Denali presentation at TechEd if you’d prefer this in video format…):

This package has a Raw File Source reading from file containing customer data. The metadata looks like this:

Notice the FirstName and LastName columns.

Metadata mapping errors vs. component errors

If I delete the path connecting the source to the Copy Column transform, my package is put into an error state. Notice where the error icons appear:

In Denali, we distinguish between errors caused by column mappings, and errors on the component itself. When an error is due to a mapping problem (i.e. missing columns), the error icon will be display on the path, or above the transform if no path is connected. Another change is that the component itself can still be edited - the component actually caches the metadata from it’s last working state. If I double click on it, it’s UI appears, and I am able to make changes.

Smarter Lineage IDs

The Data Flow in Denali still makes use of Lineage IDs (unique integer values assigned to each column in the data flow), but the designer is now a lot smarter about remapping them when a new data source in connected.

In my package, I have a second Raw File Source (called Alternative Source). It is reading a file that is similar to the other source component, but in the column mappings, I’ve renamed two of the columns.

Noticed that FirstName and LastName are now output as GivenName and FamilyName .

In previous versions of SQL Server I would get a number of mapping errors if I connect a new source component to existing downstream components because of the change in Lineage IDs. While the remapping process was improved in 2008, you still needed to fixup the columns coming in from the new source. In Denali we’ve improved the remapping functionality by automatically remapping columns based on their name (and data type), instead of by lineage ID.

After connecting the Alternative Source to the Copy Column transform, we see the error icon is moved to the path.

Double clicking the error brings up the new Resolve References UI.

Resolve References

This new UI was originally developed for the Parallel Data Warehouse destination (where it is common to have hundreds of columns per table). In Denali it’s been integrated into the main product to become our column reference resolving UI. It has a number of features, including:

  • Auto-map based on name/type
  • Filtering columns by name
  • Import column mappings from Excel
  • Export column mappings to Excel

In this case, all columns that had the same names were automatically remapped by the designer. The names of the columns I had renames were not mapped, but I can easily fixed up the mappings myself.

Another thing to note is that when the Resolve References UI is launched, you’re not just fixing a single component. The operation will attempt to resolve the mappings for the entire execution tree . This execution tree will be highlighted in the designer while the Resolve References UI is open.

After completing the UI, everything is reconnected, and the errors go away.

And of course, the great thing about this demo is that I can Undo things back to the way they were when I started, and the package is ready to demo once again!

That’s it for the column mapping improvements – we hope you’ll enjoy these and the other new features and functionality in Denali .

Version history
Last update:
‎Mar 25 2019 03:16 PM
Updated by: