Power Query to CSV column order changes effects references to Power Query

Occasional Contributor

I have what I consider a very bad behavior for references to a Power query when the order of the CSV files the query is based on changes from one refresh to the next, the references to the power query mysteriously change and reference a different column.  How can this be, anyone know a work around?

 

Here is a simple example to demonstrate my issue (that I could not believe would happen).

 

1. Create CSV file with some sample data

 

Column1, Column2, Column3

Find_This, This_2, This_3

Find_That, That_2, That 3

 

2. Load CSV using Power Query and name the power query Are_You_Kidding_me (name not important)

 

3. In a new sheet, make a reference to the power query using Index/match, vlookup, or xlookup

Example:  XLOOKUP("Find_This", Are_You_Kidding_me[Column1], Are_You_Kidding_me[Column2], "Not Found", 0, 1)

 

4.  In my example the formula should return This_2, the column2 value where column1 contains the string Find_This.

 

5. Change the order of your of columns in your CSV file

Column1, Column3, Column2

Find_This, This_3, This_2

Find_That, That 3, That_2

 

6. Refresh your power query

 

7.  Note the order of the columns displayed in the power query change to match the new CSV file.  No big deal, the headers are correct still, right.

 

8.  Go to the other tab and inspect the formula you wrote.  What do you see?  Are you kidding me??

 

WHY DID MY FORMULA CHANGE?????

 

You should see that the formula now mysteriously references Column3 as the return value when all you did was refresh the power query.

 

XLOOKUP("Find_This", Are_You_Kidding_me[Column1], Are_You_Kidding_me[Column3], "Not Found", 0, 1)

 

Somehow behind the scenes it appears Microsoft ties the Power query reference to a specific column number in the power query and is NOT finding the column referenced (as I would expect)

 

I know my CSV file changes, it is out of my control.  I want Power Query to load the CSV, label the data by the headers and find the column I reference by the header name and return the related value.  This process works until the columns change, then all my formulas mysteriously reference who knows what column based on how the CSV file changes.

 

I hope I'm missing something and there is a simple workaround that FIXES this terrible behavior.

9 Replies

Hi @johnlockard 

 

I tried to replicate your problem but it works fine on my side. The XLOOKUP does not change after a refresh.

 

But I have a suspect: Do you use the step "Promote Headers" in your query?

Martin_Weiss_0-1658410194613.png

As you can see, for the second file, the header names from the CSV are not identical with the current "real" header names.

If you do use "Use First Row as Headers", this will change the headers to whatever comes from your CSV  and your formula should work fine:

Martin_Weiss_1-1658410334384.png

 

 

@Martin_Weiss Thanks for the suggestion.  I should have used better column names in my example to not confuse with the default column names.  I do have Promoted Headers in my power query and I see  this same behavior.

 

I also have other Renamed Columns as wells as added columns in my Power Query. I'll have to try to figure out the combination that causes this behavior.

@Martin_Weiss  Below is my full list of Power Query Steps that produce the behavior.  I tried an enhanced test that added renamed columns and I have not added complexity that reproduces the behavior I'm experiencing with the steps below.

johnlockard_0-1658412887535.png

 

@johnlockard 

The question is are

Column1, Column3, Column2

part of csv file and what are your settings here

image.png

@johnlockard 

You have 6 Rename Columns steps, what exactly you are doing here?

Sorry for the confusion on the headers. I should have had the same CSV file specify a better row 1 like the following:

Header1, Header2, Header3
Find_This, This_2, This_3
Find_That, That_2, That 3

The first row in the CSV has the headers and is promoted to be the headers for the Power Query.

My option setting is likely the default: Detect column types and headers for unstructured sources according to each file's settings

The steps that show as renamed columns are actually creating new columns by parsing out data from another column.

Table.RenameColumns(#"DB Type Delimiters",{{"Text Between Delimiters", "Type"}})

@johnlockard 

That's better to have entire sample file. If reproduce steps from your initial post - all woks, no problems. 

My bad for over simplifying what I thought was happening and complaining about something that actually works. I can't post the actual file and I'm working to make my example progressively more complex to determine what step(s) change the behavior to the unwanted behavior.