Jul 21 2022 05:36 AM
I have what I consider a very bad behavior for references to a Power query when the order of the CSV files the query is based on changes from one refresh to the next, the references to the power query mysteriously change and reference a different column. How can this be, anyone know a work around?
Here is a simple example to demonstrate my issue (that I could not believe would happen).
1. Create CSV file with some sample data
Column1, Column2, Column3
Find_This, This_2, This_3
Find_That, That_2, That 3
2. Load CSV using Power Query and name the power query Are_You_Kidding_me (name not important)
3. In a new sheet, make a reference to the power query using Index/match, vlookup, or xlookup
Example: XLOOKUP("Find_This", Are_You_Kidding_me[Column1], Are_You_Kidding_me[Column2], "Not Found", 0, 1)
4. In my example the formula should return This_2, the column2 value where column1 contains the string Find_This.
5. Change the order of your of columns in your CSV file
Column1, Column3, Column2
Find_This, This_3, This_2
Find_That, That 3, That_2
6. Refresh your power query
7. Note the order of the columns displayed in the power query change to match the new CSV file. No big deal, the headers are correct still, right.
8. Go to the other tab and inspect the formula you wrote. What do you see? Are you kidding me??
WHY DID MY FORMULA CHANGE?????
You should see that the formula now mysteriously references Column3 as the return value when all you did was refresh the power query.
XLOOKUP("Find_This", Are_You_Kidding_me[Column1], Are_You_Kidding_me[Column3], "Not Found", 0, 1)
Somehow behind the scenes it appears Microsoft ties the Power query reference to a specific column number in the power query and is NOT finding the column referenced (as I would expect)
I know my CSV file changes, it is out of my control. I want Power Query to load the CSV, label the data by the headers and find the column I reference by the header name and return the related value. This process works until the columns change, then all my formulas mysteriously reference who knows what column based on how the CSV file changes.
I hope I'm missing something and there is a simple workaround that FIXES this terrible behavior.
Jul 21 2022 06:32 AM
Hi @johnlockard
I tried to replicate your problem but it works fine on my side. The XLOOKUP does not change after a refresh.
But I have a suspect: Do you use the step "Promote Headers" in your query?
As you can see, for the second file, the header names from the CSV are not identical with the current "real" header names.
If you do use "Use First Row as Headers", this will change the headers to whatever comes from your CSV and your formula should work fine:
Jul 21 2022 06:48 AM
@Martin_Weiss Thanks for the suggestion. I should have used better column names in my example to not confuse with the default column names. I do have Promoted Headers in my power query and I see this same behavior.
I also have other Renamed Columns as wells as added columns in my Power Query. I'll have to try to figure out the combination that causes this behavior.
Jul 21 2022 07:17 AM
@Martin_Weiss Below is my full list of Power Query Steps that produce the behavior. I tried an enhanced test that added renamed columns and I have not added complexity that reproduces the behavior I'm experiencing with the steps below.
Jul 21 2022 07:46 AM
The question is are
Column1, Column3, Column2
part of csv file and what are your settings here
Jul 21 2022 07:56 AM
You have 6 Rename Columns steps, what exactly you are doing here?
Jul 21 2022 08:11 AM
Jul 21 2022 08:15 AM
Jul 21 2022 08:19 AM
That's better to have entire sample file. If reproduce steps from your initial post - all woks, no problems.
Jul 21 2022 08:26 AM