SOLVED

Creating a Power Query to fix a table


I am trying to build a query step that changes the left example into the right example. The problem is that the client keeps sending the table with new rows that contain the updated results, rather than just updating the existing row.

 

Notice records R779, R635 and R033.

 

Left is actual table, right is how I want it after a query step

 

R654  (blank)  PENDING    PENDING    (blank)    |  R654  (blank)  PENDING    PENDING    (blank)
R779  (blank)  PENDING    PENDING    (blank)    |  R779  No       PENDING    PENDING    (blank)
R779  No       (blank)    (blank)    (blank)    |  R303  (blank)  PENDING    PENDING    (blank)
R303  (blank)  PENDING    PENDING    (blank)    |  K628  (blank)  PENDING    PENDING    (blank)
K628  (blank)  PENDING    PENDING    (blank)    |  R635  Yes      PENDING    PENDING    (blank)
R635  (blank)  PENDING    PENDING    (blank)    |  K640  (blank)  PENDING    PENDING    (blank)
R635  Yes      (blank)    (blank)    (blank)    |  R033  No       CONFIRMED  CONFIRMED  CONFIRMED
K640  (blank)  PENDING    PENDING    (blank)    |  R177  (blank)  PENDING    PENDING    (blank)
R033  (blank)  CONFIRMED  CONFIRMED  (blank)    |
R033  No       (blank)    (blank)    CONFIRMED  |
R177  (blank)  PENDING    PENDING    (blank)    |
7 Replies
best response confirmed by Ocasio27 (Iron Contributor)
Solution

@Ocasio27 

I'm not sure where the source data is: in another file, or in a table/range of the current file. In general that doesn't matter; this is just to build the sample.

image.png

Select columns A to E and name the selection Range, then query it.

Add an index column to the table.

Sort on the first column to group identical IDs together, then sort on Index in descending order so that the latest record for each ID comes first.

Add one more index column to fix the sorted result in memory (or wrap the previous step in Table.Buffer()).

Select the first column and remove duplicates.

Remove both index columns.

Load the result back into the sheet.
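
For reference, the steps above could look roughly like this in M. This is only a sketch, not the exact query from the screenshot: the inline sample rows stand in for the named range, the names Column1 to Column5 assume the default names Power Query gives a range without headers, the step names are made up, and Table.Buffer is used here instead of the second index column.

```powerquery
let
    // Assumption: inline sample rows; in the workbook this step would be
    // Excel.CurrentWorkbook(){[Name = "Range"]}[Content]
    Source = #table(
        {"Column1", "Column2", "Column3", "Column4", "Column5"},
        {
            {"R779", null, "PENDING", "PENDING", null},
            {"R779", "No", null, null, null},
            {"R033", null, "CONFIRMED", "CONFIRMED", null},
            {"R033", "No", null, null, "CONFIRMED"}
        }
    ),
    // Remember the original row order, so "latest" means "highest index"
    AddedIndex = Table.AddIndexColumn(Source, "Index", 0, 1),
    // Group identical IDs together, latest row first within each ID
    Sorted = Table.Sort(
        AddedIndex,
        {{"Column1", Order.Ascending}, {"Index", Order.Descending}}
    ),
    // Fix the sorted result in memory before removing duplicates
    Buffered = Table.Buffer(Sorted),
    // Keep the first row per ID, which is now the latest one
    Deduplicated = Table.Distinct(Buffered, {"Column1"}),
    // The helper column is no longer needed
    Result = Table.RemoveColumns(Deduplicated, {"Index"})
in
    Result
```

With these sample rows, each ID should end up with its latest row only, so R033 comes out as [No, blank, blank, CONFIRMED] rather than as a merge of both rows.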

@Ocasio27 Unless I have misunderstood your requirement, isn't it so that you want to merge the records for each ID, keeping the non-blank and updated fields?

For instance, for R033 the first record has: [blank, CONFIRMED, CONFIRMED, blank]

The second record for R033 has: [No, blank, blank, CONFIRMED]

 

Your desired output suggests that you want the result to be:

[No, CONFIRMED, CONFIRMED, CONFIRMED]

 

@Sergei Baklan 's solution takes the second record as the result, if I'm not mistaken. Inspired by his approach of double indexing, I added a few steps that unpivot the table first and re-pivot it a few steps later, resulting in the desired output as you described, though the sorting order changed a bit.

Screenshot 2020-05-28 at 08.10.49.png
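
Since the query itself is only visible in the screenshot, here is one possible way to write the unpivot and re-pivot idea in M. It is a sketch and not necessarily the exact query used: the inline sample rows, the Column1 to Column5 names and the step names are assumptions, and it assumes that blank cells arrive as null (not as empty text).

```powerquery
let
    // Assumption: inline sample rows; in the workbook this step would be
    // Excel.CurrentWorkbook(){[Name = "Range"]}[Content]
    Source = #table(
        {"Column1", "Column2", "Column3", "Column4", "Column5"},
        {
            {"R033", null, "CONFIRMED", "CONFIRMED", null},
            {"R033", "No", null, null, "CONFIRMED"},
            {"R177", null, "PENDING", "PENDING", null}
        }
    ),
    // Remember the original row order, so "latest" means "highest index"
    AddedIndex = Table.AddIndexColumn(Source, "Index", 0, 1),
    // One row per filled-in cell; unpivoting drops null cells
    Unpivoted = Table.UnpivotOtherColumns(
        AddedIndex, {"Column1", "Index"}, "Attribute", "Value"
    ),
    // For each ID and field, keep the value from the latest row that had one
    LatestPerField = Table.Group(
        Unpivoted,
        {"Column1", "Attribute"},
        {{"Value", each Table.Max(_, "Index")[Value]}}
    ),
    // Back to one row per ID; listing the columns explicitly keeps a
    // column even when every cell in it is blank
    Result = Table.Pivot(
        LatestPerField,
        {"Column2", "Column3", "Column4", "Column5"},
        "Attribute",
        "Value"
    )
in
    Result
```

With these rows, R033 should come out as [No, CONFIRMED, CONFIRMED, CONFIRMED]. Because unpivoting drops null cells, a blank in a later row is treated as "no change" and cannot reset an earlier value.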

 

@Riny_van_Eekelen 

More exactly, not the second record but the latest one to appear in the list. The second index is only there to fix the result in memory through the UI, without adding Table.Buffer manually.

@Sergei Baklan True! Should have written "latest record". But in this example the second record = the latest record, as there were only two occasions of R033. I would have written "the third record" if there had been three.

 

Will have to look into Table.Buffer as I haven't come across that one yet in my discoveries of PQ.

@Riny_van_Eekelen 

There is a possibility of a third, or even a fourth, record (they really want to make things hard for me).

 

Will the code work anyway?

@Ocasio27 Tested it with your example data. It doesn't matter how many records you have per item. However, once a field has received a value other than (blank), the current query does not allow it to be reset to (blank), as it is interpreted as "no change" compared with the previous record. Perhaps possible, but then you'll have to tweak the query. And of course, you'll still have to adapt the query to your real data range, possibly with headers.
