SOLVED

Losing data from PowerQuery Merge

Copper Contributor

Hello,

 

I've managed to combine the data from two of my sheets. My second sheet get updated on a daily basis and what it does is changing some statuses in my Query sheet. However, when a row is deleted from the second sheet, the last updated information is also gone. How may I prevent this from happening? I need it to stay with the latest information (or if needed when I manually add some information, which is no longer available in the second sheet it should not be overwrited and placed as blank cell).

 

Thank you,

Alex

 

14 Replies

@SaintEnt PQ can indeed combine data that exists in multiple tables. When you delete a row from one table, then it will no longer be included in the combined data. What else do you expect? Perhaps I misunderstood. If so, can you upload a sample of your data and the query you created?

@Riny_van_Eekelen 

I believe you got the idea. It's to update status of an order (so I can't actually provide the sheets).

So is there any way to be able to use power query to update the sheet, without losing data if the sheet I'm getting the newest information does not have the old info? Tried with different kind of merging, however, all unsuccessful.  

best response confirmed by SaintEnt (Copper Contributor)
Solution

@SaintEnt Okay! So you get an updated file every day and it happens that some orders no longer appear. Perhaps you need to apply a few extra steps to create a "new first" table. Tried to demonstrate it in the attached sheet. Perhaps a real PQ expert has a better idea. 

 

@Riny_van_Eekelen 
Looks like it works with Full outer join, perhaps the info in my data set has some difference in the data itself. As I tried it earlier, it got me some empty rows, will test it and will see if it works tomorrow morning.

Thank you!

@Riny_van_Eekelen 

I got the idea, managed to duplicate it on your sheet, however, the issue with my work on file is that

my values in the beginning are nulls, after I merge my query with the newest information they receive some value, which is dependent on the updaters file(let's call it like this) and when the updaters file removes the old information - it all gets back to nulls. Afterwards, when the conditional columns start working, the whole query refreshes and it becomes nulls again.

 

Any ideas?

@SaintEnt Difficult to visualise the issue here. Instinctively, I would say that you delete the null rows before merging. Can't you send a mock-up of the two data sets that need to be merged, indicating how the end result should look like. just few rows and columns. It doesn't have to contain any real data.

 

@Riny_van_Eekelen 

Hello again,

 

I've made up some information:

First, you will see the original data, with only OrderID. On table Date1Oct (marked with the light blue as well), is the information originally entered on the said date. After that I receive the new information, which is Date2Oct table (but actually just replacing the information - as my original files have the same rows/columns), and place Date2Oct on the Date1Oct, you will see how the nulls go back to the table.

And will get some examples because my explanation might be confusing ---> 

Info from Oct1

First.png

Info from Oct2

Second.png

The rows became nulls again, as there is no hardcopy of the data and I'm trying to figure a way around that.

@SaintEnt See attached. Unless I totally misunderstand, you must use the output of the Merged table as the input for the next day. I believe you'll understand what I mean if you look at the file.

 

So, the OrderTable = Day 1, and the DayTrans table = Day 2

The Merged table = everything from Day 2 plus all from Day 1 if Day 2 contained null.

 

@Riny_van_Eekelen 

Your OrderTable becomes the hardcopy from the Merge as I see, correct?

I'm trying to accomplish the same, but now I'm having some difficulties with the Macros, as my OrderTable will increase on a daily basis as well, not just the Daily information.

What I'm trying to do with Macro is to have an another sheet, which gets cleared, gets the information from the values from the latest Merge and then acts like the info gathering sheet.

 

 

@SaintEnt 

As variant

Very first time you generate the query which merges empty orders with current date.

With that within that query (not as separate one) you query returned table which gives previous results, merge it with current data in date table, if date table has no data for some fields keep previous data, otherwise keep previous data.

Kind of loop iterating query on itself. With that you keep only one date table updating information in it from time to time and refreshing result.

This part script is

let
    Source = Table.NestedJoin(
        OrderTable,
        {"OrderID"}, Date1Oct,
        {"OrderID"}, "Date1Oct",
        JoinKind.FullOuter
    ),
    #"We dont need initial data" = Table.SelectColumns(
        Source,
        {"OrderID", "Date1Oct"}
    ),
    #"Take data from Date" = Table.ExpandTableColumn(
        #"We dont need initial data",
        "Date1Oct",
        {"Status", "ETA", "PID"},
        {"Status", "ETA", "PID"}
    ),

    #"Get Previous Data" = Excel.CurrentWorkbook(){[Name="FillFromDate"]}[Content],
    #"Merge it with current date" = Table.NestedJoin(
        #"Get Previous Data",
        {"OrderID"}, Date1Oct, {"OrderID"},
        "Previous", JoinKind.LeftOuter
    ),
    #"Expanded Previous data" = Table.ExpandTableColumn(
        #"Merge it with current date",
        "Previous",
        {"Status", "ETA", "PID"},
        {"Previous.Status", "Previous.ETA", "Previous.PID"}
    ),
    #"Combine Status" = Table.AddColumn(
        #"Expanded Previous data",
        "StatusNew",
        each
            if [Status] = null
            then [Previous.Status]
            else [Status]
    ),
    #"Combine ETA" = Table.AddColumn(
        #"Combine Status",
        "ETAnew",
        each
            if [ETA] = null
            then [Previous.ETA]
            else [ETA]
    ),
    #"Combine PID" = Table.AddColumn(
        #"Combine ETA",
        "PIDnew",
        each
            if [PID] = null
            then [Previous.PID]
            else [PID]
    ),
    #"Removed Other Columns" = Table.SelectColumns(
        #"Combine PID",
        {"OrderID", "StatusNew", "ETAnew", "PIDnew"}
    ),
    #"Renamed Columns back" = Table.RenameColumns(
        #"Removed Other Columns",
        {
            {"StatusNew", "Status"},
            {"ETAnew", "ETA"},
            {"PIDnew", "PID"}
        }
    ),
    #"Better to have proper types" = Table.TransformColumnTypes(
        #"Renamed Columns back",
        {
            {"OrderID", Int64.Type},
            {"Status", type text},
            {"ETA", type date},
            {"PID", type text}
        }
    )
in
    #"Better to have proper types"

@Sergei Baklan 

Hello Sergei,

 

Thank you for the long codding that you've done, however, the query should be able to overwrite the new updates (example as you can see on Date 10/2, OrderId5 has a new Status - Finished.

As we are refreshing the query, the null values no longer appear, however, they do not update.

 

Is there anything to be done about this, as I'm quite new to the coding in excel (just from youtube videos).

 

Kind regards,

Alex

@SaintEnt 

Alex, I see. Change the logic, please see in attached. Script is

let
    Source = Table.NestedJoin(
        OrderTable,
        {"OrderID"}, Date1Oct,
        {"OrderID"}, "Date1Oct",
        JoinKind.FullOuter
    ),
    #"We dont need initial data" = Table.SelectColumns(
        Source,
        {"OrderID", "Date1Oct"}
    ),
    #"Take data from Date" = Table.ExpandTableColumn(
        #"We dont need initial data",
        "Date1Oct",
        {"Status", "ETA", "PID"},
        {"Status", "ETA", "PID"}
    ),

    #"Get Previous Data" = Excel.CurrentWorkbook()
        {[Name="FillFromDate"]}[Content],
    #"Merge it with current date" = Table.NestedJoin(
        #"Get Previous Data",
        {"OrderID"}, Date1Oct,
        {"OrderID"}, "New",
        JoinKind.LeftOuter
    ),
    #"Expanded New data" = Table.ExpandTableColumn(
        #"Merge it with current date",
        "New",
        {"Status", "ETA", "PID"},
        {"New.Status", "New.ETA", "New.PID"}
    ),
    #"Combine Status" = Table.AddColumn(
        #"Expanded New data",
        "StatusNew",
        each
            if [New.Status] <> null
            then [New.Status]
            else [Status]
    ),
    #"Combine ETA" = Table.AddColumn(
        #"Combine Status",
        "ETAnew",
        each
            if [New.ETA] <> null
            then [New.ETA]
            else [ETA]
    ),
    #"Combine PID" = Table.AddColumn(
        #"Combine ETA",
        "PIDnew",
        each
            if [New.PID] <> null
            then [New.PID]
            else [PID]
    ),
    #"Removed Other Columns" = Table.SelectColumns(
        #"Combine PID",
        {"OrderID", "StatusNew", "ETAnew", "PIDnew"}
    ),
    #"Renamed Columns back" = Table.RenameColumns(
        #"Removed Other Columns",
        {
            {"StatusNew", "Status"},
            {"ETAnew", "ETA"},
            {"PIDnew", "PID"}
        }
    ),
    #"Better to have proper types" = Table.TransformColumnTypes(
        #"Renamed Columns back",
        {
            {"OrderID", Int64.Type},
            {"Status", type text},
            {"ETA", type date},
            {"PID", type text}
        }
    )
in
    #"Better to have proper types"

@Sergei Baklan Thank you,

I did managed to do it the old school way, but still I do appreciate your efforts. 

@SaintEnt 

Sure, use more familiar way, with that most probably you'll save time on maintenance.

 

Thank you for the update.

1 best response

Accepted Solutions
best response confirmed by SaintEnt (Copper Contributor)
Solution

@SaintEnt Okay! So you get an updated file every day and it happens that some orders no longer appear. Perhaps you need to apply a few extra steps to create a "new first" table. Tried to demonstrate it in the attached sheet. Perhaps a real PQ expert has a better idea. 

 

View solution in original post