Combine row based on criteria (Product Package)

Hi everyone,

I am currently doing data analysis for my restaurant and got stuck on one step.

I have the set of data below:

 

Sales Number | Menu Category | Menu Category Detail | Menu | Qty | Price | Subtotal
165932508771 | Food | S | SSSL | 1 | 61600 | 61600
165932508771 | Extras | S | SL (PACKAGE) | 1 | 0 | 0
165932508771 | Extras | O | NJ | 1 | 10800 | 10800
165932508771 | Beverage | T | ETS | 1 | 18700 | 18700
165932508771 | Extras | O | KP | 1 | 4500 | 4500
165932508771 | Extras | O | BK | 1 | 2500 | 2500
165932665478 | Food | S | SSSL | 1 | 61600 | 61600
165932665478 | Extras | S | SL (PACKAGE) | 1 | 0 | 0
165932665478 | Extras | O | NJ (PACKAGE) | 1 | 8000 | 8000
165932665478 | Extras | O | BK | 1 | 2500 | 2500

 

What I want to do is:

1. If a menu item has "PACKAGE" in its name, combine it with the nearest row above that has no "PACKAGE" in its name.

2. The combined row should contain the merged menu names and the sum of the subtotals; everything else on the merged rows can be deleted.

 

Here is an example of the output I want:

 

Sales Number | Menu Category | Menu Category Detail | Menu | Qty | Price | Subtotal
165932508771 | Food | S | SSSL;SL | 1 | 61600 | 61600
165932508771 | Extras | O | NJ | 1 | 10800 | 10800
165932508771 | Beverage | T | ETS | 1 | 18700 | 18700
165932508771 | Extras | O | KP | 1 | 4500 | 4500
165932508771 | Extras | O | BK | 1 | 2500 | 2500
165932665478 | Food | S | SSSL;SL;NJ | 1 | 69600 | 69600
165932665478 | Extras | O | BK | 1 | 2500 | 2500
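
For readers who want to check the two rules against the expected output outside Excel, the logic can be sketched in a few lines of Python. This is only a minimal illustration, not the Power Query or macro solution from the replies: `merge_packages` is a hypothetical helper, only the Menu and Subtotal columns are modelled, and the menu codes are an assumed reading of the sample rows for sales number 165932665478.

```python
TAG = "(PACKAGE)"


def merge_packages(rows):
    """Fold each PACKAGE row into the nearest non-PACKAGE row above it."""
    merged = []
    for row in rows:
        menu = row["Menu"]
        if TAG in menu and merged:
            # Rule 1: attach to the last kept row, which is never a package row
            # unless the data starts with one (that case is left untouched).
            target = merged[-1]
            # Rule 2: append the name without the tag and add up the subtotal.
            target["Menu"] += ";" + menu.split(TAG)[0].strip()
            target["Subtotal"] += row["Subtotal"]
        else:
            merged.append(dict(row))  # copy so the input is not modified
    return merged


# Assumed sample: the four rows of sales number 165932665478.
sample = [
    {"Menu": "SSSL", "Subtotal": 61600},
    {"Menu": "SL (PACKAGE)", "Subtotal": 0},
    {"Menu": "NJ (PACKAGE)", "Subtotal": 8000},
    {"Menu": "BK", "Subtotal": 2500},
]
result = merge_packages(sample)
print(result[0]["Menu"], result[0]["Subtotal"])  # SSSL;SL;NJ 69600
```

A PACKAGE row with nothing above it is kept unchanged here, since the two rules do not say what should happen in that case.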

 

My guess is that it can be done in Power Query; unfortunately, I'm just a beginner with that feature.

Please help.

14 Replies

@Ryan_Izzan I'm not particularly proud of this solution (attached), as it looks a bit clumsy, but it seems to work on this small scale. See if it works for you in real life.

 

@Ryan_Izzan 

As an alternative, here is a macro solution.

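Sub MergePackage()
    ' Assumes headers in row 1 and data in A:G,
    ' with Menu in column D and Subtotal in column G.
    Const TAG As String = "(PACKAGE)"
    Dim r As Long       ' current row, walking bottom-up
    Dim n As String     ' names of the package rows collected so far
    Dim s As Double     ' sum of their subtotals
    Dim p As Long       ' position of TAG in the Menu cell (0 = not a package row)
    Dim k As Long       ' number of package rows collected so far
    Application.ScreenUpdating = False
    r = Range("D" & Rows.Count).End(xlUp).Row
    Do While r > 1
        p = InStr(Range("D" & r).Value, TAG)
        If p > 0 Then
            ' Prepend so the names keep their top-to-bottom order,
            ' and trim the space left in front of the tag.
            n = ";" & Trim(Left(Range("D" & r).Value, p - 1)) & n
            s = s + Range("G" & r).Value
            k = k + 1
        ElseIf k > 0 Then
            ' First non-package row above the run: merge, then delete the run.
            Range("D" & r).Value = Range("D" & r).Value & n
            Range("G" & r).Value = Range("G" & r).Value + s
            Range("A" & r + 1).Resize(k, 7).Delete Shift:=xlShiftUp
            n = ""
            s = 0
            k = 0
        End If
        r = r - 1
    Loop
    Application.ScreenUpdating = True
End Sub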

Thank you for your response.

Sorry for the late reply. I only tried it today and got stuck at the "Group" step: it keeps loading, and the size has reached 1.7 GB and is still increasing.

It's my fault that I didn't mention that the table has 23 columns and 20,000+ rows. As you mentioned before, maybe it only works on a small scale.

Is there any way to solve this problem, or a better solution given this new information?

Regards,

@Ryan_Izzan Difficult to say what's causing it. 20 thousand rows by 23 columns isn't all that spectacular.

Can you upload or share (OneDrive or similar) a larger and more realistic data set? Not all 20,000 rows, but do include all columns. Anonymize the data if needed.

@Riny_van_Eekelen 

Please find a dummy version of my data. I hope it gives more input for a solution.

 

 

@Ryan_Izzan Having trouble downloading this file. Please make it smaller. A few hundred rows will do.

@Riny_van_Eekelen 

Here is the smaller one

@Ryan_Izzan Will get back to you later and see if I can get it to work on this larger scale.

@Riny_van_Eekelen Okay! I noticed some problems. Your initial example clearly did not reflect reality. The logic that worked on the example data does not work on your real data. One of the issues is that you have Sales Numbers where the very first Menu item contains the word PACKAGE.

Screenshot 2022-09-21 072535.png

The first rule was:

"1. If menu have "PACKAGE" on its name, combine to the cell above that has no "PACKAGE" on its name."

Obviously, you don't want to combine SS from sales number ...524 with the items above in ...861 (SSRB, NP and SL).

You could group the entire data set by Sales Number first and then apply the logic that worked on a smaller scale to each of these grouped tables. But you still have to clarify how a situation like the one in the picture below should be handled.

Riny_van_Eekelen_0-1663738754562.png
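
The idea of grouping by Sales Number first and merging only inside each group can be sketched in Python as follows. This is an illustration under stated assumptions, not the attached query: `merge_within_sales` is a hypothetical helper, the sales numbers "861" and "524" stand in for the truncated numbers above, and the subtotals are invented. A PACKAGE row that opens a sale is simply kept as it is, because how to treat it is exactly the open question.

```python
from itertools import groupby

TAG = "(PACKAGE)"


def merge_within_sales(rows):
    """Apply the merge rule separately inside each Sales Number."""
    out = []
    # groupby only groups consecutive rows, so this assumes that all lines
    # of one sale are adjacent, as they are on the sheet.
    for _, group in groupby(rows, key=lambda r: r["Sales Number"]):
        anchor = None  # last non-package row seen in the current sale
        for row in group:
            row = dict(row)  # copy so the input is not modified
            if TAG in row["Menu"] and anchor is not None:
                anchor["Menu"] += ";" + row["Menu"].split(TAG)[0].strip()
                anchor["Subtotal"] += row["Subtotal"]
            else:
                # A normal item becomes the new anchor; a package row that
                # opens the sale is kept untouched (unresolved case).
                if TAG not in row["Menu"]:
                    anchor = row
                out.append(row)
    return out


# Hypothetical rows: the second sale starts with a PACKAGE item.
rows = [
    {"Sales Number": "861", "Menu": "SSRB", "Subtotal": 100},
    {"Sales Number": "861", "Menu": "NP (PACKAGE)", "Subtotal": 10},
    {"Sales Number": "861", "Menu": "SL (PACKAGE)", "Subtotal": 0},
    {"Sales Number": "524", "Menu": "SS (PACKAGE)", "Subtotal": 50},
    {"Sales Number": "524", "Menu": "BK", "Subtotal": 20},
]
grouped = merge_within_sales(rows)
for r in grouped:
    print(r["Sales Number"], r["Menu"], r["Subtotal"])
```

With these rows, SS stays in sale 524 instead of being appended to SSRB in sale 861.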

 

 

 

@Riny_van_Eekelen 

 

I'm so sorry.

When I checked, the real data still matches the condition I mentioned.

The problem you describe appeared because, when I set up the dummy data, I just replaced words without considering this.

Here is some revised dummy data.

 

Regards,

@Ryan_Izzan Okay! No problem. I'll see what I can do.

@Ryan_Izzan I added my original query to your file and revised it a bit. It should work as intended now, but please test it thoroughly.

Hi @Riny_van_Eekelen,

I just finished testing it on the real data and there is no problem with the Power Query; I can load it as a table or a pivot table.

But when I check the total of "Subtotal" for each month, it differs from the real data.

It seems that when it merges, it duplicates several rows that have multiple matches.

Any idea how we can solve this?

Anyway, I really appreciate all your help and am thankful for it.

 

Regards,

 

@Ryan_Izzan Ah! I omitted that check. Indeed, you have several Sales Numbers that have duplicate items without (PACKAGE) in the Menu column. These end up being grouped twice. Add one last step to the query that removes duplicates where the columns Sales Number, Menu and Subtotal are the same.

So, at the very end, select these three columns (by holding down Ctrl while selecting). Then right-click and select Remove Duplicates. The M code generated looks like this:

Riny_van_Eekelen_0-1663921685669.png

Now, the totals should agree.
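
To make the effect of that last step concrete, here is a small Python sketch of removing duplicates on the three key columns. It is not the generated M code (as far as I know, the Remove Duplicates command on selected columns produces a `Table.Distinct` step with a list of column names). `drop_duplicates` is a hypothetical helper and the rows are invented.

```python
def drop_duplicates(rows, keys=("Sales Number", "Menu", "Subtotal")):
    """Keep the first row for each distinct combination of the key columns."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[k] for k in keys)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Invented rows: the merged item appears twice after the grouping step.
merged_rows = [
    {"Sales Number": "A1", "Menu": "SSSL;SL", "Subtotal": 61600},
    {"Sales Number": "A1", "Menu": "SSSL;SL", "Subtotal": 61600},
    {"Sales Number": "A1", "Menu": "BK", "Subtotal": 2500},
]
unique_rows = drop_duplicates(merged_rows)
print(sum(r["Subtotal"] for r in unique_rows))  # 64100
```

A sale that genuinely lists the same item twice with the same subtotal would be collapsed as well, so it is worth comparing the monthly totals with the source data afterwards.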