SOLVED

In Search of Advanced Automated Formula to Condense rows while applying Sum

Copper Contributor

I have several monthly reports I must submit that require very similar processes to be done to the excel data I generate from software I use. I have to print out the pounds of food sold to customers for the month, add the values up, and report each item back. The time spent making tweaks and changes is exhausting, and I know it can be automated based off of "if cases" and data validation, I just need help figuring out how.

I receive my data as a long list (Usually 2,000 rows) of data looking like this:

PRODUCTCUSTOMERQUANTITY SOLD
103 MEATAGENCY_123 lbs
103 MEATAGENCY_117 lbs
103 MEATAGENCY_14 lbs
100 BAKERYAGENCY_117 lbs
100 BAKERYAGENCY_131 lbs
105 NON-FOODAGENCY_17 lbs
106 PRODUCEAGENCY_116 lbs
103 MEATAGENCY_29 lbs
103 MEATAGENCY_218 lbs

 

This data continues on as we have data for over 30 Agency customers each month. My first few tasks for cleaning the up the data includes highlighting Column C and selecting "Find & replace" and finding " lbs" and replacing with " " so that any math addition calculates correctly since the string value has been removed. Next, I need to provide at least one row of space between each Agency for help with visualizing data. I typically scroll through the whole list and right click "Insert..." to add a blank row between the last value of a Customer and the first value of the next Customer. Finally, I need to then add all of the pounds for each item, which requires me to add all values of one Product, overwrite the first cell entry of that Product, and then highlighting and deleting all rows below with the same Product so that I only have one row line of that food Product showing for the Agency.

I know this is a large and advanced problem, but I thoroughly believe there can be a way to automate this process using code to go through these checks and analyzations to create an optimal report every time. Any help is immensely appreciated, thank you! :lol:

12 Replies
best response confirmed by Hans Vogelaar (MVP)
Solution

@FoodbankYS 

Check out the attached sheet.

I'd be happy to demonstrate this with a more complete copy, if you are able without violating confidentiality, to post a copy of your actual worksheet. In the absence of that, the following steps were all that I needed to take.

  • I used one formula that converted your Quantity to just the values. 

=VALUE(LEFT([@[QUANTITY SOLD]],FIND(" ",[@[QUANTITY SOLD]])))

  • Then I  let Excel do the crunch work by means of the Pivot Table, which is one of the most popular (and powerful) tools built into Excel for summarizing the kind of data you have.

mathetes_0-1691604828663.png

 

If you can post a more extensive example of your raw data, do so either here or on OneDrive or GoogleDrive, pasting a link here in the latter two instances that grants access.

@FoodbankYS 

If like this

image.png

it could be done by Power Query. Generated script is

let
    Source = Excel.CurrentWorkbook(){[Name="report"]}[Content],
    removePounds = Table.ReplaceValue(Source,"lbs","",Replacer.ReplaceText,{"QUANTITY SOLD"}),
    trimQty = Table.TransformColumns(
        removePounds,
        {{"QUANTITY SOLD", Text.Trim, type text}}),
    declareType = Table.TransformColumnTypes(
        trimQty, {
            {"PRODUCT", type text},
            {"CUSTOMER", type text},
            {"QUANTITY SOLD", type number}
        }),
    #"Grouped Rows" = Table.Group(
        declareType, {"PRODUCT", "CUSTOMER"},
        {{"Qty", each List.Sum([QUANTITY SOLD]), type nullable number}}),
    names = Table.ColumnNames( #"Grouped Rows" ),
    blankTable = #table( names, { {null,null,null} } ),
    groupAgency = Table.Group(
        #"Grouped Rows", {"CUSTOMER"},
        {{"Data", each _, type table [PRODUCT=nullable text, CUSTOMER=nullable text, Qty=nullable number]}}),
    addBlanks = Table.AddColumn(
        groupAgency,
        "Custom", each [Data] & blankTable),
    keepData = Table.SelectColumns(addBlanks,{"Custom"}),
    expandTables = Table.ExpandTableColumn(
        keepData,
        "Custom",
        {"PRODUCT", "CUSTOMER", "Qty"}, {"PRODUCT", "CUSTOMER", "Qty"})
in
    expandTables

@FoodbankYS 

That is to repeat @mathetes approach with PivotTable based on data model using measure

Qty Sold:=SUMX( report, SUBSTITUTE( 1*report[QUANTITY SOLD], "lbs", "" )  ) 

image.png

Good afternoon,

Thank you for the lovely response. I'm trying to use this on my end and I have a follow-up question based off this code. For the Source, you're assigning it the Table 'report', but does that imply that I need to convert my whole excel file to a Table prior to all of this compiling? The data from my software program exports as rows of data, but not pre-assigned in Table form. Thank you.

@FoodbankYS 

With Power Query there are few options. If you query data from the same workbook it shall be named. Data have name if you transfer it into structured table (as in the sample) or you name the range and query but that name. Here is similar but bit different technique.

 

Perhaps more optimal is to query from another file, when in source file you shall change nothing. In file with such report you may use Data->Get Data->From file->From Excel workbook connector. Even more, you may use From Folder connector. If you have several source reports and would like to combine them into one cleaned report, put all source files into one folder and use that connector. That's important that all files have the same structure. Even with one periodically updated file you may use such connector, with that it's not necessary to care about file name.

For the above entire procedure is more complex, but not dramatically. 

@FoodbankYS 

For the PivotTable you may simply select all data and Insert->PivotTable->From table/Range. Not necessary to name it or convert to structured table.

Yes, I have now figured out how to refine the data in a concise manner via PivotTable. I struggled a bit to utilize Power Query Editor, but I think at this point I might have been attempting to reinvent the wheel with that route. I was able to create a PivotTable and assign Columns "CUSTOMER" and "PRODUCT" in the "Rows" field box, and then set the Values field box to "Sum of QUANTITY SOLD". I truly appreciate the help and support you provided, along with other community members' support; this makes my monthly reporting of data much easier due to a more streamlined process. Have a wonderful day!
Thank you for your response. I played around with PivotTable more, and I was able to get my table looking very much like yours. I placed the columns "CUSTOMER" and "PRODUCT" in the Rows field box, and I set the Values field box to "Sum of QUANTITY SOLD" and the data has been refined to be concise and without the formatting I was trying to remove. I really appreciate your help and support as this makes my work much easier to perform now. Have a wonderful day!
You're very welcome. The Pivot Table has for about three decades, maybe more, one of the most useful tools Excel (and Lotus 1-2-3 before) offered. It does the heavy lifting that used to require a lot more work on the part of the user. All you need is a well organized table of data.....and some idea of how you want the data tabulated.

@FoodbankYS 

At the risk of becoming something of a 'one trick pony', since Lambda came out, there are now no calculations that cannot be performed by worksheet formulas.  In this case I used

= SumByAttributeλ(quantitySold, attributes)

image.png

The smaller tables to the right, using the following formulas,

= SumByAttributeλ(quantitySold, product)

= SumByAttributeλ(quantitySold, customer)

are just for the sake of showing that the Lambda function can be reused.  The functions I defined are

SumByAttributeλ
= LET(
    distinct,   UNIQUE(attributes),
    quantities, VALUE(TEXTBEFORE(quantitySold, " ")),
    summed,     BYROW(distinct, SumIfsλ(quantities, attributes)),
    HSTACK(distinct, summed & " lbs")
 )

SumIfsλ
= LAMBDA(distinct, 
    LET(
      ANDλ, LAMBDA(x, AND(x)), 
      SUM(FILTER(qty, BYROW(attr = distinct, ANDλ)))
    )
  )

Of course, because something can be done, it doesn't necessarily follow that it should be done!

 

 

@Peter Bartholomew 

 

Of course, because something can be done, it doesn't necessarily follow that it should be done!

 

Amen! [smiley face]

@FoodbankYS 

Glad we could help. Cosmetic addition to @mathetes solution. In general you may work without the formula, instead select entire QUANITY SOLD column, on ribbon Data->Text to Columns, select Space as delimiter on second space and Finish.

image.png

It splits column on two, one with numbers

Select first three columns, insert PivotTable without adding data to data model

image.png

Drag fields in PivotTable pane and you are ready

image.png

The rest is cosmetic.

Uncheck blanks for the labels filter

image.png

When you stay on PivotTable on ribbon Design section has 4 drop-down menus in the layout

image.png

From left to right select:

Subtotals->Do Not Show Subtotals

Grand Totals ->Off for Rows and Columns

Report layout->Show in Tabular Form

Report layout->Repeat All Item labels

Blank Rows->Insert Blank Line after Each Item

 

In another section here, PivotTable Styles, select one which you prefer.

 

Here unselect these two buttons

image.png

 

On ribbon View->Show uncheck Gridlines button.

 

Finally we have

image.png

 

All together is few minutes job (after you will repeat the same several times). That's quite simple but not optimal solution. All depends on goals. Each solution has pros and cons.