Aug 09 2023 10:26 AM
I have several monthly reports I must submit that require very similar processes to be done to the excel data I generate from software I use. I have to print out the pounds of food sold to customers for the month, add the values up, and report each item back. The time spent making tweaks and changes is exhausting, and I know it can be automated based off of "if cases" and data validation, I just need help figuring out how.
I receive my data as a long list (Usually 2,000 rows) of data looking like this:
PRODUCT | CUSTOMER | QUANTITY SOLD |
103 MEAT | AGENCY_1 | 23 lbs |
103 MEAT | AGENCY_1 | 17 lbs |
103 MEAT | AGENCY_1 | 4 lbs |
100 BAKERY | AGENCY_1 | 17 lbs |
100 BAKERY | AGENCY_1 | 31 lbs |
105 NON-FOOD | AGENCY_1 | 7 lbs |
106 PRODUCE | AGENCY_1 | 16 lbs |
103 MEAT | AGENCY_2 | 9 lbs |
103 MEAT | AGENCY_2 | 18 lbs |
This data continues on as we have data for over 30 Agency customers each month. My first few tasks for cleaning the up the data includes highlighting Column C and selecting "Find & replace" and finding " lbs" and replacing with " " so that any math addition calculates correctly since the string value has been removed. Next, I need to provide at least one row of space between each Agency for help with visualizing data. I typically scroll through the whole list and right click "Insert..." to add a blank row between the last value of a Customer and the first value of the next Customer. Finally, I need to then add all of the pounds for each item, which requires me to add all values of one Product, overwrite the first cell entry of that Product, and then highlighting and deleting all rows below with the same Product so that I only have one row line of that food Product showing for the Agency.
I know this is a large and advanced problem, but I thoroughly believe there can be a way to automate this process using code to go through these checks and analyzations to create an optimal report every time. Any help is immensely appreciated, thank you!
Aug 09 2023 11:17 AM
SolutionCheck out the attached sheet.
I'd be happy to demonstrate this with a more complete copy, if you are able without violating confidentiality, to post a copy of your actual worksheet. In the absence of that, the following steps were all that I needed to take.
=VALUE(LEFT([@[QUANTITY SOLD]],FIND(" ",[@[QUANTITY SOLD]])))
If you can post a more extensive example of your raw data, do so either here or on OneDrive or GoogleDrive, pasting a link here in the latter two instances that grants access.
Aug 09 2023 11:31 AM
If like this
it could be done by Power Query. Generated script is
let
Source = Excel.CurrentWorkbook(){[Name="report"]}[Content],
removePounds = Table.ReplaceValue(Source,"lbs","",Replacer.ReplaceText,{"QUANTITY SOLD"}),
trimQty = Table.TransformColumns(
removePounds,
{{"QUANTITY SOLD", Text.Trim, type text}}),
declareType = Table.TransformColumnTypes(
trimQty, {
{"PRODUCT", type text},
{"CUSTOMER", type text},
{"QUANTITY SOLD", type number}
}),
#"Grouped Rows" = Table.Group(
declareType, {"PRODUCT", "CUSTOMER"},
{{"Qty", each List.Sum([QUANTITY SOLD]), type nullable number}}),
names = Table.ColumnNames( #"Grouped Rows" ),
blankTable = #table( names, { {null,null,null} } ),
groupAgency = Table.Group(
#"Grouped Rows", {"CUSTOMER"},
{{"Data", each _, type table [PRODUCT=nullable text, CUSTOMER=nullable text, Qty=nullable number]}}),
addBlanks = Table.AddColumn(
groupAgency,
"Custom", each [Data] & blankTable),
keepData = Table.SelectColumns(addBlanks,{"Custom"}),
expandTables = Table.ExpandTableColumn(
keepData,
"Custom",
{"PRODUCT", "CUSTOMER", "Qty"}, {"PRODUCT", "CUSTOMER", "Qty"})
in
expandTables
Aug 09 2023 11:44 AM
That is to repeat @mathetes approach with PivotTable based on data model using measure
Qty Sold:=SUMX( report, SUBSTITUTE( 1*report[QUANTITY SOLD], "lbs", "" ) )
Aug 09 2023 12:14 PM
Aug 09 2023 12:34 PM
With Power Query there are few options. If you query data from the same workbook it shall be named. Data have name if you transfer it into structured table (as in the sample) or you name the range and query but that name. Here is similar but bit different technique.
Perhaps more optimal is to query from another file, when in source file you shall change nothing. In file with such report you may use Data->Get Data->From file->From Excel workbook connector. Even more, you may use From Folder connector. If you have several source reports and would like to combine them into one cleaned report, put all source files into one folder and use that connector. That's important that all files have the same structure. Even with one periodically updated file you may use such connector, with that it's not necessary to care about file name.
For the above entire procedure is more complex, but not dramatically.
Aug 09 2023 12:38 PM
For the PivotTable you may simply select all data and Insert->PivotTable->From table/Range. Not necessary to name it or convert to structured table.
Aug 09 2023 01:47 PM
Aug 09 2023 01:51 PM
Aug 09 2023 02:05 PM
Aug 09 2023 03:13 PM
At the risk of becoming something of a 'one trick pony', since Lambda came out, there are now no calculations that cannot be performed by worksheet formulas. In this case I used
= SumByAttributeλ(quantitySold, attributes)
The smaller tables to the right, using the following formulas,
= SumByAttributeλ(quantitySold, product)
= SumByAttributeλ(quantitySold, customer)
are just for the sake of showing that the Lambda function can be reused. The functions I defined are
SumByAttributeλ
= LET(
distinct, UNIQUE(attributes),
quantities, VALUE(TEXTBEFORE(quantitySold, " ")),
summed, BYROW(distinct, SumIfsλ(quantities, attributes)),
HSTACK(distinct, summed & " lbs")
)
SumIfsλ
= LAMBDA(distinct,
LET(
ANDλ, LAMBDA(x, AND(x)),
SUM(FILTER(qty, BYROW(attr = distinct, ANDλ)))
)
)
Of course, because something can be done, it doesn't necessarily follow that it should be done!
Aug 09 2023 03:22 PM
Of course, because something can be done, it doesn't necessarily follow that it should be done!
Amen! [smiley face]
Aug 10 2023 12:30 PM
Glad we could help. Cosmetic addition to @mathetes solution. In general you may work without the formula, instead select entire QUANITY SOLD column, on ribbon Data->Text to Columns, select Space as delimiter on second space and Finish.
It splits column on two, one with numbers
Select first three columns, insert PivotTable without adding data to data model
Drag fields in PivotTable pane and you are ready
The rest is cosmetic.
Uncheck blanks for the labels filter
When you stay on PivotTable on ribbon Design section has 4 drop-down menus in the layout
From left to right select:
Subtotals->Do Not Show Subtotals
Grand Totals ->Off for Rows and Columns
Report layout->Show in Tabular Form
Report layout->Repeat All Item labels
Blank Rows->Insert Blank Line after Each Item
In another section here, PivotTable Styles, select one which you prefer.
Here unselect these two buttons
On ribbon View->Show uncheck Gridlines button.
Finally we have
All together is few minutes job (after you will repeat the same several times). That's quite simple but not optimal solution. All depends on goals. Each solution has pros and cons.