Forum Discussion
How to build a destacked table
- Jun 29, 2022
Another option
let Source = Excel.CurrentWorkbook(){[Name="Input"]}[Content], UnpivotedOtherColumns = Table.UnpivotOtherColumns(Source, {"Item"}, "Month", "Value"), MaxByItem = Table.Group(UnpivotedOtherColumns, {"Item"}, { {"Data", each _, type table}, {"ItemMax", each List.Max([Value]), type number} } ), ListOneToMax = Table.AddColumn(MaxByItem, "OneToMax", each {1..[ItemMax]}, type list), ExpandedData = Table.ExpandTableColumn(ListOneToMax, "Data", {"Month", "Value"}), ListOneZero = Table.AddColumn(ExpandedData, "OneZero", (x)=> List.Transform(x[OneToMax], (y)=> if y <= x[Value] then 1 else 0 ), type list ), SelectedColumns = Table.SelectColumns(ListOneZero,{"Item", "Month", "OneZero"}), PivotedItemTable = Table.Group(SelectedColumns, {"Item"}, {"Data", each Table.FromColumns([OneZero], [Month]), type table} ), ExpandedItemTables = Table.ExpandTableColumn(PivotedItemTable, "Data", List.Skip(Table.ColumnNames(Source)) ) in ExpandedItemTables
Hi Canerik92
I put together the following 3rd option:
let
// Function BEGIN
PivotOneZero = (values as list, names as list) as table =>
let
MaxFromValues = List.Max(values),
RepeatOneZero = List.Transform(values, each
if _ < MaxFromValues
then List.Repeat({1}, _) & List.Repeat({0}, MaxFromValues-_)
else List.Repeat({1}, MaxFromValues)
)
in
Table.FromColumns(RepeatOneZero, names),
// Function END
Source = Excel.CurrentWorkbook(){[Name="Input"]}[Content],
Months = List.Skip(Table.ColumnNames(Source)),
ValuesList = Table.AddColumn(Source, "ValuesList", each
Record.ToList(Record.SelectFields(_,Months)), type list
),
SelectedColumns = Table.SelectColumns(ValuesList, {"Item", "ValuesList"}),
TableOneZero = Table.AddColumn(SelectedColumns, "OneZero", each
PivotOneZero([ValuesList], Months), type table
),
RemovedValuesList = Table.SelectColumns(TableOneZero,{"Item", "OneZero"}),
ExpandedOneZero = Table.ExpandTableColumn(RemovedValuesList, "OneZero", Months)
in
ExpandedOneZero
Then expanded the Inputs table to 500 rows with random values between 3 and 100. This generates an Output table of 46500+ rows. Finaly refreshed the 3 queries 3 times
==> Option1 is - with this scenario - the slowest one, the 2 others compare
Hope this helps
For me, the calculation with 500 rows and randomly-chosen values in the range 3-100 (50,000 output rows) took about 200ms.
Being based on a volatile array function, the price was paid at every worksheet change.
- Patrick2788Jul 02, 2022Silver Contributor
I revisited this task because I had gotten off track and you mentioned using the new re-shaping functions would be key to a solution.
(This is going off the sample data offered by Canerik92 )
The way I put this together, I see three pieces: the header, the 'row labels', and the matrix of 1s and 0s.
For the row labels (StackMax=6 for this example). I chose each of the 3 rows 6 times with some sequencing:=CHOOSEROWS(item,INT(SEQUENCE(ROWS(item)*StackMax,1,1,1/StackMax)))
For the header: A simple named item referring to the top row.
For the matrix, I first reduced it to a scalar and used EXPAND to help pad with 0s.
=REDUCE("",TOCOL(rng,,1),LAMBDA(a,v,TEXTJOIN(",",1,a,EXPAND(SEQUENCE(,v,1,0),1,StackMax,0))))
Then split it and re-shaped it:
=WRAPCOLS(TEXTSPLIT(String,,","),StackMax*ROWS(item))
A bit of stacking to complete it:
=VSTACK(months,HSTACK(ItemList,ReshapedArray))
- LorenzoJul 01, 2022Silver Contributor
What tools do you use/recommend to measure formulas exec. time? Something more sophisticated than the VBA suggested on https://docs.microsoft.com/en-us/office/vba/excel/concepts/excel-performance/excel-improving-calculation-performance? Thanks
- LorenzoJun 30, 2022Silver Contributor
Thanks for sharing your LAMDBA & run time. TBH I don't measure as precisely as you do (just watched PQ Refresh window in this case)