Forum Discussion
Unpivot Monthly Data with a Formula
My current feeling is that both thunks and MAKEARRAY need to be approached with a certain degree of caution. In the present instance, I decided to use concatenated strings to hold information relating to an item within the crosstab. The code to the point of getting a list of packed records is the same whether I go on to use HSTACK or MAKEARRAY
= LET(
packed, nBravo&","&dates&"|"&rBravo,
mask, LEFT(dates,1)<>"Q",
filtered, FILTER(packed,mask),
list, TOCOL(filtered),
Unpackλ(list)
)Using HSTACK
UnpackHSλ = LAMBDA(list,
LET(
dimensions, TEXTBEFORE(list, "|"),
d₁, TEXTBEFORE(dimensions, ","),
d₂, TEXTAFTER(dimensions, ","),
val, VALUE(TEXTAFTER(list, "|")),
HSTACK(d₁, d₂, val)
)
);and using MAKEARRAY
UnpackMAλ = LAMBDA(list,
MAKEARRAY(ROWS(list),3,
LAMBDA(r, f,
LET(
item, INDEX(list, r),
SWITCH(f,
1,TEXTBEFORE(TEXTBEFORE(item, "|"), ","),
2,TEXTAFTER(TEXTBEFORE(item, "|"), ","),
3,VALUE(TEXTAFTER(item, "|"))
)
)
)
)
);(I do not do 'concise'!)
A key point of interest is that the solution with HSTACK took 10ms to refresh whereas MAKEARRAY took 690ms. That is a massive difference.
- Patrick2788May 23, 2022Silver Contributor
Thank you for sharing. The more examples I see with MAKEARRAY, it seems to confirm the function is not be used for re-shaping data. Its best use is in generating random data (I'd argue there are other simpler options available to doing the same task).
Re: calculation times - I have a working knowledge of VBA but would much rather do a task at the sheet level if possible (Especially with the wealth of new functions being released). The way I was told to write code was to minimize 'touching the sheet' (e.g. write data to cells rather than copy/paste, avoid .select, etc.). I'd like to understand where MAKEARRAY's calculation time is coming from. My guess is the use of INDEX and having to 'touch the sheet' more than the HSTACK solution.
- mtarlerMay 23, 2022Silver ContributorI agree that INDEX should take many more cycles than HSTACK. Basically you are evaluating/executing on every element while HSTACK presumably acts on the entire array(s). I presume you could 'test' this somewhat by comparing a 100,000x2 grid compared to a 2x100,000 grid since the former should show significant improvement using HSTACK but MAY (depending on the coding) show much less improvement on the latter.
- SergeiBaklanMay 23, 2022Diamond Contributor
PeterBartholomew1 , why I'm not surprised...
- PeterBartholomew1May 23, 2022Silver Contributor
It gets worse than that!
If one simple focusses on the values (not the agent names or dates),
= TOCOL(CHOOSECOLS(rBravo,f))the formula takes 0.7 ms. On the other hand, if I use MAKEARRAY to calculate a thunk containing the agent, date and value triples for each value, then filter and unpivot the thunks, and finally use a second MAKEARRAY to extract the triple from each thunk
= LET( packedϑ, MAKEARRAY(200,16, LAMBDA(r,f, LAMBDA(CHOOSE({1,2,3}, INDEX(nBravo,r), INDEX(dates,f), INDEX(rBravo,r,f) ) ) )), mask, LEFT(dates,1)<>"Q", filteredϑ, FILTER(packedϑ,mask), listϑ, TOCOL(filteredϑ), MAKEARRAY(2400,3, LAMBDA(r₀,c,INDEX(INDEX(listϑ,r₀,1)(),c)) ) )takes 10,032 ms!
DO NOT USE THIS!
I would rather like to hear an authoritative account of where the CPU cycles went!
- SergeiBaklanMay 23, 2022Diamond Contributor
Charles could explain all details, but look, you call few thousand times values to memory, use slow FILTER and another few thousand times make calls on per-value basis. It shall be slower than call few times large blocks.