Forum Discussion
Recursive LAMBDA implementation of Excel's REDUCE function.
I hope we do get array versions of lambda helper functions at some point - many people have been asking and these type of workarounds are not exactly intuitive. In some tests, I found the SCAN2 function does slow down but wasn't too bad - for example it took about a second for a 1000x1000 matrix,
=SCAN2(1,SEQUENCE(,1000,2),LAMBDA(a,b,VSTACK(a,b)))
For comparison, the Python equivalent works fine with nested arrays:
from itertools import accumulate
pd.DataFrame(accumulate(range(2,1001), lambda a,i:a+[i],initial=[1])).T
In a Jupyter notebook this runs very fast (70ms) but within Excel there was a long delay retrieving data (around 20s). I have mixed feelings about python in Excel - and will mainly be sticking with notebooks for now. It's great to have the extra functionality at one's fingertips but I'd still choose a lambda solution if available being seamlessly bound with the spreadsheet structure: range references, grid calc, security, etc.
Sorry, I was wrong. Added removing of #N/A to the formula and that eats all the time. Without it about the second for SCAN2.
Python in my case shows
CPU times: user 160 ms, sys: 1.81 ms, total: 162 ms Wall time: 209 ms
plus about 10 sec for the delivery.
Practically the same.
By the way, in one of my first exercises I unintentionally generated with numpy nested array an recognized that only with familiar #CALC! error returning it to the grid.
Agree, Python won't substitute lambdas. What I tried to say is that Excel team right now and in nearest future is more concentrated on Python (my impression), it'll be not enough money and time to fix nested arrays issue.
- lori_mOct 14, 2023Steel Contributor
I don't think we need full support for array of arrays, just maybe additional parameters to SCAN / BYROW / BYCOL to indicate array support. If set, the function would need to use hstack({a},{b}) in place of {{a},{b}} like in workaround formulas. The way these functions are implemented currently may require a fixed output array size allowing for efficient array processing (e.g. modifying in-place) so there could be a valid reason to add an optional parameter here.
By the way, python accumulate function uses the same convention as above for first column - I just needed to add a comma to second comment for LBROWN7 formula to parse. This worked ok on hundred records but hung for me on a thousand. Also I think IFNA(...,"") could be sufficient to deal with NA results - no need for MAP - or equivalently .fillna('') in python.
- SergeiBaklanOct 14, 2023MVP
If function by function - maybe. If Microsoft goes this way most probably new functions like SCAN.EXT will be introduced with usual deployment cycle from Beta to production.
Performance is the key here. So far only PeterBartholomew1 approach gives more or less acceptable results on large arrays. As for IFNA() my impression it reduces performance dramatically. MAP() even more.
- lori_mOct 16, 2023Steel Contributor
I had in mind wrapping the result with IFNA(SCAN2(...),"") rather than inserting extra conditions into the recursive definition which would likely have a considerable impact to performance. Using the following recursive let implementation also gave the same results as the LBROWN7 second formula
=LAMBDA(initial_value, array, function, LET( ret, LAMBDA(fn, initial_value, array, IF( COLUMNS(array) = 1, function(initial_value, array), LET( acc, fn(fn, initial_value, DROP(array, , -COLUMNS(array) / 2)), HSTACK( acc, fn(fn, TAKE(acc, , -COLUMNS(initial_value)), TAKE(array, , -COLUMNS(array) / 2)) ) ) ) ), IFNA(ret(ret, initial_value, array), "") ) )({1, 2}, {2, 3, 4}, LAMBDA(a, b, VSTACK(a, b)))
P.S. I found this useful for the python code: Anonymous recursion - Wikipedia