Forum Discussion
A generalised Lambda helper function that return arrays of arrays using bisection.
I must be dense (or my brain may have been scrambled by trying to wrap itself around the concept of thunks for so long) but let ask a simple question. You say:
"it is intended to non-problem specific and hence reusable. ...the user provides a Lambda function that performs the required calculation as if BYROW would return the result.."
So let's say I have range of three one-column rows each containing a first and last name, comma separated ("Elvis,Presley", for example).
Trying
=BYROW(range,LAMBDA(row,TEXTSPLIT(row,",")))
results in your classic #CALC! and the admonition that "nested arrays are not supported".
What do I need to modify in your attached workbook so as to get the desired output (a 2X3 array)?
As you mention, this is not a specific request for help. In an ideal world (short of Excel resolving the array-of-arrays issue natively), I would have a black box into which I would feed the relevant array, the attack approach (by row, by column), the function to apply (TEXTSPLIT(), in this example) and the argument(s) required by that specific function (a delimiter, in this example) - and have the box spit out the correct outcome; something like:
=blackbox(range,byrow,textsplit,",")
One can only hope, right?
- PeterBartholomew1Jan 27, 2024Silver Contributor
I think this does it! I have removed the array of thunks from name manager and passed it as a parameter.
I have added an example that arose in a LinkedIn discussion from Marco Filocamo
as well as your example that called for a similar text split.
The 'major' change for the user is to append the Greek letter λ to BYROW so that the formula picks up my function rather than the in-built helper function.
- djclementsJan 29, 2024Bronze Contributor
PeterBartholomew1 Impressive work! I've been reading your posts/discussions regarding Thunks, but haven't taken the time to dive in and try to understand them yet. Seems to be full of potential...
I was hoping this method might somehow bypass the limits placed on lambda recursion, but alas, it seems to only work well with a few hundred rows of data or less. I had experimented with recursive lambdas a few months ago and came up with a custom STACKBYROW function that achieved similar results; however, it too was only useful with smaller data sets:
STACKBYROW: =LAMBDA(array,function,[initial_value],[start],[pad_with], LET( n, MAX(start, 1), f, function(INDEX(array, n, 0)), v, IF(ISOMITTED(initial_value), f, IFNA(VSTACK(initial_value, f), pad_with)), IF(n<ROWS(array), STACKBYROW(array, function, v, n+1, pad_with), v) ) )
When I finally get some time to take look at Thunks, I'll definitely be using this as a basis for understanding how they work. Thanks!
- PeterBartholomew1Jan 29, 2024Silver Contributor
If you are hitting recursion limits, something has been lost along the way. My first experiments with bisection (I dare say pretty standard within professional IT algorithm development but a struggle for me) predated the release of lambda helper functions and were specifically an attempt to get around the recursion limit on total argument count by reducing the depth of any recursion from N to LOG₂(N).
I resurrected the approach to deal with the REDUCE/VSTACK performance limitations that was itself a workaround for the fact that Excel treats arrays of arrays as an error rather than celebrating it as the correct solution to many, if not most, problems. Whilst bisection required twice as many VSTACK operations, most were small and fast rather than repeatedly stacking large arrays.
- TheDubJan 28, 2024Iron ContributorI still have to digest the "behind the scenes" a bit, but to paraphrase My Fair Lady: "By George! I think you've got it!"
- PeterBartholomew1Jan 29, 2024Silver Contributor
I may well be running round in circles but I have learnt a lot from implementing the binary tree. The feature that a really want to hold on to is that the syntax of the final formula (as well as the Lambda function) should match the in-built BYROW precisely. I do have some concerns about computational efficiency though.
I have written a BYCOL version that works will minimal modifications (basically changing ROWS(
resultϑ) to COLS(resultϑ) and VSTACK to HSTACK. I have confirmed that the existing BinaryTreeλ will also work with thunk arrays generated by SCAN, but the modified helper function SCANλ is harder to write. The modification to the user-provided Lambda function not only has to convert the output to thunks but it also has to reverse the calculation when the first 'acc' parameter is read.
- PeterBartholomew1Jan 27, 2024Silver Contributor
You might well ask! As things stand I believe there is an error in the calculation that means it only works correctly when applied to the original test data. What was intended was that the user would supply a new Lambda function that accepts one row of their data and produces the result they require of it, be it a scalar (which works with the inbuilt helper function BYROW), a row array or, even, a 2D array.
It seems, however that, while I was deriving the algorithm for building the binary tree and 'pruning' branches that would have allowed the indices go beyond the array size and led to #REF! errors , I had introduced a global variable 'resultϑ' to provide the thunk array as test input. Although I now calculate the thunk array within the formula, I suspect that variable is going out of scope during the recursion and the formula is defaulting to the workbook-scoped version and producing a hybrid mess of results.
Back to the drawing board! I will let you know what happens.