Forum Discussion
HRECURSE instead of MAKEARRAY, recursing LAMBDA
I wish to share with the community some seriously goofy but generic LAMBDA code I created yesterday.
By way of introduction, once you go down the route of SPILLed inputs, you very quickly arrive at the need for downstream calculations that flex with those inputs. One extremely common challenge in such scenarios is that the downstream calculation is a series that evolves - column by column - based on the SPILLed inputs. That is, column(x) requires access to the results of column(x-1).
Do a little research and we find that plenty of readers have run up against this challenge. The commonly deployed answer is that of recursion. Not great, since the problem is really just iterative, but needs must. All good.
Having used LAMBDAs for a few months now, I have become a little wary of naming them - Microsoft's original intention. I am very happy to name generic functions - and I will in a moment introduce HRECURSE to you - but I am very unhappy to hide from immediate sight code that is extremely contextual. I prefer to splatter such LAMBDAs in the cell, MAP or MAKEARRAY style, so that the reader stands a fighting chance to follow my coding.
But there is a limitation of LET that frustrates this endeavour. While it is perfectly acceptable to write
=LET(fn;LAMBDA(x;x);fn(1))
we CANNOT recurse by writing
=LET(fn;LAMBDA(x;fn(x));fn(1))
(I use the European notation throughout this post. I trust international readers will adapt.)
This kind of thing only works via the Name Manager. It is thus that all discussions I have read about the said challenge go down the route of defining names - often more than one (an initializing wrapper and a recursing core). From my perspective, these solutions have two drawbacks:
1 They force contextual code out of sight
2 They force the implementation, and re-implementation, of a recursion logic which tends to blow the mind of the average user and can actually be fairly dangerous (if got wrong).
So, yesterday I set out to create a generic wrap that would tackle both problems and I propose to you my first HRECURSE:
=LAMBDA(i;n;Prev;fn;LET(This;fn(i;Prev);IF(i<n;HSTACK(This;HRECURSE(i+1;n;This;fn));This)))
The H stands for horizontal - the function will SPILL sideways. If you need vertical, you know what to do. The trick used here is that LAMBDA allows the passing of functions by parameter! (The same feature may be used by MAP and MAKEARRAY.)
You can easily see the signature expected of the function "fn" passed - it must accept two inputs:
p1 - Some integer that is functionally the depth of recursion but logically the column the code operates in. It gives spreadsheet context to the code (which you may not need).
p2 - The result of the previous step.
So an easy application of HRECURSE writes:
=HRECURSE(1;5;0;LAMBDA(i;p;p+1))
and gets us
1 | 2 | 3 | 4 | 5
This is already a major win for people wanting to write Fibonacci series, say. But my challenge was much tougher. I had a strip of SPILLed cashflows and their IRR. I wanted to show how that stream of cashflows disaggregates into Capital and Interest. The most concise algorithm I could think of requires FOUR rows.
Not to worry - no-one said that p2 had to be a single number, right? So, off we go:
=HRECURSE(1;5;{1;2};LAMBDA(i;pVec;LET(pVec1;INDEX(pVec;1;1);pVec2;INDEX(pVec;2;1);VSTACK(pVec1+1;pVev2*2))))
Don't try it. It does not work. But you get the idea. Reason it fails is that INDEX isn't actually a function but an instruction to Excel's precompiler which can only resolve on spreadsheet ranges. I was stuck. Although I had my result vector from the previous step, I could not extract its components.
Never mind, there are ways to skin this cat - here is one way to extract the i-th row from an array a:
=LAMBDA(a;i;FILTER(a;SEQUENCE(ROWS(a))=i))(D16:D20;3)
(I am still debating with myself whether this FILTER is more efficient than SUM(a*(SEQUENCE(ROWS(a))=i)) - dunno. Either will do. FILTER has the advantage of allowing HRECURSE to also pass strings.)
So we can now extend the previous example like so:
=HRECURSE(1;5;{1;2};LAMBDA(i;pVec;LET(p;LAMBDA(i;FILTER(pVec;SEQUENCE(ROWS(pVec))=i));VSTACK(p(1)+1;p(2)*2))))
to get:
2 | 3 | 4 | 5 | 6
4 | 8 | 16 | 32 | 64
And I was quite pleased with this. Just until a few minutes ago when it occurred to me that I can take the sting out of this fudge by moving the proxy INDEX into HRECURSE in this final version:
=LAMBDA(i;n;Prev;fn;LET(Mask;SEQUENCE(ROWS(Prev));This;fn(i;LAMBDA(r;FILTER(Prev;Mask=r)));IF(i<n;HSTACK(This;HRECURSE(i+1;n;This;fn));This)))
You can see that instead of passing Prev (as in the original proposal), I am now passing a LAMBDA function that is created at the level of each recursion WITHIN THE CONTEXT of that recursion. This function only takes one input - the row within the previous result vector - and we are now able to rephrase the proposition above as:
=HRECURSE(1;5;{1;2};LAMBDA(i;p;VSTACK(p(1)+1;p(2)*2)))
I propose that this notation is generic enough for anyone to SPILL contextual series calculation without requiring a profound grasp of recursion.
PS: Regrettably, Excel will not allow me to make the row input into p() optional. (Or I have not figured out how to do it.) So the initial example we must now rewrite (a little longer) in the final version:
=HRECURSE(1;5;0;LAMBDA(i;p;p(1)+1))
PPS: To circle back to the introduction, the 5 in the example above would become COLUMNS(someInputs#) in a real-world application and you would use i to INDEX those inputs within your LAMBDA, in the same way one would when using MAKEARRAY.
11 Replies
- phrostixCopper Contributor
Apologies for resurrecting an old thread, but as it remains one of the quite scarce sources of valuable information on recursion, I feel it is appropriate to clarify INDEX's context for the sake of future readers who might not be familiar with it.
"..INDEX isn't actually a function but an instruction to Excel's precompiler which can only resolve on spreadsheet ranges."
This statement is only true when using INDEX in its reference form, which requires syntax that is distinct from the better-known array form:
INDEX( (array_1,array_2,...array_n) , row_Index, column_Index, [area_Index])
Where the area index selects which array that INDEX will return the intersection of the row and column indices from. This form of INDEX is unable to accept any parameter where:
1. ISREF(array/range) ==> FALSE
2. 'Some Other Sheet!'<Reference>
Note that the label is somewhat misleading, and should not be confused with the colon form of INDEX which will always return a reference or an error; while the reference form cannot accept virtual arrays, it can create them such as in:
=INDEX(($A$2:$F$11,$I$4:$L$13),{3;5;8},SEQUENCE(,4),{1,1,1,1;2,2,2,2;2,2,2,2})
This example showcases an interesting feature of INDEX in its Reference form; this returns the first four columns of the third row in $A$2:$F$11, and the first four columns of the fifth and eighth row in $I4:$L14, which are two obviously non-contiguous ranges.
An example of what the array form can do that the reference form cannot:
INDEX(sequence(30,5),{1;8;2;5;3;8;8;8},{1,2,3,4,5,1,3,5,2,4})
Which returns the array :
With the reference syntax:
=INDEX((SEQUENCE(30,5),SEQUENCE(30,5)),{1;8;2;5;3;8;8;8},{1,2,3,4,5,1,3,5,2,4},1)
Excel will not let you enter this formula because SEQUENCE is not a reference. For reasons unknown to me, this syntax is still valid:
=INDEX((SEQUENCE(30,5)),{1;8;2;5;3;8;8;8},{1,2,3,4,5,1,3,5,2,4},1).
As we can see from this example, INDEX is clearly not restricted to range only operations. - ecovonreinIron Contributor
I just realized that my LAMBDA(a;i;FILTER(a;SEQUENCE(ROWS(a))=i)) reinvents CHOOSEROWS(a;i). Using that new Excel function, the final HRECURSE reads
=LAMBDA(i;n;Prev;fn;LET(This;fn(i;LAMBDA(r;CHOOSEROWS(Prev;r)));IF(i<n;HSTACK(This;HRECURSE(i+1;n;This;fn));This)))
- Christopher GrahamCopper Contributor
ecovonrein Great code! I've used it in VRECURSE form to stack results of an iterative function that wouldn't spill otherwise.
Is there a way to update it to refer back to a specific row/column index of the previous step? I need more than just single row/column operations. I'd like to be able to specify INDEX(p, r, c) in the calculation.
- ecovonreinIron Contributor
Thanks! However, since I wrote this great piece of code, I learned that there is one function in the Excel toolkit that solves the problem without the need for recursion. Now, brace yourself. The great jesters in Seattle have dubbed this function, which allows us to augment things, REDUCE.
So, if you write:
=REDUCE(0;SEQUENCE(10);LAMBDA(prev;this;VSTACK(prev;HSTACK(this;INDEX(prev;this-1;1)))))
then you get what you want. You are successively augmenting "prev" by forever VSTACKing it (I generate 2 columns with this formula), and you have access to all of "prev", which is what you asked for. That is, you don't need to use V/HRECURSE 😉PS: Credit for the discovery that REDUCE can be abused in this way goes to lori_m.
- ecovonreinIron Contributor
Word of warning, as I am just rolling this technique into production. I notice some VERY ODD behaviour as I copy something akin to
=VRECURSE(1;5;0;LAMBDA(i;p;p(1)+INDEX(P$100:P$200;i;1)))
across columns of my spreadsheet. The formula will calc (and give the correct result) during normal operation but when I try to edit it (via F2 in column P, and ONLY column P), I cannot ENTER. Excel throws up some "bad formula" error and places the cursor on P$100. It is very odd - some logic of the editor appears to confuse P$100 with the function "p". I can resolve the (editing) issue with
=VRECURSE(1;5;0;LAMBDA(i;Prev;Prev(1)+INDEX(P$100:P$200;i;1)))