Lambda Example: Generate Fibonacci series

PeterBartholomew1
Silver Contributor
Sep 03, 2023
lori_m
I suspect it is the demands of memory management that kills the REDUCE/VSTACK. I had the idea of splitting the calculation into smaller blocks. I had visualised nesting REDUCE but, when I followed the logic of partitioning the problem to the extreme, I finished with a scalar calculation that does not require REDUCE. That was when I decided to dig back and resurrect the binary tree idea that I had used in the pre-helper-function days.
Ways of performing Accumulation with Dynamic Arrays - Microsoft Community Hub

I think it should be possible to combine the binary tree with a basic calculation using MMULT (maybe 512x512 might be a reasonable size). A combination of a fast calculation method and reducing the bisection levels should be a double payoff.

One thing I still need to do is to take another look at your multiplication algorithm using convolution. The number of digits your Fibonacci calculation reached was amazing.
lori_m
Iron Contributor
Sep 02, 2023
Nice recursive bisection method! There's no chance I'd have been able to figure that out... I will need to spend more time studying this!

It seems bisection is a bit faster for general accumulations and REDUCE/VSTACK is several times slower on a thousand or more rows. Removing the extra LET variable in the thunk method gave a slight performance increase and was equally fast for Fibonacci series when returning a single column (combined with mtarler's BIGADD function).

I also included an optimised MMULT version for comparison using scaled down matrices. The 3 step version was about 50% faster than SCAN on both 4000 and 40000 rows. It should be possible to extend this to n steps, but I don't know if going beyond 3 steps would make much difference - and I couldn't work out the logic to do that yet anyway - but if I do I plan to revisit the FFT algorithm because it follows much the same algorithm.

In general Python will be considerably faster and there are also optimisations for big integers. For example running '%timeit fib(100000)' takes just 5.25ms to return all 20899 digits. In Excel however much of the gains will probably be lost in data transfer so I think lambda solutions are not going away any time soon.
AccumulationBisectionRecursions_LM.xlsx649 KB
PeterBartholomew1
Silver Contributor
Aug 29, 2023
lori_m
I have run a number of tests summing 4 columns by various methods including picking up and dusting down a recursive bisection approach that I wrote before Lambda helper functions and array shaping functions came out. I have included a Thunk solution with your MAP concept. I am not absolutely sure I have got the code right but the performance is acceptable.

Now it may be possible to do some of these things in Python but that is some way off for me!
AccumulationBisectionRecursions Development.xlsx489 KB

Iron Contributor

Aug 20, 2023

Playing with the concept a bit more, it could be worth defining a generalised SCAN function that applies to both types of setup. For Fibonacci sequence,

=SCANLIST({0,1},SEQUENCE(1000),LAMBDA(acc,i,MMULT(acc,{1,1;1,0})))

or for horizontal accumulations

=SCANLIST(SEQUENCE(ROWS(data),,0,0),SEQUENCE(, COLUMNS(data)),LAMBDA(acc,i,CHOOSECOLS(data, i) + acc))

and the definition can be hidden from view if desired:

SCANLIST=LAMBDA(initial_value, array, function,
    LET(
        list, LAMBDA(vector, LAMBDA([i], INDEX(vector, i))),
        indices, SEQUENCE(
            ROWS(initial_value),
            COLUMNS(initial_value)
        ),
        acclist, SCAN(
            LAMBDA(initial_value),
            array,
            LAMBDA(acc, i, list(function(acc(), i)))
        ),
        MAP(
            IF(indices, acclist),
            IF(array, indices),
            LAMBDA(acc, i, acc(i))
        )
    )
)

lori_m
Iron Contributor
Aug 19, 2023
Hi Peter, I must admit I had replied from my phone to the previous message without examining your file - looking more closely I see your formulas are running horizontally across the table. As Sergei implies, I think you'd need to transpose the Fibonacci setup and augment the MAP parameters to include column indices. I didn't check closely but perhaps a setup similar to below would fair better?

=LET( n, ROWS(data), m, COLUMNS(data), thunks, SCAN( LAMBDA(SEQUENCE(n, , 0, 0)), SEQUENCE(, m), LAMBDA(thunk,i, LET(acc, CHOOSECOLS(data, i) + thunk(), LAMBDA(acc))) ), MAP( IF(SEQUENCE(n), thunks), IF(SEQUENCE(, m), SEQUENCE(n)), LAMBDA(thunk,i, INDEX(thunk(), i)) ) )

SergeiBaklan we all learn off each other - I am just about keeping up with formula developments but not other areas the breadth of knowledge in your posts far exceeds mine (I have very little knowledge of OfficeScript, DAX, M, web, etc.) . If my post is hard to follow that is poor explanation on my part - it is the description of the concept and not any related language construct that is important. In the present situation 'list' works much the same as a thunk, the difference is only that one can include a position index if one chooses rather than leaving the argument empty.
SergeiBaklan
Diamond Contributor
Aug 19, 2023
PeterBartholomew1

Every time I read your posts I understand that my 200 English words vocabulary is not enough to understand the content properly.

Back to your performance tests. Not sure they are applied correctly to this Fibonacci case, but I'll try to play more with them.
SergeiBaklan
Diamond Contributor
Aug 19, 2023
lori_m

I see. Having no python or any other functional programming language background, I better understand array of thunks.

In general that's great that people with other background introduce their experience to poor Excel people. Functional programmers to Excel lambdas and OfficeScript, SQL people to DAX, etc. Even more great if such people have deep Excel background, as you are.
PeterBartholomew1
Silver Contributor
Aug 19, 2023
Hi Lori

I remember MMULT with the upper triangular matrix as being remarkably effective even for quite large problems. I just chickened out at having to explain the method to a non-mathematically inclined financial modeller as the answer to their corkscrew prayers!

The formulae in my spreadsheet are worse than the REDUCE/VSTACK row-based solutions you mention. I have taken a vertical slice through the data array and accumulated all four rows simultaneously. That requires 1000 applications of HSTACK. The advantage of such an approach is that it works even when the accumulated rows are not independent. In a time series financial model, it is possible that some accumulated arrays may depend upon more than one input row and need to be treated simultaneously.

An approach I finished with there was to write a Lambda function that advances the model one period (essentially providing the derivative with respect to time of the entire model). It should then be possible to place the Lambda function within SCAN to generate the entire timeseries model as an array of arrays.
lori_m
Iron Contributor
Aug 19, 2023
SergeiBaklan
The second List-based approach makes sense to me coming from a python background where thunks could be thought of as list generators. To get around the arrays of arrays limitation one could convert either to a list of arrays or an array of lists.

PeterBartholomew1
I think the fibonacci examples given here and your examples both suggest moving away from REDUCE / VSTACK row-based solutions. Recommended alternatives might be:
- TOCOL for sorting rows
- SCAN (possibly with thunks) / MMULT for long / short accumulations
e.g. pre-multiplying by a lower triangular matrix of ones seems fastest on your example data
PeterBartholomew1
Silver Contributor
Aug 18, 2023

lori_m wrote:
If lambda arguments are evaluated using LET in advance, performance can improve dramatically.
lori_m
That is something I have never managed to achieve! Since the Thunks are subject to Lazy evaluation, I have always finished by re-evaluating the entire array each time I wish to access an element of the array which is very expensive computationally. Actually performing the array calculation and then gathering the result as a Lambda array would appear to offer a great route forward.

I do not appear to have got savings from my attempt to replicate your approach as yet and seem to require further guidance. The tests I have tried with the first formula are based upon 4 rows of 1000 random values, which I sort or accumulate row by row. In each case my attempt to replicate your 1st formula performs much the same as REDUCE/HSTACK rather than accumulating or sorting the 2D array as a single hit.

The second formula looks promising applied to the Fibonacci series but I have yet to extract the ideas to apply them to other problems.
Lori - array of arrays.xlsx244 KB
SergeiBaklan
Diamond Contributor
Aug 18, 2023
lori_m

That's a great solution generating set of {next, prev} arrays with MMULT and map it on next only. I like it more than with list one, perhaps since it's more straightforward.
lori_m
Iron Contributor
Aug 18, 2023
mtarler
Your description looks pretty much spot on, I guess you are coming from vba or other imperative programming background so are converting to loops. In the functional world one only needs functions and values - even arrays can be treated as functions like in the 'list' function. In your MAP equivalent one could pass a second argument of column indices {1,2} to return both columns. BYROW was used for the single column case only because it removed an extra lambda.

A key point I hadn't fully appreciated before was that wrapping LAMBDA around an expression causes re-evaluation for each function call, this can cause a big slowdown for example within SCAN. If lambda arguments are evaluated using LET in advance, performance can improve dramatically.
mtarler
Silver Contributor
Aug 17, 2023
lori_mthis is awesome. I took a bit of time to study this solution as THUNKs in general have been a concept I have brushed with but don't feel I have fully grounded functional grasp of. So I keep a cheat cheat of various useful functions and links and notes and such and wrote this up. I hope a) to get feedback that my personal descriptions of what is happening is correct and maybe b) some help to others that may be struggling with this concept as well.
so concept of a THUNK function is basically using LAMBDA as a pointer to a value, array or function
here are 2 concepts using this for Fibonacci Sequence creation and 'bypassing' array of arrays issue in a faster way (from lori_m )
| =MAP(SCAN(
| LAMBDA({0, 1}),
| SEQUENCE(1000),
| LAMBDA(thunk, i, LET(fib, MMULT(thunk(), {1, 1; 1, 0}), LAMBDA(fib)))
| ),
| LAMBDA(thunk, INDEX(thunk(), 1)))
So the scan starts with and passes a LAMBDA pointer each loop and each case is a 1x2 array (new val, old val)
Then the MAP is passed all those LAMBDAs/pointers and returns a single value for each (index) and hence a valid output
| =LET(list, LAMBDA(array, LAMBDA(i, INDEX(array, i))),
| BYROW(
| 1,
| SCAN(
| list({0, 1}),
| SEQUENCE(1000),
| LAMBDA(a, i, LET(fib, MMULT(a(0), {1, 1; 1, 0}), list(fib)))
| )))
In this case "list" is defined as LAMBDA( x, LAMBDA( y, ...)) which allows one to call list(a) and return a LAMBDA/pointer
and it is this LAMBDA/pointer that is sent initially and each loop. The BYROW then applies the BYROW across all those LAMBDAs and since that LAMBDA in this case is a function INDEX(..., x) it will return the 1st index of each previously stored array
just as 'proof' of this concept the following is equivalent:
| =LET(list, LAMBDA(array, LAMBDA(i, INDEX(array, i))),
| MAP(
| SCAN(
| LAMBDA(i, INDEX({0,1},i)),
| SEQUENCE(1000),
| LAMBDA(a,i, LET(fib, MMULT(a(0), {1,1;1,0}), list(fib)))
| ),
| LAMBDA(x, x(1))))
so instead of using the built in BYROW helper this version uses the MAP and calls each of the stored LAMBDAs with the argument of 1
and finally as a question. If I'm not mistaken you are saying that this, which is a different work around to the array of arrays limitation of LAMBDAs, is significantly faster than the DROP(REDUCE(...LAMBDA(.. STACK())) approach. and can you elaborate on this as I see PeterBartholomew1 making many of these THUNK based solution which I find fascinating and neat in how they create this level of abstractness and even black box solution conceptually but it seems those solutions tended to fair poorer in terms of speeds.
lori_m
Iron Contributor
Aug 17, 2023
ecovonrein
Just an update regarding the earlier comment to include a couple more efficient fibonacci methods using SCAN. One way is to process the arrays as thunks,
=MAP( SCAN( LAMBDA({0, 1}), SEQUENCE(1000), LAMBDA(thunk, i, LET(fib, MMULT(thunk(), {1, 1; 1, 0}), LAMBDA(fib))) ), LAMBDA(thunk, INDEX(thunk(), 1)) )
another is to define a list function as a wrapper for the arrays,
=LET( list, LAMBDA(array, LAMBDA(i, INDEX(array, i))), BYROW( 1, SCAN( list({0, 1}), SEQUENCE(1000), LAMBDA(a, i, LET(fib, MMULT(a(0), {1, 1; 1, 0}), list(fib))) ) ) )
Both are very fast when compared to REDUCE/VSTACK, something that makes a big difference is to define a variable like 'fib' to store the intermediate results as a static array.
lori_m
Iron Contributor
Apr 23, 2023
mtarler
I'm glad you raised this point as I think there was a misunderstanding in the "corrected" version - probably due to being insufficiently clearly expressed on my part. The REDUCE example was actually designed to show that Fib(1024) could be calculated in just 10 steps by successively squaring the 2x2 matrix. The video David linked to extends this method to calculate arbitrary fibonacci numbers faster by using the binary representation of a number to decompose into powers of 2. Iterating the 2x2 matrix multiplication misses the point somewhat.
mtarler
Silver Contributor
Apr 22, 2023
davidleal so I did not read up on any of this but do have 2 questions:
a) Why are you using a [2x2] * [2x2] instead of [1x2] * [2x2] ? Basically aren't you using the matrix multiplication to hold your prior value so [a b] * [ 1,1; 1,0] => [a+b, a] and you don't need the second row?
b) Why are you then using INDEX( .... ,2, 1) which is essentially the value 1 calculation behind instead of SEQUENCE(n-2) and INDEX(... 1,1) or in my suggestion just INDEX(..., 1)
I don't see any time performance improvements using my crude timer but it just makes sense to me:
FIBO_MAT=LAMBDA(n, LET(a, {1,1;1,0}, IF(n>2, INDEX(REDUCE({1,1},SEQUENCE(n-2), LAMBDA(ac,_,MMULT(ac,a))),1), 1)))
lori_m
Iron Contributor
Apr 22, 2023
PeterBartholomew1
Yes, the wikipedia entry was quite long and I hadn't noticed it before. Regarding the SCAN version above, I recall seeing some examples from you applying string concatenation in a similar way, this one compared favourably against the other REDUCE/VSTACK approach (0.2s vs 0.7s for 1024 elements)

davidleal
If there were an efficient BigMul function available maybe we could make use of that kind of approach in Excel but otherwise i think we are limited by the precision.
davidleal
Iron Contributor
Apr 22, 2023
lori_m just a minor correction to your first formula using matrix exponentiation, in order to make it works:

FIBO_MAT=LAMBDA(n, LET(a, {1,1;1,0}, IF(n=1,1, INDEX(REDUCE(a,SEQUENCE(n-1),LAMBDA(ac,_,MMULT(ac,a))),2,1))))

Rather than the exponential matrix of the accumulator (ac), it is from the initial matrix (a), so I created a name for that. I haven't done any performance testing, but anyway I have seen this matrix multiplication can be simplified, calculating directly each element, so there is no need to do a matrix multiplication (resulting in O(log n)). I would say this would improve the performance. For example the following python code, uses this idea, but I don't know if it has an easy conversion to Excel (Fast Exponentiation)

def fib(n): v1, v2, v3 = 1, 1, 0 # initialise a matrix [[1,1],[1,0]] for rec in bin(n)[3:]: # perform fast exponentiation of the matrix (quickly raise it to the nth power) calc = v2*v2 v1, v2, v3 = v1*v1+calc, (v1+v3)*v2, calc+v3*v3 if rec=='1': v1, v2, v3 = v1+v2, v1, v2 return v2

I guess the logic is based on this mathematical explanation (https://youtu.be/EEb6JP3NXBI) and also https://www.oranlooney.com/post/fibonacci/ another python implementation using fast exponentiation with the mathematical justification.

Regards,

David
PeterBartholomew1
Silver Contributor
Apr 21, 2023
lori_m
I started by looking at
= F(m)*F(n) + F(m+1)*F(n+1)
setting m=1, and applying the basic recursion formula to F(n+1). As usual, once one has finished you know what to search for and can find the answer.
lori_m
Iron Contributor
Apr 21, 2023
PeterBartholomew1
I wasn't aware of that sum of squares identity! After some reflection, I think it can be derived from the 2x2 matrix representation which also provides a fast calc of larger Fibonacci numbers,
=INDEX(REDUCE({1,1;1,0},SEQUENCE(10),LAMBDA(a,_,MMULT(a,a))),2,1)
This matches davidleal's FIBO_SERIE_STR_LM2(1024) last entry to 15sf as well as this alternative:
Fib(n) =TEXTBEFORE( SCAN( "0,1", SEQUENCE(n), LAMBDA(a, _, TEXTAFTER(a, ",") & "," & BigAdd(TEXTBEFORE(a, ","), TEXTAFTER(a, ",")) ) ), "," )
PeterBartholomew1
Silver Contributor
Apr 15, 2023
lori_m
Hi Lori, by the time you got to the third formula there was the start of something familiar! I have no idea how you managed to dream that up as a strategy but I am at least in with a chance of following it!

On a different matter, I asked the the AI search engine in Edge what is significant about the sum of the squares of two adjacent Fibonacci numbers and it came back with a formula and the statement: "if you square two consecutive Fibonacci numbers and add them, you get another Fibonacci number that is twice as far in the sequence". Search engine technology seems to be progressing!
lori_m
Iron Contributor
Apr 15, 2023
PeterBartholomew1
I think the idea of replacing arrays by functions arose after attempting the FFT challenge you posted for general radix. One can't hold real and imaginary 2D arrays simultaneously, but instead one can add a lambda parameter to choose between the two arrays - a bit like adding a third dimension.

In the formula, the function argument of SCAN is given by:
LAMBDA(f,_, LAMBDA(i,i*f(0)+f(1)))
The idea here is to rewrite the pair (x, y), say, as a function: f(i) i=0,1. i.e.
f(i) <- i*f(0)+f(1), i = 0,1
or more explicitly
f(0),f(1) <- f(1),f(0)+f(1)

Similarly if we set as initial value:
f = LAMBDA(i,i), i = 0,1
then
f(0),f(1) <- 0,1
The initial value of i needs to be input which was a problem for an array of lambdas so I ended up passing to MAP instead,. No doubt there's numerous improvements to be made - I'd be interested in other ideas to so this kind of thing.
lori_m
Iron Contributor
Apr 15, 2023
davidleal In response to the points you raise...

Underscore _:
This is used to specify an unused variable in Python, I don't know if this convention is more widespread. See Fibonacci code further down this link
https://realpython.com/python-itertools/#recurrence-relations
underscore is used both within accumulate (equivalent to SCAN) and within the for loop

Case n=1:
If one decides to follow a strict convention then yes it makes sense to add n=1 case. On the other hand if Fibonacci is considered more as a playground of ideas (as per Sergei's comment) such details are of less significance and inserting extra conditions may distract from the essence of a chosen approach. In any case since the sequence requires a pair of initial values, either (0,1) or (1,1). any of 0, 1 or #error may be considered possible conventions for n=1.
PeterBartholomew1
Silver Contributor
Apr 15, 2023
SergeiBaklan
Maybe you understand lori_m 's formula but I am still struggling! I even tried pulling the formula apart using local names but, though the formula still works, there has not been a huge leap forward in terms of its transparency.
= LET( identityλ, LAMBDA(i,i), k, SEQUENCE(10), fibArrλ, SCAN(identityλ, k, LAMBDA(fnλ,_, LAMBDA(i, i*fnλ(0)+fnλ(1) )) ), MAP(fibArrλ, LAMBDA(fibλ,fibλ(0)) ) )
davidleal
Iron Contributor
Apr 15, 2023
lori_m this is an interesting idea, I was trying to test it, we need to treat the special case of n=1, because it provides the wrong result {1;1}, but it is an easy fix:

Fib=LAMBDA(n,IF(n=1, 1,REDUCE(1,SEQUENCE(n-1), LAMBDA(b,_,VSTACK(b,SUM(TAKE(b,-2)))))));

I don't fully understand the underscore notation (_) I google it, but I was not able to find any reference. Would you elaborate it more or share some link? thanks. My intuition, from other languages, it is a dummy variable, or unnamed variable, because it is not really used for other purpose than to define the number of iterations, but their values are not really used.

I modified it to use BIGADD. I incorporated mtarler's fix:

BigAdd = LAMBDA(A, B, [c], let(cc, if(isomitted(c),0,c), if((len(a)>14)+(len(b)>14), let(lowadd, right(a,14)+right(b,14)+cc, ccc, if(len(lowadd)>14,1,0), aaa, if(len(a)>14,left(a,len(a)-14),0), bbb, if(len(b)>14,left(b,len(b)-14),0), bigadd(aaa,bbb,ccc) & text(right(lowadd,14),"00000000000000")), (a+b+cc)&"")));

Here is a possible solution for large Fibonacci numbers, I think we need to separate the cases n=1, n=2 from the iteration process to make it works and then to generate the sequence from n-2:

FIBO_SERIE_STR_LM2=LAMBDA(n,IF(n<3, SEQUENCE(n)^0&"", REDUCE({"1";"1"},SEQUENCE(n-2), LAMBDA(b,_,VSTACK(b,BIGADd(@TAKE(b,-1), @CHOOSEROWS(b,-2)))))));

It works as expected. In terms of performance about the same as your previous formula using recursion. Which is expected since this approach is similar to mine and similar to mtarler, but your is more concise and invokes less functions, I have an extra HSTACK and INDEX for example, so I would expect that for larger dataset your approach will show some performance improvements.

David

Forum Discussion

Lambda Example: Generate Fibonacci series

Resources