Lambda Example: Generate Fibonacci series

Brass Contributor

In this post, I would like to explain how I have used Lambda to create a function to generate a Fibonacci series array.

 

This example can also be used to understand how to create an array where the value of an element in the array depends on previous elements in the same array.

 

Generating Fibonacci as a dynamic array by @VizGenerating Fibonacci as a dynamic array by @Viz

 

This function has been created using three function in two layers. The inner layer functions include the following:

InFib: This function generates the Nth Fibonacci number

InFibSer: This function generates the entire Fibonacci series up to the Nth number

 

The two functions mentioned above require arguments that are complicated and less intuitive. Therefore, we use an outer layer function (FibSeries) that generates the necessary argument and calls the two functions to generate the series.

 

=InFIB

'This function returns nth Fibonacci using tail recursion
'It takes three arguments n, a=0 , b=1 as argument and returns the nth Fibonacci value

 

 

=LAMBDA(n,a,b,IF(n=1,a,IF(n=2,b,Infib(n-1,b,a+b))))
/* The value for the arguments a and b should be passed as 0 and 1, respectively. */

 

 

 

Example:

=InFIB(10,0,1) would generate 34 as the value

 

Our objective, here, is not to obtain a single value but to generate an entire array. However, here is where we have a challenge: While it is easier to manipulate an existing array, it is far more to difficult to create a new array (other than sequence array) or extend an existing array.

 

Therefore, in order to create a Fibonacci series, we are going to pass a dummy array of n values. This array will have 0 as the first element and 1 as the value for all the other elements.

 

The following image portrays the recursive addition process that converts that the initial array into a Fibonacci series.

 

Fibonacci algo illustration.png

 

The following code does carries out the process recursively till the maximum value of the array equals the nth Fibonacci value.

 

=INFIBSER

'This inner function takes three arguments and returns a Fibonacci series
'The three arguments are nth Fibonacci value, the initial array and a sequence array (sequential numbers from 1 to n)
'Note, the sequence array can instead be generated inside the function but it may inefficient

 

 

=Lambda(NthFib, Initval, Sqn,
	Let(
	    Maxval,Max(Initval),
	    If(Maxval = NthFib,
		Initval,
		Let(
		    NewVal,If(Maxval=1,Initval+1*(Sqn>=4),
		              Let(
			          StPos,Match(Maxval,Initval,0)+1,
			          adval,Large(unique(Initval),2),
			          Initval+adval*(Sqn>=StPos)
		                  )
			     ),
		    InFibser(NthFib,NewVal,Sqn)
		)
	)
)
 )

 

 

 

In the first iteration, the above function adds one to all the values from the fourth position onwards. For the subsequent iterations it adds the second largest unique value to all values that all 2 places after the position of such second largest value. The above function uses the match function to identify the position pointer.

 

Although the above function can generate the Fibonacci series, as mentioned earlier, it requires arguments that are less intuitive and which requires some set up.

 

Therefore, here we have an outer layer function that requires only one argument, n. It creates the necessary argument to be passed to the inner layer function and calls them.

 

=FIBSERIES

'This outermost function generates an initial array, a sequence array, and nth fibonnaci value
'It then calls the inner functions with these parameters 

 

 

 

=Lambda(n,
	Let(
		Sqn,sequence(n),
		initval,sign(Sqn-1),
		Nthfib,Infib(n,0,1),
		InFibser(Nthfib, Initval, Sqn)
	)
 )

 

 

 

This final function can be used by the user to generate the Fibonacci series by merely passing the number of values required as argument.

 

Hope you found this useful. I would love to know if you think there is a way to improve the algorithm, here. I am especially looking for suggestions that could replace the match / xmatch in the InFibSer function to get the position pointer. Since the positions move sequentially, I believe a mathematical solution may exist.

 

Thank you

82 Replies

@lori_m How it happened that we started with Fibonacci number and ended up with Fast Fourier Transformation?, :) Very interesting. I am taking a look at it. Complex solution it requires several LAMBDA functions to get there and probably some additional explanation would be needed to fully understand it. 

 

I see you use BASE function, but it has a limit (no larger than 2^53). I am wondering if this limits the size of the Fibo number to get. I see also ASC function (this is the first time I see this formula) what is this function used for in this context? Why is needed? Reading the documentation is related to languages that uses double-byte characters.

 

In terms of performance, not a difference with the BidAdd approach, I ran it several times for FIBO(1221), and the worse case scenario was 110ms, so really great performance considering it has more complexity than the BidAdd approach. 

 

Both solutions reach the maximum Excel computational capacity for 1221 Fibo number. Starting from 1222 both solutions return an empty result.

 

So I would say both solutions have the same performance and the same maximum Fibo number, since BigAdd is a simple formulation I would consider this approach as the best one so far for big fibo numbers.

 

Thanks in advance,

 

David

Interesting - that's not the case for me. On my set up [Current Channel (Preview) 16327.20200] entering =FIB(1222) returns:


1079969233398873236032844293301653524976362936180771647488810146258428573892502087348778468320557178381011519862132501167447528535147702705724587018435771357806213382154720957836431378302535456607039572026816018665428571946697730583021094317239872427815311

 

Exactly the same as entering fib(1222) into https://www.wolframalpha.com/

FIB(10000) took a few seconds to return the required result and also matched exactly. This method is specifically for evaluating large Fibonacci numbers not reachable by the BigAdd method - it won't return all numbers in the series.

 

I've not tested it exhaustively, and there may well be some edge cases so it's good to get the feedback. Can anyone else check the previous attachment and confirm if FIB(1222) is also an issue on their set up?

@lori_m 

I must confess, I wouldn't have thought of a convolution to perform multiplication!  Convolution appears to be even more versatile than I had thought.  The carry step would appear to require a reverse scan (or storing the number from least to most significant digit).  

@lori_mI tested it with Excel for Web (free version), I haven't tried with my work computer that comes with Excel Desktop (I cannot use AFE Add-ins, that is why I prefer to test on the free version online).

 

CORRECTION: The limit is reached with the function I used to calculate the performance (gist from Andrew D Gordon from AFE Add-ins). Now I tested it without using this function, and results now look better:

  • FIB (lor_m using convolution) for 10K Fibo number: 1,150 ms
  • FIBO_STR (using BigAdd) for 10K Fibo number.: 12,510 ms

so FIB solution is about 10x faster!!!, not just that, the duration is not linear, so it is very efficient, for example for FIBO 100K it takes around 2,000 ms. Obviously the number cannot be represented in Excel entirely, but it returns a result. Python is not even able to compute this number. 

 

Great solution @lori_m finally we have an Excel efficient solution for large Fibonacci numbers!

 

Given the limitations we have with Excel precision for large Fibonacci numbers it is worth to consider this approach that uses Excel Javascript API, with the help of Microsoft Garage Project: Script Lab Add-ins.

 

Here a javascript algorithm that uses Matrix Exponentiation by squaring (something we discussed in this post also trying to implement it in Excel):

 

/**
 * @customfunction
 *  {number} n The nth Fibonacci number to be calculated
 * @returns {string} The Fibonacci number
 */
function fib(n) {
  function rec(n) {
    if (n == 0)
      return [0n, 1n];
    else {
      const [a, b] = rec(Math.floor(n / 2));
      const c = a * (b * 2n - a);
      const d = a * a + b * b;
      if (n % 2 == 0)
        return [c, d];
      else
        return [d, c + d];
    }
  }
  if (n < 0)
    throw RangeError("Negative arguments not implemented");
  return rec(n)[0].toString();
}

 

Please keep in mind that @tags in the comment section are required to register properly the custom function. The calculation is carried out with BigInt precision (n suffix in the numbers used) otherwise we would have the same limitation Excel has loosing precision. Working with BigInt numbers, there is no precision limitation. Finally, we convert the result into a string so the output will be as text data type.

 

If the function was defined, for example in the LIB Snippet, then you can invoke it as follows:

 

=SCRIPTLAB.LIB.FIB(10000)

 

 

It returns the correct result in less than 2 secs! I tested also for 100K Fibonacci number with similar execution time, so it a scalable solution.

 

Here is the output for Fibonacci number 100:

davidleal_0-1684014544146.png

 

 

 

@Peter Bartholomew 

I borrowed the non-recursive Convolve function you shared before and made a few other adjustments which improved performance to sub-second for Fib(10k) and several seconds for Fib(100k). Results can also be output in a grid means we are not limited by character length in a cell. The goal I have in the back of my mind is to build a BigInt namespace including operations for arbitrary length addition, multiplication and exponentiation along the lines of the @davidleal JS sample above.  

 

The code for generating Fibonacci pairs follows the reverse SCAN method you suggest as below - digits are grouped into fives for greater efficiency.  The FFT multiplication method is detailed in Multiplication algorithm - Wikipedia (Schönhage–Strassen section)

 

 

 

=LET(
    a, TOCOL(CHOOSEROWS(acc, 1)),
    b, TOCOL(CHOOSEROWS(acc, 2)),
    r, 2 * ROWS(b) - 1,
    aa, Convolveλ(a, a, r),
    bb, Convolveλ(b, b, r),
    ab, Convolveλ(a, b, r),
    fib, MMULT(HSTACK(aa, bb, ab), IF(bin, {0, 1; 1, 2; 2, 2}, {1, 0; 1, 1; 0, 2})),
    carry, MOD(
        SCAN(
            0,
            TRANSPOSE(TAKE(fib, MATCH(2, 1 / CHOOSECOLS(fib, 2)) + 3)),
            LAMBDA(acc, nums, nums + QUOTIENT(acc, 10 ^ 5))
        ),
        10 ^ 5
    ),
    carry
)

 

 

 

 

@lori_m 

Mind blown!  As someone that struggles to remember 6 digits, and given that one row of your number is sufficient to index every atomic particle in the universe, the idea of a number 20899 digits long is beyond my comprehension.  I wouldn't have managed to frame the question, never mind answer it!

 

Something that might save time, is I think it is possible to omit the bit reversal steps.  The transformation will contain the right numbers but not the right sequence.  Once the product and inverse transformation is performed, I believe the issue is sorted.

 

 

@Peter Bartholomew 

I think the far harder part of the calculation is the FFT convolution method - though I would also have struggled to create a recursive carry function like that of @mtarler. Being more familiar with array operations, I found considering numbers as vectors more intuitive. I took the bit reversal method from the SO post. It allows one to quickly calculate matrix powers using a "shift and add" approach:

MPOWER
=LAMBDA(M, n,
    REDUCE(
        MUNIT(ROWS(M)),
        --MID(BASE(n, 2), SEQUENCE(1 + LOG(n, 2)), 1),
        LAMBDA(A, b,
            IF(b, MMULT(M, MMULT(A, A)), MMULT(A, A))
        )
    )
)

For Fibonacci numbers one can set M={1,1;1,0} and return the (2,1) element. The previous method uses convolution in place of MMULT to return larger numbers - as you say, perhaps there are further improvements to be made? One thing I realise is an extra convolution could be saved by writing as Convolveλ(b, 2*a+b, r)*{1,1}-Convolveλ(a, 2*b-a, r)*{1,0} where a and b are equally sized.

@ecovonrein 

Just an update regarding the earlier comment to include a couple more efficient fibonacci methods using SCAN. One way is to process the arrays as thunks,

=MAP(
    SCAN(
        LAMBDA({0, 1}),
        SEQUENCE(1000),
        LAMBDA(thunk, i, LET(fib, MMULT(thunk(), {1, 1; 1, 0}), LAMBDA(fib)))
    ),
    LAMBDA(thunk, INDEX(thunk(), 1))
)

another is to define a list function as a wrapper for the arrays,

=LET(
    list, LAMBDA(array, LAMBDA(i, INDEX(array, i))),
    BYROW(
        1,
        SCAN(
            list({0, 1}),
            SEQUENCE(1000),
            LAMBDA(a, i, LET(fib, MMULT(a(0), {1, 1; 1, 0}), list(fib)))
        )
    )
)

Both are very fast when compared to REDUCE/VSTACK, something that makes a big difference is to define a variable like 'fib' to store the intermediate results as a static array.

@lori_mthis is awesome. I took a bit of time to study this solution as THUNKs in general have been a concept I have brushed with but don't feel I have fully grounded functional grasp of. So I keep a cheat cheat of various useful functions and links and notes and such and wrote this up. I hope a) to get feedback that my personal descriptions of what is happening is correct and maybe b) some help to others that may be struggling with this concept as well.
so concept of a THUNK function is basically using LAMBDA as a pointer to a value, array or function
here are 2 concepts using this for Fibonacci Sequence creation and 'bypassing' array of arrays issue in a faster way (from @lori_m )
| =MAP(SCAN(
| LAMBDA({0, 1}),
| SEQUENCE(1000),
| LAMBDA(thunk, i, LET(fib, MMULT(thunk(), {1, 1; 1, 0}), LAMBDA(fib)))
| ),
| LAMBDA(thunk, INDEX(thunk(), 1)))
So the scan starts with and passes a LAMBDA pointer each loop and each case is a 1x2 array (new val, old val)
Then the MAP is passed all those LAMBDAs/pointers and returns a single value for each (index) and hence a valid output
| =LET(list, LAMBDA(array, LAMBDA(i, INDEX(array, i))),
| BYROW(
| 1,
| SCAN(
| list({0, 1}),
| SEQUENCE(1000),
| LAMBDA(a, i, LET(fib, MMULT(a(0), {1, 1; 1, 0}), list(fib)))
| )))
In this case "list" is defined as LAMBDA( x, LAMBDA( y, ...)) which allows one to call list(a) and return a LAMBDA/pointer
and it is this LAMBDA/pointer that is sent initially and each loop. The BYROW then applies the BYROW across all those LAMBDAs and since that LAMBDA in this case is a function INDEX(..., x) it will return the 1st index of each previously stored array
just as 'proof' of this concept the following is equivalent:
| =LET(list, LAMBDA(array, LAMBDA(i, INDEX(array, i))),
| MAP(
| SCAN(
| LAMBDA(i, INDEX({0,1},i)),
| SEQUENCE(1000),
| LAMBDA(a,i, LET(fib, MMULT(a(0), {1,1;1,0}), list(fib)))
| ),
| LAMBDA(x, x(1))))
so instead of using the built in BYROW helper this version uses the MAP and calls each of the stored LAMBDAs with the argument of 1
and finally as a question.  If I'm not mistaken you are saying that this, which is a different work around to the array of arrays limitation of LAMBDAs, is significantly faster than the DROP(REDUCE(...LAMBDA(.. STACK())) approach.  and can you elaborate on this as I see @Peter Bartholomew making many of these THUNK based solution which I find fascinating and neat in how they create this level of abstractness and even black box solution conceptually but it seems those solutions tended to fair poorer in terms of speeds.

@mtarler 

Your description looks pretty much spot on, I guess you are coming from vba or other imperative programming  background so are converting to loops. In the functional world one only needs functions and values - even arrays can be treated as functions like in the 'list' function. In your MAP equivalent one could pass a second argument of column indices {1,2} to return both columns.  BYROW was used for the single column case only because it removed an extra lambda.

A key point I hadn't fully appreciated before was that wrapping LAMBDA around an expression causes re-evaluation for each function call, this can cause a big slowdown for example within SCAN. If lambda arguments are evaluated using LET in advance, performance can improve dramatically.

@lori_m 

That's a great solution generating set of {next, prev} arrays with MMULT and map it on next only. I like it more than with list one, perhaps since it's more straightforward. 

 


@lori_m wrote:

If lambda arguments are evaluated using LET in advance, performance can improve dramatically.


@lori_m 

That is something I have never managed to achieve!  Since the Thunks are subject to Lazy evaluation, I have always finished by re-evaluating the entire array each time I wish to access an element of the array which is very expensive computationally.  Actually performing the array calculation and then gathering the result as a Lambda array would appear to offer a great route forward.

 

I do not appear to have got savings from my attempt to replicate your approach as yet and seem to require further guidance.  The tests I have tried with the first formula are based upon 4 rows of 1000 random values, which I sort or accumulate row by row.  In each case my attempt to replicate your 1st formula performs much the same as REDUCE/HSTACK rather than accumulating or sorting the 2D array as a single hit.

 

The second formula looks promising applied to the Fibonacci series but I have yet to extract the ideas to apply them to other problems.

@Sergei Baklan 

The second List-based approach makes sense to me coming from a python background where thunks could be thought of as list generators. To get around the arrays of arrays limitation one could convert either to a list of arrays or an array of lists.

@Peter Bartholomew 
I think the fibonacci examples given here and your examples both suggest moving away from REDUCE / VSTACK row-based solutions. Recommended alternatives might be:
- TOCOL for sorting rows
- SCAN (possibly with thunks) / MMULT for long / short accumulations
e.g. pre-multiplying by a lower triangular matrix of ones seems fastest on your example data

Hi Lori

 

I remember MMULT with the upper triangular matrix as being remarkably effective even for quite large problems.  I just chickened out at having to explain the method to a non-mathematically inclined financial modeller as the answer to their corkscrew prayers!

 

The formulae in my spreadsheet are worse than the REDUCE/VSTACK row-based solutions you mention.  I have taken a vertical slice through the data array and accumulated all four rows simultaneously.  That requires 1000 applications of HSTACK.  The advantage of such an approach is that it works even when the accumulated rows are not independent.  In a time series financial model, it is possible that some accumulated arrays may depend upon more than one input row and need to be treated simultaneously.  

 

An approach I finished with there was to write a Lambda function that advances the model one period (essentially providing the derivative with respect to time of the entire model).  It should then be possible to place the Lambda function within SCAN to generate the entire timeseries model as an array of arrays.

 

@lori_m 

I see. Having no python or any other functional programming language background, I better understand array of thunks.

In general that's great that people with other background introduce their experience to poor Excel people. Functional programmers to Excel lambdas and OfficeScript, SQL people to DAX, etc. Even more great if such people have deep Excel background, as you are.

@Peter Bartholomew 

Every time I read your posts I understand that my 200 English words vocabulary is not enough to understand the content properly. 

Back to your performance tests. Not sure they are applied correctly to this Fibonacci case, but I'll try to play more with them.

Hi Peter, I must admit I had replied from my phone to the previous message without examining your file - looking more closely I see your formulas are running horizontally across the table. As Sergei implies, I think you'd need to transpose the Fibonacci setup and augment the MAP parameters to include column indices. I didn't check closely but perhaps a setup similar to below would fair better?

 

 

 

=LET(
    n, ROWS(data),
    m, COLUMNS(data),
    thunks, SCAN(
        LAMBDA(SEQUENCE(n, , 0, 0)),
        SEQUENCE(, m),
        LAMBDA(thunk,i, LET(acc, CHOOSECOLS(data, i) + thunk(), LAMBDA(acc)))
    ),
    MAP(
        IF(SEQUENCE(n), thunks),
        IF(SEQUENCE(, m), SEQUENCE(n)),
        LAMBDA(thunk,i, INDEX(thunk(), i))
    )
)

 

 

 

@Sergei Baklan we all learn off each other - I am just about keeping up with formula developments but not other areas the breadth of knowledge in your posts far exceeds mine (I have very little knowledge of OfficeScript, DAX, M, web, etc.) . If my post is hard to follow that is poor explanation on my part - it is the description of the concept and not any related language construct that is important. In the present situation 'list' works much the same as a thunk, the difference is only that one can include a position index if one chooses rather than leaving the argument empty.

Playing with the concept a bit more, it could be worth defining a generalised SCAN function that applies to both types of setup. For Fibonacci sequence,

=SCANLIST({0,1},SEQUENCE(1000),LAMBDA(acc,i,MMULT(acc,{1,1;1,0})))

or for horizontal accumulations

=SCANLIST(SEQUENCE(ROWS(data),,0,0),SEQUENCE(, COLUMNS(data)),LAMBDA(acc,i,CHOOSECOLS(data, i) + acc))

and the definition can be hidden from view if desired:

SCANLIST=LAMBDA(initial_value, array, function,
    LET(
        list, LAMBDA(vector, LAMBDA([i], INDEX(vector, i))),
        indices, SEQUENCE(
            ROWS(initial_value),
            COLUMNS(initial_value)
        ),
        acclist, SCAN(
            LAMBDA(initial_value),
            array,
            LAMBDA(acc, i, list(function(acc(), i)))
        ),
        MAP(
            IF(indices, acclist),
            IF(array, indices),
            LAMBDA(acc, i, acc(i))
        )
    )
)

 

@lori_m 

I have run a number of tests summing 4 columns by various methods including picking up and dusting down a recursive bisection approach that I wrote before Lambda helper functions and array shaping functions came out.  I have included a Thunk solution with your MAP concept.  I am not absolutely sure I have got the code right but the performance is acceptable.

 

Now it may be possible to do some of these things in Python but that is some way off for me!

image.png