Forum Discussion

VijayVardhan's avatar
VijayVardhan
Copper Contributor
Dec 03, 2023

THUNKS

http://www.myonlinetraininghub.com/excel-formulas-to-summarise-monthly-data-into-quarters

Above is the link to reference article about which I have some questions that I am posting below.

 

SumQtrsSergei.xlsx

THUNKS

1] For multi-row and multi-column input to INDEX function, giving row no and column no as input will output single cell value. For input of column no as zero, entire row will be given as output. For single column and multi-row input, giving column no as either 0 or 1 or not giving any column no does not matters. It will give output value corresponding to the row no. Despite sumsArray being a single column and multi-row array of error cells, giving 0 or not giving column no to INDEX function throws #VALUE! error as output. Column no input must be 1. Why is that?

 

2] INDEX (sumsArray, 4, 1)(). INDEX function extracts cell values only after placing empty bracket in the syntax not otherwise. This is very bizarre. What’s the logic behind this?

 

3] sumsArray:

a) Single LAMBDA throws error in single cell however it is still holding all the values with that single cell of error.

b) Double LAMBDA throws error in multi-row single column. No of rows = No of rows of data. Each error cell is now holding data/values of i] Corresponding to that row no and column no = 1 and ii] Entire row. This double lambda concept is new to me and I would like to know about its functioning, logic, purpose and application.

c) Why second or inner LAMBDA has no declaration of variable v inside it?

d) sumsArray results into a column of error cells. How can a formula extract a non-error output of error cells or using error cells as input?

 

4] sumsB: Declaration of variable v inside second or inner LAMBDA of sumsArray throws error output. Why is that despite declaration of variable is in the standard syntax of LAMBDA?

-----------------------------------------------------------------------------------------------

Double LAMBDA + No () for INDEX: Gives column of #CALC! error but each of these error cells either 1) Do not pack the cell values of the row that corresponds to the respective error cell of the column or 2) They do pack or hold them but does not reveal them that is output them after giving input of row and column no of a cell in the array to INDEX function.

 

Double LAMBDA + () in the INDEX: It reveals/outputs cell value of the cell in the grid/array after giving input of row and column no of desired cell to INDEX function but there is a prerequisite that you must first input (n, 1) [where n = row no] to INDEX function so that column no becomes 1 which represents column no of error cell (CEC) output of Double LAMBDA formula. So you must give column no = 1 first and then row no as n to select corresponding error cell of CEC which packs/holds all cells with their values in that row. Then next after this one row output, you input (1, m) to second INDEX function to select that row with row no = 1 that is the row itself (Error cell with row no n from CEC) and m = column no of cell to select cell in the row whose value you want by giving its column no m.

-----------------------------------------------------------------------------------------------

MAKEARRAY1

If MMULT (data, --(TRANSPOSE(MonthsInQuarters)=quarters)) can deliver quarterly sum for each row then why you need MAKEARRAY, LAMBDA, INDEX, BYROW and all that fuss to get this?

-----------------------------------------------------------------------------------------------

MAKEARRAY2

If we include a cell address of cell containing text in sum function along with cell address of other cells containing numbers then it delivers the sum of all numbers without giving any error by ignoring cells which contain text. Cell address of cell containing text can be either specified individually or be included in a range of cells containing numbers. If on the other hand if we add a text in the function like sum(“TEXT”, A1:A5) where A1:A5 contain numbers then the function throws error. Why it doesn’t throw error if we input text value through cell address instead of directly as text string in double quotation?

=======================================================

SumQtrsPeter.xlsx

EXPAND function turns blank into 0 when delivering output directly in the cells on sheet whereas keeping blank as blank when delivering output to another function. Why is this difference?

=======================================================

It will be very nice if someone can answer atleast some of these questions.. I have same two files that are available in the tutorial but containing each step that I have unfolded to understand how each of these formulas work. I have attached the files. 

SergeiBaklan

@Peter Bartholomew

    • VijayVardhan's avatar
      VijayVardhan
      Copper Contributor
      I have attached the files and I have added each step in the sheets that unfold the formulas one by one to get the hang of it.
      • peiyezhu's avatar
        peiyezhu
        Bronze Contributor

        VijayVardhan 

         

        Thanks for your response.

        I am afraid I can not provide more information because LAMBDA only available on o365 PC version but I have no PC on hand.

         

        I guess if you transpose the raw data layout from multiple columns to multiple raws as :

         

        date item quantity

        .

         

        You can sum them by pivottable or sql group by quarter.

         

         

    • PeterBartholomew1's avatar
      PeterBartholomew1
      Silver Contributor

      peiyezhu 

      I believe the url I provided is the correct version of the one intended by the OP.  The files can then be downloaded from links within Mynda 's blog.

  • VijayVardhan 

    THUNKS

    1]  To an extent the answer is 'because that is the way it was implemented'.  In normal use, INDEX is applied to a range and the result is also a range object, be it a row, column or cell.  Once the function is applied to an array, the rules change.  To return a multi-element array, the indices need to be explicit, for example

    = INDEX(array, {3;2;1}, 5)

    Now Lambda functions are first class objects that may be included as parameters or returned as the result of a formula, the rules seem to have changed again and both row and column parameters are required even if one has to be 1.

     

    2] Moving to the next issue,

    = INDEX(sumsArray, 4, 1)

    is valid, but it returns a Lambda function (specifically a thunk) and not an array of values.  Lambda functions are only evaluated when the necessary parameters are provided.  In the case of the Thunk, no parameters are required but the evaluation is then triggered by the null parameter string:

    = INDEX(sumsArray, 4, 1)()

    You could even split the calculation in to two parts

    = LET(
        arrϑ, INDEX(sumsArray, 4, 1),
        arrϑ()
      )

    if you so chose.

     

    The main limitation that is being addressed by the use of thunks is that (as currently implemented) Lambda helper functions are not capable of returning arrays of arrays without contorted workarounds.  Given that I would define the modern spreadsheet as an environment for manipulating multi-dimensional arrays as 2D arrays of arrays, this is an appalling limitation.  Most of the formulas I write require the array of arrays, quite often at multiple points within a given formula.

     

    Note: Traditional spreadsheets approach the problem differently by working from the cell upwards.  Arrays are displayed by writing 10s of 1000s of individual scalar formulas that reproduce the appearance of the result array without any explicit recognition given to the structure of the data being manipulated.

     

    • VijayVardhan's avatar
      VijayVardhan
      Copper Contributor

      Do you mean that if Lambda function is included in the array then one of the parameter that is either the row or the column argument of the INDEX function must be 1 to produce the output without throwing any error?

      Is this the double Lambda setup is called as thunk or is it the double Lambda's wrapped inside BYROW function is called as thunk? Can you explain more about this thunk concept as without understanding of which it's difficult to understand the answers of the questions and to get the hang of the formula.

      How is that the formula INDEX(sumsArray, 4, 1) returns a Lambda function and not an array of values consequently requiring a null parameter string? I see that sumsArray = Double Lambda's wrapped inside BYROW function and BYROW function fulfills the parameter passing requirement of Lambda function by feeding 'data' as parameter to the double Lambda's.

      • PeterBartholomew1's avatar
        PeterBartholomew1
        Silver Contributor

        VijayVardhan

        I regard a Thunk as a special case of a named Lambda function that requires no parameters but returns a predefined array when called with an empty parameter string.  A single Thunk can be created by including the array as the final (and only) parameter to LAMBDA.  It is also possible to pass the array as a parameter by using a further LAMBDA.

         

        Thunkλ
        = LAMBDA(x, LAMBDA(x))

         

        This may not look useful but it does allow and array of Thunks to be built using a Lambda helper function such as BYROW.

         

        Why might one use an array of Thunks?  An example might be to create an array of range references, something that Excel will not allow.  Instead, one may use an array of Thunks, anyone of which may be expanded to return a range reference.  For example

         

        tablesϑ
        = VSTACK(
              Thunkλ(Table1), 
              Thunkλ(Table2), 
              Thunkλ(Table3)
          )

         

        represents an array of 3 elements each of which is a thunk.  Where things get interesting, is that the array can be used as an argument in a normal Excel function such as XLOOKUP.  Thus

         

        = LET(
            selectedTblϑ, XLOOKUP(selectYr, years, tablesϑ),
            selectedTblϑ()
          )

         

         The first line of code returns a single thunk and the second expands it to return a range reference to the data within a table which may be on any sheet of the workbook.

         

        Alternatively

         

        = REDUCE(headers, tablesϑ,
            LAMBDA(acc,tblϑ, VSTACK(acc, tblϑ()))
          )

         

        will stack the contents of all three tables vertically as a single dynamic array.

        I don't know whether this helps, but it may give a flavour of what is possible!

Resources