Dec 03 2023 07:13 AM - edited Jan 05 2024 10:22 AM
http://www.myonlinetraininghub.com/excel-formulas-to-summarise-monthly-data-into-quarters
Above is the link to reference article about which I have some questions that I am posting below.
SumQtrsSergei.xlsx
THUNKS
1] For multi-row and multi-column input to INDEX function, giving row no and column no as input will output single cell value. For input of column no as zero, entire row will be given as output. For single column and multi-row input, giving column no as either 0 or 1 or not giving any column no does not matters. It will give output value corresponding to the row no. Despite sumsArray being a single column and multi-row array of error cells, giving 0 or not giving column no to INDEX function throws #VALUE! error as output. Column no input must be 1. Why is that?
2] INDEX (sumsArray, 4, 1)(). INDEX function extracts cell values only after placing empty bracket in the syntax not otherwise. This is very bizarre. What’s the logic behind this?
3] sumsArray:
a) Single LAMBDA throws error in single cell however it is still holding all the values with that single cell of error.
b) Double LAMBDA throws error in multi-row single column. No of rows = No of rows of data. Each error cell is now holding data/values of i] Corresponding to that row no and column no = 1 and ii] Entire row. This double lambda concept is new to me and I would like to know about its functioning, logic, purpose and application.
c) Why second or inner LAMBDA has no declaration of variable v inside it?
d) sumsArray results into a column of error cells. How can a formula extract a non-error output of error cells or using error cells as input?
4] sumsB: Declaration of variable v inside second or inner LAMBDA of sumsArray throws error output. Why is that despite declaration of variable is in the standard syntax of LAMBDA?
-----------------------------------------------------------------------------------------------
Double LAMBDA + No () for INDEX: Gives column of #CALC! error but each of these error cells either 1) Do not pack the cell values of the row that corresponds to the respective error cell of the column or 2) They do pack or hold them but does not reveal them that is output them after giving input of row and column no of a cell in the array to INDEX function.
Double LAMBDA + () in the INDEX: It reveals/outputs cell value of the cell in the grid/array after giving input of row and column no of desired cell to INDEX function but there is a prerequisite that you must first input (n, 1) [where n = row no] to INDEX function so that column no becomes 1 which represents column no of error cell (CEC) output of Double LAMBDA formula. So you must give column no = 1 first and then row no as n to select corresponding error cell of CEC which packs/holds all cells with their values in that row. Then next after this one row output, you input (1, m) to second INDEX function to select that row with row no = 1 that is the row itself (Error cell with row no n from CEC) and m = column no of cell to select cell in the row whose value you want by giving its column no m.
-----------------------------------------------------------------------------------------------
MAKEARRAY1
If MMULT (data, --(TRANSPOSE(MonthsInQuarters)=quarters)) can deliver quarterly sum for each row then why you need MAKEARRAY, LAMBDA, INDEX, BYROW and all that fuss to get this?
-----------------------------------------------------------------------------------------------
MAKEARRAY2
If we include a cell address of cell containing text in sum function along with cell address of other cells containing numbers then it delivers the sum of all numbers without giving any error by ignoring cells which contain text. Cell address of cell containing text can be either specified individually or be included in a range of cells containing numbers. If on the other hand if we add a text in the function like sum(“TEXT”, A1:A5) where A1:A5 contain numbers then the function throws error. Why it doesn’t throw error if we input text value through cell address instead of directly as text string in double quotation?
=======================================================
SumQtrsPeter.xlsx
EXPAND function turns blank into 0 when delivering output directly in the cells on sheet whereas keeping blank as blank when delivering output to another function. Why is this difference?
=======================================================
It will be very nice if someone can answer atleast some of these questions.. I have same two files that are available in the tutorial but containing each step that I have unfolded to understand how each of these formulas work. I have attached the files.
@Peter Bartholomew
Dec 03 2023 11:56 AM
THUNKS
1] To an extent the answer is 'because that is the way it was implemented'. In normal use, INDEX is applied to a range and the result is also a range object, be it a row, column or cell. Once the function is applied to an array, the rules change. To return a multi-element array, the indices need to be explicit, for example
= INDEX(array, {3;2;1}, 5)
Now Lambda functions are first class objects that may be included as parameters or returned as the result of a formula, the rules seem to have changed again and both row and column parameters are required even if one has to be 1.
2] Moving to the next issue,
= INDEX(sumsArray, 4, 1)
is valid, but it returns a Lambda function (specifically a thunk) and not an array of values. Lambda functions are only evaluated when the necessary parameters are provided. In the case of the Thunk, no parameters are required but the evaluation is then triggered by the null parameter string:
= INDEX(sumsArray, 4, 1)()
You could even split the calculation in to two parts
= LET(
arrϑ, INDEX(sumsArray, 4, 1),
arrϑ()
)
if you so chose.
The main limitation that is being addressed by the use of thunks is that (as currently implemented) Lambda helper functions are not capable of returning arrays of arrays without contorted workarounds. Given that I would define the modern spreadsheet as an environment for manipulating multi-dimensional arrays as 2D arrays of arrays, this is an appalling limitation. Most of the formulas I write require the array of arrays, quite often at multiple points within a given formula.
Note: Traditional spreadsheets approach the problem differently by working from the cell upwards. Arrays are displayed by writing 10s of 1000s of individual scalar formulas that reproduce the appearance of the result array without any explicit recognition given to the structure of the data being manipulated.
Dec 03 2023 12:01 PM
Dec 03 2023 04:36 PM
Dec 03 2023 11:33 PM
Dec 04 2023 10:25 AM - edited Dec 04 2023 10:35 AM
Do you mean that if Lambda function is included in the array then one of the parameter that is either the row or the column argument of the INDEX function must be 1 to produce the output without throwing any error?
Is this the double Lambda setup is called as thunk or is it the double Lambda's wrapped inside BYROW function is called as thunk? Can you explain more about this thunk concept as without understanding of which it's difficult to understand the answers of the questions and to get the hang of the formula.
How is that the formula INDEX(sumsArray, 4, 1) returns a Lambda function and not an array of values consequently requiring a null parameter string? I see that sumsArray = Double Lambda's wrapped inside BYROW function and BYROW function fulfills the parameter passing requirement of Lambda function by feeding 'data' as parameter to the double Lambda's.
Dec 04 2023 10:30 AM
Dec 04 2023 02:38 PM
Thanks for your response.
I am afraid I can not provide more information because LAMBDA only available on o365 PC version but I have no PC on hand.
I guess if you transpose the raw data layout from multiple columns to multiple raws as :
date item quantity
.
You can sum them by pivottable or sql group by quarter.
Dec 04 2023 02:56 PM - edited Dec 04 2023 02:59 PM
I regard a Thunk as a special case of a named Lambda function that requires no parameters but returns a predefined array when called with an empty parameter string. A single Thunk can be created by including the array as the final (and only) parameter to LAMBDA. It is also possible to pass the array as a parameter by using a further LAMBDA.
Thunkλ
= LAMBDA(x, LAMBDA(x))
This may not look useful but it does allow and array of Thunks to be built using a Lambda helper function such as BYROW.
Why might one use an array of Thunks? An example might be to create an array of range references, something that Excel will not allow. Instead, one may use an array of Thunks, anyone of which may be expanded to return a range reference. For example
tablesϑ
= VSTACK(
Thunkλ(Table1),
Thunkλ(Table2),
Thunkλ(Table3)
)
represents an array of 3 elements each of which is a thunk. Where things get interesting, is that the array can be used as an argument in a normal Excel function such as XLOOKUP. Thus
= LET(
selectedTblϑ, XLOOKUP(selectYr, years, tablesϑ),
selectedTblϑ()
)
The first line of code returns a single thunk and the second expands it to return a range reference to the data within a table which may be on any sheet of the workbook.
Alternatively
= REDUCE(headers, tablesϑ,
LAMBDA(acc,tblϑ, VSTACK(acc, tblϑ()))
)
will stack the contents of all three tables vertically as a single dynamic array.
I don't know whether this helps, but it may give a flavour of what is possible!
Dec 10 2023 10:34 AM - edited Dec 10 2023 10:36 AM
@PeterBartholomew
@PeterBartholomew1
Peter, you said when we change from range reference to an array then argument specified must be either 1 or 0 and empty argument is invalid or unacceptable however I see this is not the case as I have given an array to INDEX function in the formula below and it returns desired outcome even if I keep column argument empty.
INDEX (({1; 2; 3; 4}*4/2)+2, 3) this returns 8 or
INDEX ((B1:B4*4/2)+2, 3) returns 8.
This is contrary to what you have said, will you please explain.
Next you said if the array contains the LAMBDA then empty or even 0 argument is not acceptable. It has to be 1 representing the row or column of the array. Which kind of LAMBDA array are you talking about? Is it the array returned by the LAMBDA function or the array of the LAMBDA’s that is array of the thunks for which INDEX argument must be 1 and cannot be empty or 0?
Dec 10 2023 03:14 PM
Apologies. You appear to be correct. I had memory of arrays not returning the same result as ranges when called with a parameter omitted but that appears not to be the case. In the formula
= INDEX({10,20,30;40,50,60;70,80,90}, {1;2;3}, {1,3})
omitting the second parameter no longer returns whole columns, but that is true of a range reference too. The point with the Thunks was simply as you had noted
= LET(
Arrayϑ, BYROW(array, LAMBDA(x, LAMBDA(x))),
Elementϑ, INDEX(Arrayϑ, 2),
Elementϑ()
)
fails with a #REF! error whereas
= LET(
Arrayϑ, BYROW(array, LAMBDA(x, LAMBDA(x))),
Elementϑ, INDEX(Arrayϑ, 2, 1),
Elementϑ()
)
works. What is annoying is that
= LET(
Arrayϑ, BYROW(array, LAMBDA(x, LAMBDA(x))),
MAP(Arrayϑ, LAMBDA(Elementϑ, Elementϑ()))
)
should just return the original array but instead it errors, complaining that nested arrays (the intended result) are not allowed.
Jan 05 2024 10:18 AM - edited Jan 05 2024 10:33 AM
@PeterBartholomew1
Let’s assume, No of rows = m and no of columns = n, m > 1 and n > 1.
For either single column range or array input, giving column no argument of INDEX function as 1 or 0 or omitted is acceptable. For 2D range of m X n, omitting the column no argument results into #REF! error whereas for 2D array of m X n, it returns either entire row or first element of the row depending upon whether or not you are enclosing the row no argument inside curly braces. Logic behind not throwing #REF! error for 2D array input with column no argument omitted is unknown or the Microsoft is only privy to. For a single column of Thunks as an argument for array parameter of INDEX function, column no argument cannot be 0 or omitted and must be 1 to select a single Thunk from the array of Thunks. They could have programmed to allow 0 or omitted but they didn’t.
----------------------------------------------------------------------------------------------
I see Thunk as facility for packing and storing the array of raw or processed data.
I don’t understand why even single Thunk packed by BYROW function is required to be selected by INDEX function before it is unpacked or released. This sounds very illogical and incomprehensible. For example,
LET (THUNK, BYROW ({1, 2, 3}, LAMBDA (x, LAMBDA (x))), THUNK()).
This fails to release the THUNK and spill the array on the sheet. For this we need to select the Thunk with INDEX function first before we unpack and spill it on the sheet as is in the below code.
LET (THUNK, BYROW ({1, 2, 3}, LAMBDA (x, LAMBDA (x))), INDEX (THUNK,1, 1)()).
----------------------------------------------------------------------------------------------
MAP function can return only one element (single cell value) for each element passed to it. Hence it fails to return n elements of 1 X n array (Packed inside one Thunk by BYROW) for each Thunk passed by it.
LET (THUNK, {1, 2, 3}, BYROW (array, LAMBDA (x, LAMBDA (x))),
MAP (THUNK, LAMBDA (Elementθ, Elementθ()))) fails to return or unpack and spill the 3 elements of 1 X 3 array packed in the single Thunk by BYROW.
LET (THUNK, {1, 2, 3; 4, 5, 6}, BYROW (array, LAMBDA (x, LAMBDA (x))),
MAP (THUNK, LAMBDA (Elementθ, Elementθ()))) fails to return 2 X 3 array from two Thunks each containing 1 X 3 array.
We must make sure that we pass only single cell value to MAP function for each element passed by MAP to LAMBDA in order to get all the elements of m X n array packed inside m no of Thunks by BYROW function. Each Thunk containing 1 X n array. Following formula takes care of this:
LET (THUNK, {1, 2, 3; 4, 5, 6}, BYROW (array, LAMBDA (x, LAMBDA (x))),
MAP (THUNK, LAMBDA (Elementθ, INDEX (Elementθ(), 1, COLUMN(A1)))))
You need to drag this formula around till you get all the elements of m X n array.
----------------------------------------------------------------------------------------------
Same is the case with MAKEARRAY. It can only return a single cell value for each (row, column) argument that it passes to LAMBDA. It cannot return n elements of 1 X n array (packed by BYROW into a single Thunk) for one (row, column) argument that it passes to LAMBDA hence following formula fails to spill second Thunk from a 2 X 1 array of Thunks (THUNK) containing two Thunks in a single column.
LET (THUNK, {1, 2, 3; 4, 5, 6}, BYROW (array, LAMBDA (x, LAMBDA (x))),
MAKEARRAY (2, 1, LAMBDA (m, n, INDEX (THUNK, m, n)()))).
We need to make sure that we return a single cell value to MAKEARRAY for each (row, column) argument that it passes to LAMBDA so that it will be able to return or spill that single value on sheet. This is taken care of by following formula:
LET (THUNK, {1, 2, 3; 4, 5, 6}, BYROW (array, LAMBDA (x, LAMBDA (x))),
MAKEARRAY (2, 3, LAMBDA (m, n, INDEX (INDEX (THUNK, m, 1)(), 1, n)))).
----------------------------------------------------------------------------------------------
Obviously as a renowned formula guru of this forum you must be cognizant about all this. I just wanted to post my comprehensions. Thanx very much for enlightening me about the concept of Thunk.
Jan 05 2024 03:58 PM
Sadly, I do not have any special insight concerning the problem. It might be no more than a case of Excel not recognising the compound data type 128 so long as there is a possibility that it might be an array 64. Only once INDEX has operated on the potential array does it become a scalar and show as a type 128, and so eligible to be treated as a lambda function.
The attached file uses recursive bisection to turn a row of thunks into an array of arrays using HSTACK (the data was scanned from a LinkedIn article, I will reference it if I find it again). It seems that a single index can b enough if Excel is expecting a row array, but I am not sure how long that has been the case!
= LET(
orderedByϑ, MAP(products, ThunkFilter),
names, MAPλ(orderedByϑ, LAMBDA(ϑ, ϑ())),
IFERROR(names, "")
)
ThunkFilterλ
= LAMBDA(FILTER(Customer, Order = item))
MAPλ
= LET(
n₀, COLUMNS(array),
n₁, QUOTIENT(n₀, 2),
n₂, n₀ - n₁,
A₁, TAKE(array, 1, n₁),
A₂, TAKE(array, 1, -n₂),
X₁, IF(n₁ > 1, MAPλ(A₁, FNλ), FNλ(INDEX(A₁, 1))),
X₂, IF(n₂ > 1, MAPλ(A₂, FNλ), FNλ(INDEX(A₂, 1))),
HSTACK(X₁, X₂)
)
Note: I should expand the MAPλ code to bisect by row as well as by column and accept multiple arguments, but that is a task for another day!