Nov 18 2022 10:08 AM - edited Nov 18 2022 07:31 PM
In a nutshell, I am trying to compute average with a combination of ByRow, Index, and Average. I needed a sequence number array for this algorithm. When I put this sequence number in a range and use that range in ByRow function, it works great. But when I use that sequence function directly inside a Let(), it gives me wrong answers.
I am sharing the codes I have tried, below:
This is the first formula I wrote:
=LET(
Values, SORTBY(Data[Sensex], Data[DATE]),
Sqn, SEQUENCE(ROWS(Values)),
BYROW(
Sqn,
LAMBDA(a,
IF(
a < 7,
"NA",
AVERAGE(
INDEX(Values, SEQUENCE(7, 1, a - 7 + 1))
)
)
)
)
)
This one produced "a" set of answers. But the answers were slightly incorrect. To figure out where the problem was, all I did was put the sequences in a separate range and use that range in the ByRow function and this time I got perfect results:
So, this is the first part to get the sequence of numbers
In cell, say E4
=LET(
Values, SORTBY(Data[Sensex], Data[DATE]),
Sqn, SEQUENCE(ROWS(Values)),
Sqn
)
And this is the second part, where I use the ByRow function and specify the range as an array
In cell, say F4
=LET(
Values, SORTBY(Data[S&P BSE Sensex], Data[DATE]),
BYROW(
E4#,
LAMBDA(a,
IF(
a < 7,
"NA",
AVERAGE(
INDEX(Values, SEQUENCE(7, 1, a - 7 + 1))
)
)
)
)
)
And now I am getting the perfect answer. Is this a bug or am I doing something wrong here:
@Peter Bartholomew @Craig Hatmaker @Sergei Baklan @lori_m,
Nov 18 2022 10:22 AM - edited Nov 18 2022 10:33 AM
I don't have an answer why right now but your 'wrong' answer are just your original values NOT averaged offset by 7. That is probably a good clue what is happening.
One more clue: This variation also works:
=LET(Values, SORTBY(Data[S&P BSE Sensex],Data[DATE]),
Sqn, SEQUENCE(ROWS(Values)),
Seven, SEQUENCE(7),
BYROW(Sqn, LAMBDA(a,
IF(a<7,"NA", AVERAGE(INDEX(Values, a-7+Seven))))))
Basically setting up the {1,2,3,4,5,6,7} array ahead of time and then simple math inside
Nov 18 2022 10:33 AM
Nov 18 2022 10:51 AM
@Viz a couple more interesting nuances. Sending an arbitrary array reference in and then converting that to a number array using ROW() works:
=LET(Values, SORTBY(Data[S&P BSE Sensex],Data[DATE]),
sqn,A1:INDEX(A:A,ROWS(Values)),
BYROW(sqn, LAMBDA(in,
LET(a,ROW(in),
IF(a<7,"NA", AVERAGE(INDEX(Values, SEQUENCE(7, 1, a-7+1))))))))
but converting that cell reference outside the LAMBDA doesn't work:
=LET(Values, SORTBY(Data[S&P BSE Sensex],Data[DATE]),
sqn,A1:INDEX(A:A,ROWS(Values)),
BYROW(ROW(sqn), LAMBDA(a,
IF(a<7,"NA", AVERAGE(INDEX(Values, SEQUENCE(7, 1, a-7+1)))))))
Nov 18 2022 11:53 AM
BYROW() assumes you transfer an array as parameter, even if that array has only one element. SEQUENCE() from array works differently, you may compare =SEQUENCE(2) and =SEQUENCE({2}).
To fix, you may pass first element of each row array
=LET(
Values, SORTBY(Data[S&P BSE Sensex], Data[DATE]),
Sqn, SEQUENCE(ROWS(Values)),
BYROW(
Sqn,
LAMBDA(a,
IF(
@a < 7,
"NA",
AVERAGE(
INDEX(Values, SEQUENCE(7, 1, @a - 7 + 1))
)
)
)
)
)
or simply change BYROW on MAP
=LET(
Values, SORTBY(Data[S&P BSE Sensex], Data[DATE]),
Sqn, SEQUENCE(ROWS(Values)),
MAP(
Sqn,
LAMBDA(a,
IF(
a < 7,
"NA",
AVERAGE(
CHOOSEROWS(
Values,
SEQUENCE(7, 1, +a - 7 + 1)
)
)
)
)
)
)
which passes to lambda value, not array.
Nov 18 2022 01:35 PM
Nov 18 2022 02:41 PM
@Sergei Baklan seems to have got to the bottom of the problem. Yet another example of the failure of the Calc engine to address the problem of arrays of arrays (that is, 'normal' problems).
I do not have much extra to offer; I was interested in trying to master the new AFE. I decided that the problem is better suited to MAP, so bypassed the problem rather than addressing it. Rather than showing the anonymous Lambda function on the worksheet, I defined a Named Lambda 'AverageLast7λ' in its Curried form, giving the worksheet formula
=LET(
Values, TAKE(SORT(Data), , -1),
Sqn, SEQUENCE(ROWS(Values)),
MAP(Sqn, AverageLast7λ(Values))
);
AverageLast7λ
= LAMBDA(v, LAMBDA(a,
IF(a<7,"NA",
AVERAGE(INDEX(v, SEQUENCE(7, 1, a,-1)))
)
));
The SORT and AVERAGE formulas are different, but it is simply change for change's sake.
Nov 18 2022 02:55 PM
AverageLast7λ
= LAMBDA(v, LAMBDA(a,
IF(a<7,"NA",
AVERAGE(TAKE(TAKE(v, a),-7))
)
));
also worked.
Nov 18 2022 02:55 PM
@mtarler wrote:
When it's a range, why is the "array of one" treated differently?
A range pointing to a single cell isn't an array, it's a scalar. Try TYPE(A1:A1) & TYPE(A1:A2) & TYPE({1}) & TYPE({1,2}). I wish single element arrays were implicitly converted to scalars.
@mtarler wrote:
As for what is actually happening here is a sample to help:
=AVERAGE(SEQUENCE(5,1,2+{5})) vs =AVERAGE(SEQUENCE(5,1,2+5))
My observed behaviour (not sure if it's documented or a bug), is when a function that expects scalar arguments receives an array, it outputs an array with the same shape applying the function.
I.e., SEQUENCE({3}) is the same as MAP({3}, LAMBDA(x, SEQUENCE(x)))
The problem is, when the function returns an array itself like SEQUENCE, you're creating an array of arrays which can't be displayed in the grid. In the former case it just uses the first element in the inner array. In the later, MAP, case it gives #CALC! "Nested arrays aren't supported".
Related question here too TEXTSPLIT combined with BYROW returns an unexpected result when using - Microsoft Community
Nov 19 2022 03:37 AM
That's better to ask @JoeMcDaid for more professional explanation.
My guess if we have BYROW on spill, aka array, it takes each element of such array which is value. Each element of the range is an array, auto-lifting doesn't work in such case.
As for the SEQUENCE not sure how it works. With an array in step it looks like it returns only the value for the first step.
Nov 19 2022 03:39 AM
@Peter Bartholomew , hope you are happy with new AFE. Had no time to dig it in details, first impression is quite positive.
Nov 19 2022 04:06 AM
Thanks for the detailed explanation and the solution, Sergei.
I am still trying to process your explanation fully. But after reading your explanation to @mtarler and his work around with row function, I think I am getting some sense of the problem.
I am wondering if ByRow is treating the sequence as matrix instead of a vector when we directly pass it in the function (and therefore the array or array problem).
Am I going in the right direction?
Nov 19 2022 04:14 AM
Nov 19 2022 04:36 AM
@Viz , not sure I understood correctly. IMHO, BYROW() never passes the matrix since that's by row. From the grid it takes an array (vector) since the row in the grid is always an array. From the array it takes current value of the array, it could be value or vector. That's our job to transform such vector/value into single value with lambda, otherwise we have array of arrays.
Nov 19 2022 07:16 AM - edited Nov 19 2022 07:17 AM
I followed @JamesDuley 's link and found a related question from a personal friend of some 20 years standing. In answering his question, I advanced my own understanding of the problem
TEXTSPLIT combined with BYROW returns an unexpected result when using - Microsoft Community
My conclusions were that BYROW is designed to accept a 2D matrix and pass it to the calculation, row at a time. In the special case of the matrix having only a single column, BYROW will pass single values to the calculation, but they will still be
'Type = 64' and will get caught by any (monumentally undesirable) legacy nested array checks. Passing the single value (1x1) array can be made to work if you always reference its value using '@', in order to convert it to a scalar value.
Where BYROW references a Range, things proceed somewhat differently. In normal use BYROW passes row ranges to the calculation engine (COUNTIFS will work with such an object as its criterion range for example) but, in the special case, the row might reduce to a single cell reference.
= TYPE(cell)
will then return a 1 or 2, so the calculation proceeds differently depending on whether the initial BYROW reference is an array or a range object.
Nov 19 2022 08:06 AM
@Peter Bartholomew Oh, I guess I've seen related post from James here on MTC
Nov 19 2022 11:12 AM
@Peter Bartholomew wrote:Where BYROW references a Range, things proceed somewhat differently. In normal use BYROW passes row ranges to the calculation engine (COUNTIFS will work with such an object as its criterion range for example) but, in the special case, the row might reduce to a single cell reference.
= TYPE(cell)
will then return a 1 or 2, so the calculation proceeds differently depending on whether the initial BYROW reference is an array or a range object.
That's an interesting observation. It looks like BYROW internally does INDEX(array, row_num, 0) for each row to pass to the function. INDEX, like this, returns a scalar if, and only if, array is a single column and it's a reference. This is different from the almost equivalent CHOOSEROWS(array, row_num) which always returns an array.
Nov 20 2022 10:54 AM
Mar 12 2023 10:10 AM
I have a similar question. Suppose I have an array (A1#) such that:
2 |
3 |
4 |
I need to create a sequence that in the end looks something like this:
2 |
2 |
3 |
3 |
3 |
4 |
4 |
4 |
4 |
My thought was the use BYROW(A1#,LAMBDA(x,SEQUENCE(1,4,x,0))) so that I get:
2 | 2 | N/A | N/A |
3 | 3 | 3 | N/A |
4 | 4 | 4 | 4 |
And then feed that into TOCOL() and use a FILTER() to filter out the "N/A"s. But the function throws a CALC error.
Any thoughts?
Mar 12 2023 11:14 AM