Weird Results from Array Manipulation - Any ideas on why?

Hi all - doing some exploring with lets and lambda in excel and passing arrays around. I found a weird quirk where something that should work doesn't work as expected in a surprising way. This prob...

excel

Formulas and Functions

djclements

Silver Contributor

Apr 15, 2024

joelb95 You are correct in your assessment that an undefined LAMBDA function returns nothing until it is called/evaluated. If it is never called, it will return the #CALC! error with the following screen tip:

"Cell contains a lambda... You can't define a lambda in a cell. Define your lambda in the Name Manager and try again using the name."

This is of course misleading, because a custom LAMBDA function can be defined directly in a cell using the LET function.

On its own, the formula =IF({1,0}, "a", LAMBDA(x, SEQUENCE(x))) returns {"a",#CALC!} in a 1 row by 2 column array. Using INDEX(array, 1) will return "a", and INDEX(array, 2) will return #CALC!; however, behind the #CALC! error sits the undefined function, LAMBDA(x, SEQUENCE(x)). When you call the function using INDEX(array, 2)(3), it is evaluated and returns {1;2;3}.

When looking at your original BYCOL example with the custom MANY function:

=LET(
    many, LAMBDA(choice, CHOOSE(choice + 1, 10, "b", "harry", "c", SEQUENCE(4, 1, 5), SEQUENCE(3), 45, "d", "lots", "e", "ten")),
    many_in_a_row, BYCOL(SEQUENCE(1, many(0)), LAMBDA(c, LAMBDA(choice, CHOOSE(choice, many(c))))),
    INDEX(many_in_a_row, 1)(1)
)

On its own, the many_in_a_row variable would return {#CALC!,#CALC!,...} in a 1 row by 10 column array. Behind each of those #CALC! errors, though, sits the undefined/uncalled function, LAMBDA(choice, CHOOSE(choice, many(c))), for each column in the array. However, the problem with the BYCOL function is that instead of passing a single value to the many function [many(1), many(2), etc.], it's actually passing an array for the entire column [many({1}), many({2}), etc.].

INDEX(many_in_a_row, 1)(1) returns {"b";"b";"b";"b"}, not because the other columns of BYCOL were evaluated (they weren't), but rather because an array object is being passed to the index_num argument of the underlying CHOOSE function on which the many function was based.

To further illustrate this point, swap out BYCOL with MAP:

=LET(
    many, LAMBDA(choice, CHOOSE(choice + 1, 10, "b", "harry", "c", SEQUENCE(4, 1, 5), SEQUENCE(3), 45, "d", "lots", "e", "ten")),
    many_in_a_row, MAP(SEQUENCE(1, many(0)), LAMBDA(c, LAMBDA(choice, CHOOSE(choice, many(c))))),
    INDEX(many_in_a_row, 1)(1)
)

With the MAP function, each iteration passes a single value to the many function [many(1), many(2), etc.]. As a result, INDEX(many_in_a_row, 1)(1) now returns "b", and INDEX(many_in_a_row, 4)(1) returns {5;6;7;8}.

TIP: regarding the INDEX function, when referencing a one-dimensional array or range that contains a single row, the row_num argument automatically becomes the column_num, if the optional [column_num] argument is not used. For example, INDEX(many_in_a_row, 4) returns the 4th column because the array only contains 1 row.

joelb95

Brass Contributor

Apr 15, 2024

"With the MAP function, each iteration passes a single value to the many function [many(1), many(2), etc.]. As a result, INDEX(many_in_a_row, 1)(1) now returns "b", and INDEX(many_in_a_row, 4)(1) returns {5;6;7;8}."

I haven't tested it yet, but this bit is fascinating. Not that I know what any of these words mean, but is the idea that the byrow/bycol are truly vectorized functions that somehow are running array logic rather than just itered scalar functions? I am not really sophisticated enough to implement the various timing functions I've seen, so I don't know how to answer the question, but do byrow/bycolumn have meaningful efficiencies over mapping a one by X row/column?

Maybe if I just use MAP I'll solve my problems, but if it comes at the cost of a huge loss of efficiency, it may not be all that helpful in the long run.

At some point I'll try to put into words what the bigger project is, but one portion relates to dynamically passing parameters and collecting the array results for later (or perhaps immediate) use. If I have a list of 10,000 things, I would, of course, prefer to use vectors whenever possible - but if what you say is so, then it may be a fool's errand.

Picture, if you will, a simple lambda that you want to extend indefinitely in a way similar to how =sum() can take as many arguments as you care to enter. How do you get an array of arrays to be properly "parameterized"? My current effort (and the genesis of this post) was about the weird result when I try to use byrow/bycolumn as a native excel iter counter. If it would just return the unevaluated formula, it would work swimmingly, but if it is going to return flattened arrays (however it gets there), it is unworkable.

Forum Discussion

Weird Results from Array Manipulation - Any ideas on why?

Resources