Mar 11 2022 10:15 AM
The support page for SCAN() contains the following example:
=SCAN("",A1:C2,LAMBDA(a,b,a&b))
My general issue is combining BYROW() or BYCOLUMN() with other functions, but I'll illustrate it with this SCAN() example:
How does one SCAN() row by row such that the output it something like:
a, ab, abc
d, de, def
=BYROW(A1:C2,LAMBDA(row,SCAN("",row,LAMBDA(a,b,a&b))))
doesn't work...
Mar 24 2022 09:51 AM
Great - looks better. and likely faster too. For timing maybe replace "" by 0 (or 1) and '&' with '+' then update function name in timer formula. From the times, it looks like your system is running twice as fast! I needed to reduce to n=50 to make it responsive enough on my setup.
Mar 24 2022 12:24 PM
Mar 25 2022 01:19 AM
Yup looks pretty fast! In any case, native byrow/bycol extended functions would get my vote to bypass the nested array error in general, as suggested in the original post. Then other methods - including thunks - could benefit from being able to process arrays of results more efficiently.
Mar 25 2022 11:55 AM
I could agree, but everything depends on goals and concrete scenario.
Mar 25 2022 03:41 PM
I have seen the posts but have yet to work through them in detail. Meanwhile I have tried VSTACK as an alternative to MAKEARRAY, but still using Thunks. The worksheet formula relies on a number of other named Lambda functions.
= Expandλ(BYROW(data,Scanλ))
Scanλ
= LAMBDA(row,Thunkλ(SCAN("",row,Concatλ)));
Concatλ
= LAMBDA(a,b,a&b);
Expandλ
= LAMBDA(dϑ,
REDUCE(,dϑ,LAMBDA(accϑ,ϑ,Thunkλ(VSTACK(accϑ(),ϑ()))))()
);
Mar 25 2022 04:59 PM
@TheDub was pretty irritated by the message "Nested arrays are not supported", and I believe rightly so. Most Excel errors report an error on the part of the user/developer. This one is different. The correct answer to a wide class of problems to which one might with to apply dynamic array methods is the array of arrays. This error is nothing to do with the user, it is simply flagging up a deficiency in the code; one which may not have mattered in the past but which is now centre stage.
As a minimum, BYROW, BYCOL and MAKEARRAY should address this limitation and convert the 'column of row arrays' to the equivalent 2D array without involving the user in workarounds.
Mar 26 2022 12:23 AM - edited Mar 26 2022 12:24 AM
Agreed, the previous post was intended to be fully in support of the OP and to reiterate that the proposed BYROW formula should really be made to work in a future update if possible.
I'd really prefer that nested arrays were converted to 2d arrays natively. Google sheets does this eg:
={{1,2};{3,4}} -> {1,2;3,4}
={{1,2},{3,4}} -> {1,2,3,4}
so no need for separate VSTACK / HSTACK functions. I gather from the response you received from the development team that backward compatibility makes such an update more challenging however.
Mar 26 2022 02:05 AM
@Peter Bartholomew , I had no doubt you will use thunks. Did you try to integrate with @lori_m timer?
Mar 26 2022 02:17 AM
Afraid implementation of array of arrays requires re-building of calc engine architecture. Other words create new Excel, if not from scratch but close to that. Perhaps I'm wrong, just my impression.
Other way is to built in workarounds like we use now in built-in functions, but that will be quite narrow solution which works for concrete cases only.
Anyway, I believe this or that solution will be found and we'll see array of arrays one day. Such functionality is quite demanded for new Excel.
Mar 26 2022 03:31 PM
No, I need to find some time to work through Lori's solution. I did some timing runs with my solution using 100x100 and 200x50 grids of single characters using Charles Williams's profiler. The latter was worse for a range recalc. with timings of 350ms rather than 250ms.
Mar 27 2022 01:22 AM
The timer code is courtesy of the MS Research team and was used for prototyping some of the new functions. Without access to the new functions I substituted the vstack prototype function from that link and got the results shown in attachment which are a little disappointing and I'm sure thunk results could be improved significantly though unlikely ever to match the faster alternatives listed.
I've no idea how much effort would be required to address the nested arrays error but community feedback like this can only help bump it up the priority list!
Mar 27 2022 02:14 AM
Oops, I didn't open the link within your file. Yes, looks like unchanged samples from the prototype.
Prioritization is never simple task. I guess Microsoft decides such compromises every few months (have no idea about their planning process). If to simplify, like make Evaluate Formula box resizable or introduce few other lambda functions.
My another guess on first place is what demanded by corporate business which gives 85% (third guess) of income.
Anyway, sooner or later nested arrays shall appear.
Mar 27 2022 03:17 AM - edited Mar 27 2022 05:38 AM
AFE is not working in my Excel and I just use name manager.
byR = LAMBDA(L,MAKEARRAY(ROWS(L),COLUMNS(L),LAMBDA(r,c,CONCAT(INDIRECT(ADDRESS(r,1)&":"&ADDRESS(r,c))))))
The formula above will only work if the list is started from cell A1.
To get the dynamic result, I've added another helper name that represents the first cell in the list.
Mar 27 2022 07:38 AM - edited Mar 27 2022 07:38 AM
I'd suggest to modify a bit which allows to work both with ranges and arrays.
ByRR = LAMBDA(A,
MAKEARRAY(
rows(A),
columns(A),
LAMBDA( n, m,
CONCAT( INDEX( A, n, SEQUENCE(m) ) )
)
)
);
Performance is bit better than with thunks or makearray() which uses scan().
Mar 27 2022 09:05 AM
Dear Mr. Sergei Baklan,
Thanks for your highlight and I've learnt something from it. The formula now is perfect.
What I like the most using Excel is the art of combining of different Excel Functions.
If we are using old version of Excel 2016 onwards, we can just simply use "Concat" function as below and copy and paste to the cell we want.
Sometimes, we make things easy, but sometimes we make them difficult.
May be we want to save a few second of extra work and always want dynamic result.
Mar 28 2022 01:25 AM
I created another approach to the problem in which many rows are stepped through the SCAN as a block (by again using thunks) and then displayed using MAKEARRAY.
= ExpandVλ(SCAN(,testdataVϑ,ConcatThunkλ))
Where
ConcatThunkλ = LAMBDA(aϑ,bϑ,Thunkλ(aϑ()&bϑ()));
testdataVϑ = BYCOL(testdata, Thunkλ);
ExpandVλ = LAMBDA(dϑ,
MAKEARRAY(n,10, LAMBDA(r,c, INDEX(INDEX(dϑ,1,c)(),r,1)))
);
I then performed some timing runs using 10 columns but a varying number of rows.
What is troubling is that both this solution and the earlier one are of Order 1.7 (I am more used to seeing integer values). The earlier REDUCE formula appears to be more efficient than that using MAKEARRAY but, either way, the growth means that the approaches are not usable once the row count goes into the 1000s (in which case the time for a recalculation exceeds 1s).
My conclusion is that the array of array problem needs to the addressed by Microsoft. Otherwise we are likely to find ourselves in a situation of falling back on copy fill which will be a real joy when resizing calculations that are running into the 100,000s of rows!
Mar 28 2022 01:28 AM
The tags to the previous post were the creation of the cat!
Mar 28 2022 02:08 AM
Lol... until Microsoft comes up with a better alternative, I'd suggest using the @Sergei Baklan formulation or something like below on longer arrays:
=REDUCE(INDEX(data,,1),
SEQUENCE(COLUMNS(data)-1),
LAMBDA(acc,i,
HSTACK(acc,
INDEX(acc,,i)&INDEX(data,,i+1)
)))
Which performs better would depend on array dimensions, I'd think the method that stacks along the smaller of the row or column dimensions would be preferable. The othe advantage of thunks and the above method is that vector accumulation is supported.
BTW, previous timing results that I posted would need to be updated by removing the 'vstack' name - turns out that prototype function was slowing things down a lot.
Mar 30 2022 03:14 PM
Your recommendation was spot on. The outcome seems to be that the formula now available will have one dimension that grows quadratically whilst the other may be linear. The formula you suggested and attributed to @Sergei Baklan grows linearly as the number of rows increases and beats the formulae I was playing with hands down; it should get to a million rows with calculation times under 1s!
Similar behavior is achieved by the variation
= REDUCE(INDEX(data,,1), BYCOL(data,Thunkλ),
LAMBDA(acc,ϑ, HSTACK(acc,TAKE(acc,,-1)+ϑ()))
)
I hadn't spent enough time on the final posts of the first sheet because I was going to use Charles Williams's timing routines and the Microsoft Research timed regression tests made the workbook very slow.
May 05 2023 02:56 PM - edited May 05 2023 03:07 PM
Hi@TheDub try this method that I found on this youtube video https://www.youtube.com/watch?v=ITg5lfm3d14. it uses the MAKEARRAY function.
Replace this with your array "CHAR(RANDARRAY(10,10,65,90,TRUE))"
=LET(
x,CHAR(RANDARRAY(10,10,65,90,TRUE)),
MAKEARRAY(ROWS(x),COLUMNS(x),
LAMBDA(r,c,CONCAT(INDEX(x,r,SEQUENCE(c))))))