The https://support.microsoft.com/en-us/office/scan-function-d58dfd11-9969-4439-b2dc-e7062724de29 =SCAN("",A1:C2,LAMBDA(a,b,a&b)) My general issue is combining BYROW() or BYCOLUMN() with other functions, but I'll illustrate it with this SCAN() example: How does one SCAN() row by row such that the output it something like:a, ab, abcd, de, def =BYROW(A1:C2,LAMBDA(row,SCAN("",row,LAMBDA(a,b,a&b)))) doesn't work...

TheDub It looks like it should work and, indeed, a similar formula with REDUCE does work. Unfortunately SCAN produces an array on each row, leading to an array of arrays which hasn't been implemented within the calc engine. There are two distinct approaches to overcome the problem, neither of which is that attractive. The first is to allow SCAN to run through the 6 values as a single sequence but to reset the resulting text to null whenever the scan returns to the first column. Rather than scanning the text array, I scanned the sequence {0,1,2;3,4,5} which can both be used to look up the text to concatenate and to identify the leading column.= SCAN("", SEQUENCE(2,3,0), LAMBDA(str,k, LET( row, 1+QUOTIENT(k,3), column, 1+MOD(k,3), chr, INDEX(data, row, column), IF(column>1,str,"")&chr ) ) )The second approach follows your original formula more closely but, instead of allowing SCAN to build a text array, it builds a Thunk that contains the array. BYROW then creates an array of 2 thunks and if the result is passed to MAKEARRAY each thunk may be singled out by INDEX, then expanded by passing it a null parameter string and wrapped within a further INDEX, which will return a single value to be assembled into the solution array.Thunkλ = LAMBDA(x,LAMBDA(x)) Forming a thunk: = LET( arrayϑ, Thunkλ(array), Returning its content: = arrayϑ()I will leave it to you to judge whether the most appropriate answer to your question is 'yes' or 'no'!

Can SCAN() and BYROW() be combined? | Microsoft Community Hub

44 Replies

Patrick2788

Silver Contributor

Apr 05, 2024

TheDub

My approach with SCAN:

ScanByrow
=LAMBDA(matrix,LET(
    counter, SEQUENCE(ROWS(matrix)),
    data, HSTACK(counter, matrix),
    AccumulateText, LAMBDA(a, v, IF(ISNUMBER(v), "", a & v)),
    Texts, SCAN("", data, AccumulateText),
    DROP(Texts, , 1)
))

ScanByrow.xlsx1.7 MB

Levi Bailey
Copper Contributor
Mar 24, 2024
TheDub PeterBartholomew1
Here's another abomination of an approach for combining SCAN() and BYROW():
=TEXTSPLIT(TEXTJOIN("~",TRUE,BYROW(A1:C2, LAMBDA(array,TEXTJOIN("|",TRUE,SCAN("",array,LAMBDA(a,b,a&b)))))),"|","~")
It gets around the type incompatibility by joining arrays into strings, processing them, then splitting the string back into an array. This example uses "|" as a column separator and "~" as a row separator.

I was only doing this as a means to an end, and did not optimise much - hopefully someone finds a way to make it such that there is only one TEXTJOIN.

A drawback of this approach is that everything gets compressed into a single string. This makes it hard to follow, and I have not checked but there is likely to be a limit to the length of the string that can be processed properly.

Like all the other approaches that I am aware of, it is unwieldy and I would prefer not to burden others with spreadsheets that contain this.

Has anyone found any better approaches since this thread was active a couple of years ago?
- karimbaez
  Copper Contributor
  Mar 25, 2024
  Levi BaileyI found a way using MakeArray
  
  =LET(data, SEQUENCE(2,3), MAKEARRAY(ROWS(data),COLUMNS(data), LAMBDA(row,col, CONCAT( INDEX(data,row,SEQUENCE(col)) ) ) ) )
  
  Found it on this YouTube video
  - SergeiBaklan
    Diamond Contributor
    Mar 29, 2024
    karimbaez
    
    As discussed above that's the slowest variant from all possible ones.
- PeterBartholomew1
  Silver Contributor
  Mar 25, 2024
  Levi Bailey
  Hi Levi.
  I wouldn't say that much has changed. It really is up to Microsoft to sort the mess that is of their making. I think the problem of helper functions not returning arrays of arrays is getting more important day by day. If one measures the need by the proportion of problems for which the array of arrays is a central feature of the problem which, as the scope of dynamic array formulas increases, may occur several times within a formula and resolving the issue becomes more important. Mind you, I would not expect ideas such as Craig Hatmaker 's 5g functions to fall within the original specification. Most spreadsheet users still are attached to horrible 'and send down' relative references, so it is only a minority that require the DA changes to be brought to fruition.
  
  I had made some progress towards overcoming the problem by using Thunks to hold the inner arrays, and having other user functions specifically written to expand the arrays of thunks to display the content.
  A generalised Lambda helper function that return arrays of arrays using bisection. - Microsoft Community Hub
  I still have some ideas to follow up, for example, expanding and stacking thunks 16 at a time, so 5 REDUCE steps would process an entire Excel column. The challenge is to both achieve acceptable efficiency and to hide the unsellable contortions from the end user!
karimbaez
Copper Contributor
May 05, 2023
HiTheDub try this method that I found on this youtube videohttps://www.youtube.com/watch?v=ITg5lfm3d14 it uses the MAKEARRAY function.

Replace this with your array "CHAR(RANDARRAY(10,10,65,90,TRUE))"

=LET( x,CHAR(RANDARRAY(10,10,65,90,TRUE)), MAKEARRAY(ROWS(x),COLUMNS(x), LAMBDA(r,c,CONCAT(INDEX(x,r,SEQUENCE(c))))))
PeterBartholomew1
Silver Contributor
Mar 28, 2022
TheDub
I created another approach to the problem in which many rows are stepped through the SCAN as a block (by again using thunks) and then displayed using MAKEARRAY.

= ExpandVλ(SCAN(,testdataVϑ,ConcatThunkλ)) Where ConcatThunkλ = LAMBDA(aϑ,bϑ,Thunkλ(aϑ()&bϑ())); testdataVϑ = BYCOL(testdata, Thunkλ); ExpandVλ = LAMBDA(dϑ, MAKEARRAY(n,10, LAMBDA(r,c, INDEX(INDEX(dϑ,1,c)(),r,1))) );

I then performed some timing runs using 10 columns but a varying number of rows.
What is troubling is that both this solution and the earlier one are of Order 1.7 (I am more used to seeing integer values). The earlier REDUCE formula appears to be more efficient than that using MAKEARRAY but, either way, the growth means that the approaches are not usable once the row count goes into the 1000s (in which case the time for a recalculation exceeds 1s).

My conclusion is that the array of array problem needs to the addressed by Microsoft. Otherwise we are likely to find ourselves in a situation of falling back on copy fill which will be a real joy when resizing calculations that are running into the 100,000s of rows!
- PeterBartholomew1
  Silver Contributor
  Mar 28, 2022
  The tags to the previous post were the creation of the cat!
  scan by row.xlsx35 KB
  - lori_m
    Iron Contributor
    Mar 28, 2022
    PeterBartholomew1
    Lol... until Microsoft comes up with a better alternative, I'd suggest using the SergeiBaklan formulation or something like below on longer arrays:
    =REDUCE(INDEX(data,,1), SEQUENCE(COLUMNS(data)-1), LAMBDA(acc,i, HSTACK(acc, INDEX(acc,,i)&INDEX(data,,i+1) )))
    Which performs better would depend on array dimensions, I'd think the method that stacks along the smaller of the row or column dimensions would be preferable. The othe advantage of thunks and the above method is that vector accumulation is supported.
    
    BTW, previous timing results that I posted would need to be updated by removing the 'vstack' name - turns out that prototype function was slowing things down a lot.
Starrysky1988
Iron Contributor
Mar 27, 2022
TheDub
AFE is not working in my Excel and I just use name manager.
byR = LAMBDA(L,MAKEARRAY(ROWS(L),COLUMNS(L),LAMBDA(r,c,CONCAT(INDIRECT(ADDRESS(r,1)&":"&ADDRESS(r,c))))))
The formula above will only work if the list is started from cell A1.
To get the dynamic result, I've added another helper name that represents the first cell in the list.
- SergeiBaklan
  Diamond Contributor
  Mar 27, 2022
  Starrysky1988
  
  I'd suggest to modify a bit which allows to work both with ranges and arrays.
  
  ByRR = LAMBDA(A, MAKEARRAY( rows(A), columns(A), LAMBDA( n, m, CONCAT( INDEX( A, n, SEQUENCE(m) ) ) ) ) );
  
  Performance is bit better than with thunks or makearray() which uses scan().
  
  byrows timer.xlsx17 KB
  - Starrysky1988
    Iron Contributor
    Mar 27, 2022
    SergeiBaklan
    Dear Mr. Sergei Baklan,
    Thanks for your highlight and I've learnt something from it. The formula now is perfect.
    What I like the most using Excel is the art of combining of different Excel Functions.
    If we are using old version of Excel 2016 onwards, we can just simply use "Concat" function as below and copy and paste to the cell we want.
    Sometimes, we make things easy, but sometimes we make them difficult.
    May be we want to save a few second of extra work and always want dynamic result.
lori_m
Iron Contributor
Mar 24, 2022
Following the announcement of the new array functions, I wondered if VSTACK or some other function could be put to use in this type of situation to extend BYROW?

In the absence of a built-in function, maybe define instead:
BYROWS =LAMBDA(array,function, REDUCE(1, SEQUENCE(ROWS(array)), LAMBDA(A,i, IF(SEQUENCE(i)=i, function(INDEX(array,i,)), A))))
which could be used in conjunction with SCAN as shown...
(Out of interest I did rough timings of the various implementations on a 100x100 array)

byrows.xlsx12 KB
- SergeiBaklan
  Diamond Contributor
  Mar 24, 2022
  lori_m
  
  My first attempt
  
  reduceV= LAMBDA(array, DROP( REDUCE( "", SEQUENCE(ROWS(array)), LAMBDA(a, n, VSTACK( a, SCAN(, CHOOSEROWS(array, n), LAMBDA(acc, v, acc & v)) ) ) ), 1 ) );
  
  which gives
  
  Not sure how to integrate it with timer.
  byrows timer.xlsx16 KB
  - lori_m
    Iron Contributor
    Mar 24, 2022
    SergeiBaklan
    Great - looks better. and likely faster too. For timing maybe replace "" by 0 (or 1) and '&' with '+' then update function name in timer formula. From the times, it looks like your system is running twice as fast! I needed to reduce to n=50 to make it responsive enough on my setup.
PeterBartholomew1
Silver Contributor
Mar 15, 2022
TheDub
OK. So I am beginning to get fed up with the arrays of arrays problem with the 365 lambda helper functions. Isn't that what one uses spreadsheets for?
Re: merging rows - Microsoft Tech Community
Every table is a list of records. Crosstabs are arrays of arrays. I would say that up to 80% of my workbooks would benefit from being able to handle arrays of arrays efficiently.

OK, thunks do offer a way around the problem and I am getting better at using them. The attached workbook has Lambda functions Thunkλ and ExpandThunkλ that I have used to form and expand the arrays of thunks, but it is still a pain and I struggle to see how I would convince a normal Excel user that this is better than 'and copy down (ugh!)'.
Consolidate by user.xlsx16 KB
lori_m
Iron Contributor
Mar 13, 2022
TheDub
I had also tried to raise this point several times prior to release. My hope was that something like this would work here,
=SCAN({"";""},data,LAMBDA(a,b,a&b)))
which would be a more efficient implementation of:
=REDUCE("", SEQUENCE(COLUMNS(data)), LAMBDA(acc,i, IF(SEQUENCE(,i)=i, INDEX(acc,,i-1)&INDEX(data,,i), acc)))
Instead a reusable 'SCANBYROW' function can be implemented via one of the methods Peter suggested, or formula above but this is far from ideal.
- TheDub
  Iron Contributor
  Mar 13, 2022
  As the kids nowadays say, SMH... Functions like that REDUCE() are some kind of abomination: six sub-functions (two of which are used twice), eight sets of "( )"s and an endless stream of ",". After a few of those, one's brain just shuts down 🙂
  - lori_m
    Iron Contributor
    Mar 13, 2022
    Fair enough, I think many other Excellers out there would agree with such sentiments! As per Peter's earlier comments, formula construction is starting to become more the domain of the developer community - similar examples of functional code abound on stackoverflow.
    
    The post was more to illustrate how an array of results could be generated in 2d rather than an array of 1d-arrays which is not allowed. If similar updates were made to SCAN/BYROW/etc functions one wouldn't need to resort to such headache-inducing formulas (though I'm sure the C++ source code of those functions would be many times worse!)
PeterBartholomew1
Silver Contributor
Mar 11, 2022
TheDub
It looks like it should work and, indeed, a similar formula with REDUCE does work. Unfortunately SCAN produces an array on each row, leading to an array of arrays which hasn't been implemented within the calc engine. There are two distinct approaches to overcome the problem, neither of which is that attractive.
The first is to allow SCAN to run through the 6 values as a single sequence but to reset the resulting text to null whenever the scan returns to the first column. Rather than scanning the text array, I scanned the sequence {0,1,2;3,4,5} which can both be used to look up the text to concatenate and to identify the leading column.
= SCAN("", SEQUENCE(2,3,0), LAMBDA(str,k, LET( row, 1+QUOTIENT(k,3), column, 1+MOD(k,3), chr, INDEX(data, row, column), IF(column>1,str,"")&chr ) ) )
The second approach follows your original formula more closely but, instead of allowing SCAN to build a text array, it builds a Thunk that contains the array. BYROW then creates an array of 2 thunks and if the result is passed to MAKEARRAY each thunk may be singled out by INDEX, then expanded by passing it a null parameter string and wrapped within a further INDEX, which will return a single value to be assembled into the solution array.
Thunkλ = LAMBDA(x,LAMBDA(x)) Forming a thunk: = LET( arrayϑ, Thunkλ(array), Returning its content: = arrayϑ()
I will leave it to you to judge whether the most appropriate answer to your question is 'yes' or 'no'!
- TheDub
  Iron Contributor
  Mar 11, 2022
  First - and obviously - thanks for you thoughtful reply. I'll play with both options against other examples, but my gut feeling is that #2 will come up ahead.
  
  Second - and this is sort of a general rant - having played around with many of the new shiny toys recently introduced to Excel, I must say the whole process feels strange, like "two steps forward - one step back". This one is a great example of Excel being inconsistent in applying its own logic: if something can be done by row (or by column) it should be always doable that way, instead of letting users find out (and memorizing) the circumstances where the logic works and where it doesn't. When you say, about the two approaches (in this case) to overcoming the problem, that "neither of which is that attractive" you are being, I feel, very polite. The near constant need to look for "alternatives", "workarounds", "hacks", etc. to resolve issues which should be straightforward is utterly exhausting....
  - PeterBartholomew1
    Silver Contributor
    Mar 11, 2022
    TheDub
    I see your point. The best I can say is that the pitfalls are relatively few.
    A bit like traditional Excel where one tries to explain why
    = COUNTIFS(MONTH(date),6))
    doesn't work. Because CONTIFS doesn't accept array parameters in place of a range reference always sounded a bit lame.
    
    In the present case, I got to
    = MAKEARRAY(2,3, Expandλ( BYROW(data, LAMBDA(row,Thunkλ(SCAN("", row, Concatλ)))) )) where Expandλ = LAMBDA(ϑ, LAMBDA(r,c, INDEX(INDEX(ϑ, r,1)(),,c) )); Concatλ = LAMBDA(array,cell, array&cell);
    It is heavy going but, once done, it can be reused. Hopefully, not too often!
    Array of Arrays using Thunks.xlsx13 KB

Forum Discussion

Can SCAN() and BYROW() be combined?

44 Replies

Resources