Jul 09 2022 01:24 PM - edited Jul 09 2022 01:26 PM
This post contains a number of solutions to a problem of sorting arbitrarily ordered colours to pick out distinct combinations. I start with the formula as it should be, which sadly gives rise to the #CALC! 'Nested Arrays are not supported' error. Since the solution to the problem is an array of arrays workarounds are required.
I have presented 4 possibilities.
1. Sort and combine each record using TEXTJOIN. Use resulting string to return array from the original data;
2. Sort and combine each record into fixed length strings using Lambda functions. Separate distinct combinations using the MID function acting of the array;
3. Sort and combine distinct records using REDUCE and VSTACK;
4. Sort (multiple times) and recombine distinct records using MAKEARRAY.
Which do you consider to be preferable?
Do you have other ideas?
Jul 09 2022 03:18 PM
That's interesting to play with it
sortRows=
LAMBDA( range,
LET(
topRow, SORT( CHOOSEROWS( range, 1),,, 1 ),
IF( ROWS(range) = 1,
topRow,
VSTACK( topRow, sortRows( DROP( range, 1 )))
)
));
distinctSortedRows=
LAMBDA( range,
UNIQUE( sortRows( range )));
checkSameRows=
LAMBDA( horizontalVector, array,
BYROW( array, LAMBDA( r, XOR( r = horizontalVector) )) );
getCountriesForColourCombination=
LAMBDA( horizontalVector,
TEXTJOIN(",",1,
IF( checkSameRows( horizontalVector, sortRows(colours) ),
country,""))
);
countriesByColourCombinations=
LET(
distinctColours, distinctSortedRows( colours ),
HSTACK( distinctColours,
BYROW( distinctColours, getCountriesForColourCombination ) )
);
Jul 09 2022 04:53 PM - edited Jul 09 2022 05:11 PM
My first go at it I used CODE to identify where color combinations repeated.
Created a custom function to obtain the SUM of code by row:
=LAMBDA(row,SUM(CODE(MID(row,SEQUENCE(LEN(row)),1))))
Used the function in a formula to obtain the results. The formula is a bit lengthy and unkept at the moment, but it seems to work:
=LET(CodeTotals,BYROW(colours,LAMBDA(row,SumCode(TEXTJOIN("",,row)))),ObtainRows,XMATCH(UNIQUE(CodeTotals),CodeTotals),CHOOSEROWS(colours,ObtainRows))
The solution might be a bit more elegant, but I ran into the array of arrays limitation along the way where an XLOOKUP could've finished it off.
Jul 10 2022 06:09 AM
Nice use of XOR, not seen that before...
I'd consider creating a generic version of BYROW that supports arrays
= LET(
sorted, BYROWλ(colours,
LAMBDA(r, SORT(r,,,1))),
UNIQUE(sorted))
BYROWλ
=LAMBDA(array,function,
REDUCE(1,
SEQUENCE(ROWS(array)),
LAMBDA(A,i,
IF(SEQUENCE(i)=i,
function(INDEX(array,i,)),
A))))
And hope that BYROWλ could be replaced by the real thing in the not too distant future!
In beta version, another option could be to unpivot and sort by column,
=LET(
unpivot, HSTACK(TOCOL(colours),TOCOL(IF({1},country,colours))),
sorted, WRAPROWS(TAKE(SORT(unpivot,{2,1}),,1),COLUMNS(colours)),
UNIQUE(sorted)
)
Jul 10 2022 06:48 AM
SORT( array, {2,1} ) is where I stopped previous time, thank you.
universalBYROW() - yes, I don't thin it'll be something out of the box in nearest future.
Jul 10 2022 10:46 AM - edited Jul 10 2022 10:54 AM
Summing character codes appears to work in this case but not sure it generalises
eg "ad" and "bc" could return the same result (197)
A related idea could be to assign binary values to each colour.
i.e. black=1, blue=2, gold=4, green=8, red=16, white=32
=LET(
colour,SORT(UNIQUE(TOCOL(colours))),
code,2^SEQUENCE(ROWS(colour),,0),
HSTACK(colour,code))
Summing colour codes by row could generate a sort key independent of order (assuming colours don't repeat)
Jul 10 2022 04:13 PM - edited Jul 10 2022 04:13 PM
Yes, it was in the back of my mind that it would be possible for 2 SUMS to be the same to lead to a false dupe. It seems most solutions inevitably lead back to the same few workarounds (often involving TEXTJOIN).
A better approach I could go with:
=LET(joined,BYROW(colours,LAMBDA(row,TEXTJOIN("",,SORT(row,,,1)))),SelectRows,XMATCH(UNIQUE(joined),joined),CHOOSEROWS(colours,SelectRows))
Very similar to Peter's:
= LET(joined, BYROW(colours, LAMBDA(r,TEXTJOIN("|",,SORT(r,,,1)))),ptr, XMATCH(UNIQUE(joined),joined),INDEX(colours,ptr,{1,2,3}))
A bit of 'balloon folding' required with any of the solutions provided.
Dec 08 2023 08:25 AM
Revisting this task with MAKEARRAY.
It's a bit slow but we can't all be lion tamers:
'UniqueByRow
=LAMBDA(matrix,LET(
height, ROWS(matrix),
width, COLUMNS(matrix),
Shuffle, LAMBDA(r, c,
LET(Sorted, SORT(CHOOSEROWS(matrix, r), , , 1), val, INDEX(Sorted, , c), val)
),
Results, MAKEARRAY(height, width, Shuffle),
UNIQUE(Results)
))
Dec 08 2023 02:25 PM
I like the definition of Shuffle prior to passing it as a parameter to MAKEARRAY (I am so conditioned to my own conventions that I had to rename it 'Shuffleλ').
As an aside, I downloaded a copy of my own problem definition file and reimplemented your formula in order to see it working step by step. On opening the file, it showed
The banner was a surprise, I have never seen this before.
Dec 08 2023 02:52 PM
Dec 08 2023 03:41 PM
Dec 09 2023 10:25 AM
It looks like I've joined the club:
It appears after submitting the formula it's then added as a private post in the feedback forum.
Mar 19 2024 11:08 AM - edited Mar 19 2024 11:11 AM
I'm going to drop another solution in the hopper:
UniqueX
=LAMBDA(matrix,LET(
r, ROWS(matrix),
c, COLUMNS(matrix),
autonum, TOCOL(SEQUENCE(r) * SEQUENCE(, c, 1, 0)),
vector, TOCOL(matrix),
sorted_vector, DROP(SORT(HSTACK(autonum, vector), {1, 2}), , 1),
wrapped, WRAPROWS(sorted_vector, c),
UNIQUE(wrapped)
))
Mar 20 2024 01:57 AM
@Patrick2788 Nice one. I like your autonum formula. 🙂
Here's another slight variation using SORTBY, plus GROUPBY to output the final summary:
=LET(
table, Table1[#All],
country, TAKE(table,, 1),
colors, DROP(table,, 1),
cols, COLUMNS(colors),
arr, TOCOL(colors),
rowId, TOCOL(IF(SEQUENCE(, cols), SEQUENCE(ROWS(colors)))),
sortRows, WRAPROWS(SORTBY(arr, rowId,, arr, ), cols),
GROUPBY(sortRows, country, ARRAYTOTEXT, 3, 0)
)
The GROUPBY function is not yet available to me, so I was unable to test it fully. Without GROUPBY, another option could be:
=LET(
table, Table1[#All],
country, TAKE(table,, 1),
colors, DROP(table,, 1),
cols, COLUMNS(colors),
arr, TOCOL(colors),
rowId, TOCOL(IF(SEQUENCE(, cols), SEQUENCE(ROWS(colors)))),
sortRows, WRAPROWS(SORTBY(arr, rowId,, arr, ), cols),
joinCols, REDUCE(CHOOSECOLS(sortRows, 1), SEQUENCE(cols-1,, 2), LAMBDA(v,n, v&"|"&CHOOSECOLS(sortRows, n))),
HSTACK(UNIQUE(sortRows), MAP(UNIQUE(joinCols), LAMBDA(v, ARRAYTOTEXT(FILTER(country, v=joinCols)))))
)