# Filtering unordered records

Silver Contributor

# Filtering unordered records This post contains a number of solutions to a problem of sorting arbitrarily ordered colours to pick out distinct combinations.  I start with the formula as it should be, which sadly gives rise to the #CALC! 'Nested Arrays are not supported' error.  Since the solution to the problem is an array of arrays workarounds are required.

I have presented 4 possibilities.

1. Sort and combine each record using TEXTJOIN.  Use resulting string to return array from the original data;

2. Sort and combine each record into fixed length strings using Lambda functions.  Separate distinct combinations using the MID function acting of the array;

3. Sort and combine distinct records using REDUCE and VSTACK;

4. Sort (multiple times) and recombine distinct records using MAKEARRAY.

Which do you consider to be preferable?

Do you have other ideas?

7 Replies

# Re: Filtering unordered records

That's interesting to play with it

``````sortRows=
LAMBDA( range,
LET(
topRow, SORT( CHOOSEROWS( range, 1),,, 1 ),
IF( ROWS(range) = 1,
topRow,
VSTACK( topRow, sortRows( DROP( range, 1 )))
)
));

distinctSortedRows=
LAMBDA( range,
UNIQUE( sortRows( range )));

checkSameRows=
LAMBDA( horizontalVector, array,
BYROW( array, LAMBDA( r, XOR( r = horizontalVector) )) );

getCountriesForColourCombination=
LAMBDA( horizontalVector,
TEXTJOIN(",",1,
IF( checkSameRows( horizontalVector, sortRows(colours)  ),
country,""))
);

countriesByColourCombinations=
LET(
distinctColours, distinctSortedRows( colours ),
HSTACK( distinctColours,
BYROW( distinctColours, getCountriesForColourCombination ) )
);``````

# Re: Filtering unordered records

My first go at it I used CODE to identify where color combinations repeated.

Created a custom function to obtain the SUM of code by row:

``=LAMBDA(row,SUM(CODE(MID(row,SEQUENCE(LEN(row)),1))))``

Used the function in a formula to obtain the results. The formula is a bit lengthy and unkept at the moment, but it seems to work:

``=LET(CodeTotals,BYROW(colours,LAMBDA(row,SumCode(TEXTJOIN("",,row)))),ObtainRows,XMATCH(UNIQUE(CodeTotals),CodeTotals),CHOOSEROWS(colours,ObtainRows))``

The solution might be a bit more elegant, but I ran into the array of arrays limitation along the way where an XLOOKUP could've finished it off.

# Re: Filtering unordered records

Nice use of XOR, not seen that before...

I'd consider creating a generic version of BYROW that supports arrays

``````= LET(
sorted, BYROWλ(colours,
LAMBDA(r, SORT(r,,,1))),
UNIQUE(sorted))

BYROWλ
=LAMBDA(array,function,
REDUCE(1,
SEQUENCE(ROWS(array)),
LAMBDA(A,i,
IF(SEQUENCE(i)=i,
function(INDEX(array,i,)),
A))))``````

And hope that BYROWλ could be replaced by the real thing in the not too distant future!

In beta version, another option could be to unpivot and sort by column,

``````=LET(
unpivot, HSTACK(TOCOL(colours),TOCOL(IF({1},country,colours))),
sorted,  WRAPROWS(TAKE(SORT(unpivot,{2,1}),,1),COLUMNS(colours)),
UNIQUE(sorted)
)``````

# Re: Filtering unordered records

SORT( array, {2,1} ) is where I stopped previous time, thank you.

universalBYROW() - yes, I don't thin it'll be something out of the box in nearest future.

# Re: Filtering unordered records

Summing character codes appears to work in this case but not sure it generalises
eg "ad" and "bc" could return the same result (197)

A related idea could be to assign binary values to each colour.

i.e. black=1, blue=2, gold=4, green=8, red=16, white=32

``````=LET(
colour,SORT(UNIQUE(TOCOL(colours))),
code,2^SEQUENCE(ROWS(colour),,0),
HSTACK(colour,code))``````

Summing colour codes by row could generate a sort key independent of order (assuming colours don't repeat)

# Re: Filtering unordered records

Sound a bit like generating hash codes.

# Re: Filtering unordered records

Yes, it was in the back of my mind that it would be possible for 2 SUMS to be the same to lead to a false dupe.  It seems most solutions inevitably lead back to the same few workarounds (often involving TEXTJOIN).

A better approach I could go with:

``=LET(joined,BYROW(colours,LAMBDA(row,TEXTJOIN("",,SORT(row,,,1)))),SelectRows,XMATCH(UNIQUE(joined),joined),CHOOSEROWS(colours,SelectRows))``

Very similar to Peter's:

``= LET(joined, BYROW(colours, LAMBDA(r,TEXTJOIN("|",,SORT(r,,,1)))),ptr, XMATCH(UNIQUE(joined),joined),INDEX(colours,ptr,{1,2,3}))``

A bit of 'balloon folding' required with any of the solutions provided.