Filtering unordered records

Silver Contributor



This post contains a number of solutions to a problem of sorting arbitrarily ordered colours to pick out distinct combinations.  I start with the formula as it should be, which sadly gives rise to the #CALC! 'Nested Arrays are not supported' error.  Since the solution to the problem is an array of arrays workarounds are required.


I have presented 4 possibilities.

1. Sort and combine each record using TEXTJOIN.  Use resulting string to return array from the original data;

2. Sort and combine each record into fixed length strings using Lambda functions.  Separate distinct combinations using the MID function acting of the array;

3. Sort and combine distinct records using REDUCE and VSTACK;

4. Sort (multiple times) and recombine distinct records using MAKEARRAY.

Which do you consider to be preferable?

Do you have other ideas?

7 Replies

@Peter Bartholomew 

That's interesting to play with it

LAMBDA( range,
        topRow, SORT( CHOOSEROWS( range, 1),,, 1 ),
        IF( ROWS(range) = 1,
            VSTACK( topRow, sortRows( DROP( range, 1 )))

LAMBDA( range,
    UNIQUE( sortRows( range )));

LAMBDA( horizontalVector, array,
    BYROW( array, LAMBDA( r, XOR( r = horizontalVector) )) );

LAMBDA( horizontalVector,
        IF( checkSameRows( horizontalVector, sortRows(colours)  ),

  distinctColours, distinctSortedRows( colours ),
  HSTACK( distinctColours,
    BYROW( distinctColours, getCountriesForColourCombination ) )

@Peter Bartholomew 

My first go at it I used CODE to identify where color combinations repeated.

Created a custom function to obtain the SUM of code by row:





Used the function in a formula to obtain the results. The formula is a bit lengthy and unkept at the moment, but it seems to work:





The solution might be a bit more elegant, but I ran into the array of arrays limitation along the way where an XLOOKUP could've finished it off.

@Sergei Baklan 

Nice use of XOR, not seen that before...


@Peter Bartholomew 

I'd consider creating a generic version of BYROW that supports arrays

= LET(
   sorted, BYROWλ(colours,
      LAMBDA(r, SORT(r,,,1))),


And hope that BYROWλ could be replaced by the real thing in the not too distant future!


In beta version, another option could be to unpivot and sort by column,

unpivot, HSTACK(TOCOL(colours),TOCOL(IF({1},country,colours))),
sorted,  WRAPROWS(TAKE(SORT(unpivot,{2,1}),,1),COLUMNS(colours)),



SORT( array, {2,1} ) is where I stopped previous time, thank you.

universalBYROW() - yes, I don't thin it'll be something out of the box in nearest future.


Summing character codes appears to work in this case but not sure it generalises
eg "ad" and "bc" could return the same result (197)


A related idea could be to assign binary values to each colour. 

i.e. black=1, blue=2, gold=4, green=8, red=16, white=32




 Summing colour codes by row could generate a sort key independent of order (assuming colours don't repeat)

Sound a bit like generating hash codes.


Yes, it was in the back of my mind that it would be possible for 2 SUMS to be the same to lead to a false dupe.  It seems most solutions inevitably lead back to the same few workarounds (often involving TEXTJOIN).


A better approach I could go with:




Very similar to Peter's:


= LET(joined, BYROW(colours, LAMBDA(r,TEXTJOIN("|",,SORT(r,,,1)))),ptr, XMATCH(UNIQUE(joined),joined),INDEX(colours,ptr,{1,2,3}))



A bit of 'balloon folding' required with any of the solutions provided.