Forum Discussion

gouravsaini's avatar
gouravsaini
Copper Contributor
Jul 05, 2022
Solved

Remove Duplicates from multiple columns

I have data in which i have duplicates but in different columns for ex :

Coulmn AColumn B
12
21
34
43

but now my output should look like

Column AColumn B
12
34

how can i achieve this using Excel or Sql ?

  • gouravsaini 

    This is one of the situations is made difficult by the fact that Excel array formulas throw a wobbly when faced with nested arrays.  Clearly one would like to sort each row and then extract the unique rows but that results in a completely unnecessary #CALC! error.  

    One workaround is to concatenate the sorted rows and apply the UNIQUE function to the resulting strings.  I then search for the distinct strings within the full set to return the index of the first occurrence.  From there, back to the original data knowing the row numbers.

    = LET(
         joined, BYROW(data, LAMBDA(r, TEXTJOIN("|",,SORT(r,,,1)))),
         ptr,    XMATCH(UNIQUE(joined),joined),
         INDEX(data,ptr,{1,2})
      )

    The stated reason for the limitation is that "Excel has never supported arrays of arrays", but then, the support for arrays was somewhat dubious.  So, in order to ensure backward compatibility with legacy solutions, we make the present difficult and store up 'backward compatibility' problems for the future.

1 Reply

  • gouravsaini 

    This is one of the situations is made difficult by the fact that Excel array formulas throw a wobbly when faced with nested arrays.  Clearly one would like to sort each row and then extract the unique rows but that results in a completely unnecessary #CALC! error.  

    One workaround is to concatenate the sorted rows and apply the UNIQUE function to the resulting strings.  I then search for the distinct strings within the full set to return the index of the first occurrence.  From there, back to the original data knowing the row numbers.

    = LET(
         joined, BYROW(data, LAMBDA(r, TEXTJOIN("|",,SORT(r,,,1)))),
         ptr,    XMATCH(UNIQUE(joined),joined),
         INDEX(data,ptr,{1,2})
      )

    The stated reason for the limitation is that "Excel has never supported arrays of arrays", but then, the support for arrays was somewhat dubious.  So, in order to ensure backward compatibility with legacy solutions, we make the present difficult and store up 'backward compatibility' problems for the future.

Resources