Compare Two Columns in Two Sheets to Remove Duplicates and Return One Copy Only

Copper Contributor
Good Day, Friends!

I have one column in sheet1 and the other in sheet2.

Columns have duplicates, within itself as a column, as well as when both columns are compared to each other.

I would like to request to have one master column in sheet3 that would find and remove duplicates but keep one of its originals and return the text into the master column in sheet3.

Also, this master column should update and sort whenever any new names are added into either of those columns in sheet1 or sheet2.

Please note, I would especially like to have an automated column in sheet3, this means, I would like to request for an non-cse (control-shift-enter) array column.

I have attached the excel file for your review.

Thank you very much in advance for helping out, really!
10 Replies

@major_johnson 

As variant that could be

=LET(
  ra, ROWS(Table1),
  rb, ROWS(Table2),
  k, SEQUENCE(ra+rb),
  SORT( UNIQUE( IF(k<=ra, INDEX(Table1[Column1], k), INDEX(Table2[Column2], k-ra))))
)

which returns

image.png

Thank you very much for your reply.

And although the what needed is 100% correct but I would really appreciate a formula instead of an array that is dragged down to a certain point and those cells are updated as accordingly.

A formula that engages in a mixture of index and match is especially not control-shift-enter.

I hope you understand.

Thank you again for you help, I really do appreciate it!
Also, after double checking, the master column in sheet3 does not update or correspond whenever I add or amend items.

Again, please, I need a formula.

@major_johnson 

That is not CSE formula, that is regular formula which returns the spill automatically updated if you add new data into structured tables (Table1 and Table2). If I add some data into Table1

image.png

and same into Table2

image.png

resulting spill will be

image.png

@Sergei Baklan 

 

Please see the attached file.

Oh, I am sorry, I am using MS Pro. Plus 2010. :(

@major_johnson 

@Sergei Baklan's solution is perfect for your needs, as one might expect.  I redid the solution from scratch and there are barely any differences

 

= LET(
    n₁, ROWS(Column1),
    n₂, ROWS(Column2),
    k, SEQUENCE(n₁+n₂),
    vstack, IF(k<=n₁, Column1, INDEX(Column2, k-n₁)),
    SORT(UNIQUE(vstack))
  )​

 If you are using anything but Excel 365, get rid of it and replace it by 365.  If you are using 365, make sure that you allow the formula to spill and do not use CSE.

This is all great stuff, but guys, I want to do it on 2010 and with a formula.

And yes, I do have 365 but on my personal laptop.

This is for work and they are still using 2010.

@major_johnson 

This takes me back to stuff I was thankful to see the back of, hopefully forever!

First, defining  n₁, n₂ and 'combined'

= ROWS(Table1)
= ROWS(Table2)

= IF(k<=n₁, 
    Column1, 
    IF(k<=n₁+n₂, 
      INDEX(Column2, k-n₁),
      ""
    )
  )

one the uses MATCH to identify the first occurrence and SMALL to compact the list

= IFERROR(
    INDEX(
      combined,
      SMALL(
        IF( MATCH(combined, combined, 0) = k, k, "" ),
        @k
      )
    ),
 "" )

I have also shown a PowerQuery solution that works with Office 2010 but requires an additional software download.

let
    Source = Table.Combine({Table1, Table2}),
    Distinct = Table.Distinct(Source),
    Text = Table.TransformColumnTypes(Distinct,{{"Data", type text}})
in
    Text

All of which serves to reinforce my dislike of conventional spreadsheets!