Forum Discussion

major_johnson's avatar
major_johnson
Copper Contributor
Aug 28, 2021

Compare Two Columns in Two Sheets to Remove Duplicates and Return One Copy Only

Good Day, Friends!

I have one column in sheet1 and the other in sheet2.

Columns have duplicates, within itself as a column, as well as when both columns are compared to each other.

I would like to request to have one master column in sheet3 that would find and remove duplicates but keep one of its originals and return the text into the master column in sheet3.

Also, this master column should update and sort whenever any new names are added into either of those columns in sheet1 or sheet2.

Please note, I would especially like to have an automated column in sheet3, this means, I would like to request for an non-cse (control-shift-enter) array column.

I have attached the excel file for your review.

Thank you very much in advance for helping out, really!
  • major_johnson 

    SergeiBaklan's solution is perfect for your needs, as one might expect.  I redid the solution from scratch and there are barely any differences

     

    = LET(
        n₁, ROWS(Column1),
        n₂, ROWS(Column2),
        k, SEQUENCE(n₁+n₂),
        vstack, IF(k<=n₁, Column1, INDEX(Column2, k-n₁)),
        SORT(UNIQUE(vstack))
      )​

     If you are using anything but Excel 365, get rid of it and replace it by 365.  If you are using 365, make sure that you allow the formula to spill and do not use CSE.

    • major_johnson's avatar
      major_johnson
      Copper Contributor
      This is all great stuff, but guys, I want to do it on 2010 and with a formula.

      And yes, I do have 365 but on my personal laptop.

      This is for work and they are still using 2010.
      • PeterBartholomew1's avatar
        PeterBartholomew1
        Silver Contributor

        major_johnson 

        This takes me back to stuff I was thankful to see the back of, hopefully forever!

        First, defining  n₁, n₂ and 'combined'

        = ROWS(Table1)
        = ROWS(Table2)
        
        = IF(k<=n₁, 
            Column1, 
            IF(k<=n₁+n₂, 
              INDEX(Column2, k-n₁),
              ""
            )
          )

        one the uses MATCH to identify the first occurrence and SMALL to compact the list

        = IFERROR(
            INDEX(
              combined,
              SMALL(
                IF( MATCH(combined, combined, 0) = k, k, "" ),
                @k
              )
            ),
         "" )

        I have also shown a PowerQuery solution that works with Office 2010 but requires an additional software download.

        let
            Source = Table.Combine({Table1, Table2}),
            Distinct = Table.Distinct(Source),
            Text = Table.TransformColumnTypes(Distinct,{{"Data", type text}})
        in
            Text

        All of which serves to reinforce my dislike of conventional spreadsheets!

    • major_johnson's avatar
      major_johnson
      Copper Contributor
      Thank you very much for your reply.

      And although the what needed is 100% correct but I would really appreciate a formula instead of an array that is dragged down to a certain point and those cells are updated as accordingly.

      A formula that engages in a mixture of index and match is especially not control-shift-enter.

      I hope you understand.

      Thank you again for you help, I really do appreciate it!
      • SergeiBaklan's avatar
        SergeiBaklan
        MVP

        major_johnson 

        That is not CSE formula, that is regular formula which returns the spill automatically updated if you add new data into structured tables (Table1 and Table2). If I add some data into Table1

        and same into Table2

        resulting spill will be

Resources