SOLVED

How do I choose which row to keep in Remove Duplicates?


Hiya.

 

I have several thousand rows that have data in the URL column in common.

 

I want to Remove Duplicates using data in this column to signify which row is a duplicate.

 

But data elsewhere in each row may be different.

 

While I want to remove duplicates based on that URL data, I also want to choose which rows to keep based on the contents of another cell.

 

  • Say two rows have the same URL - example.com
  • In the Paragraphs column, one row has 0, and the other has any other number.

How can I remove only the duplicate row that has 0 in the Paragraphs column?

 

Thank you!

3 Replies
best response confirmed by thacknology (Copper Contributor)
Solution

@thacknology

 

Hi,

 

This can be done by using a formula in a helper column to identify the rows that you need to delete.

 

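I suggest this formula, which assumes the URLs are in column B and the Paragraphs values are in column A, with data in rows 2 to 10 (adjust the range to fit your data):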

=IF(COUNTIF($B$2:$B$10,B2)>1,IF(A2=0,"To be removed",""),"")

[Screenshot: Delete specific columns.png]

 

After that, you can sort the helper column from A to Z so the marked rows are grouped together, then select the marked rows and delete them.
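For anyone who wants to check the logic outside Excel, here is a minimal Python sketch of what the helper-column formula does. The sample rows and the `flag_rows` name are made up for illustration; each tuple stands in for one worksheet row, with the Paragraphs value first (column A) and the URL second (column B).

```python
from collections import Counter

# Hypothetical sample data: (paragraphs, url) pairs standing in for columns A and B.
rows = [
    (0, "example.com"),
    (5, "example.com"),
    (0, "example.org"),
    (3, "example.net"),
    (0, "example.net"),
]


def flag_rows(rows):
    """Return one flag per row, mirroring the helper-column formula."""
    # Counter plays the role of COUNTIF over the URL column.
    url_counts = Counter(url for _, url in rows)
    return [
        "To be removed" if url_counts[url] > 1 and paragraphs == 0 else ""
        for paragraphs, url in rows
    ]


flags = flag_rows(rows)

# Deleting the marked rows corresponds to keeping only the unflagged ones.
kept = [row for row, flag in zip(rows, flags) if flag == ""]
```

One thing worth knowing: if every row for a given URL has 0 in Paragraphs, all of them are marked, so that URL would disappear entirely once the marked rows are deleted.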

[Screenshot: Delete specific columns2.png]

 

 

Hope that helps

It's impossible that this answer can be bettered.

Thank you so much for taking the time to answer. It's greatly appreciated.

@Haytham Amairah Thank you for this information! This solves part of my issue. I have an ID in column A (some unique, some duplicated), a version number in column B (blank is the first version, then version 1, version 2, etc.), and an Amount in column C. I want to sum the Amount in column C for the highest version (column B) of each ID (column A), counting each ID only once.

 

Also, there may be more than one row for the same ID with the same version but different amounts. Could these be flagged for human review (e.g. with #Error) when they share the highest version, so that someone can decide which one to sum?
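The rule described in this follow-up can be sketched in Python to make the intended logic concrete. This is only an illustration, not a worksheet formula: the `summarize` and `version_rank` names and the sample rows are hypothetical, and it assumes that versions are stored as whole numbers with a blank meaning the first version (treated as 0).

```python
from collections import defaultdict


def version_rank(version):
    """Treat a blank version as 0 (assumption: blank is the first version)."""
    return 0 if version in ("", None) else int(version)


def summarize(rows):
    """Sum the amount of the highest version per ID; list IDs that need review."""
    by_id = defaultdict(list)
    for id_, version, amount in rows:
        by_id[id_].append((version_rank(version), amount))

    total = 0
    needs_review = []
    for id_, entries in by_id.items():
        top = max(rank for rank, _ in entries)
        # Distinct amounts recorded at the highest version for this ID.
        top_amounts = {amount for rank, amount in entries if rank == top}
        if len(top_amounts) > 1:
            needs_review.append(id_)  # conflicting amounts: leave for a human
        else:
            total += top_amounts.pop()  # counted once, even if the row repeats
    return total, needs_review


# Hypothetical sample data: (ID, version, amount) for columns A, B and C.
rows = [
    ("A1", "", 100),
    ("A1", 1, 120),
    ("A2", "", 50),
    ("A3", 2, 70),
    ("A3", 2, 80),
    ("A4", 1, 30),
    ("A4", 1, 30),
]

total, needs_review = summarize(rows)
```

In this sample, A3 has two different amounts at its highest version, so it is left out of the total and listed for review instead.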
