SOLVED

Removing duplicates

Copper Contributor

 

How do I remove duplicate values associated with a name only?  I have a number of the duplicate values that are repeated across a table range and correspond to different names (in the column to the right of the values.)

 

I only want to delete duplicate values associated with each corresponding name only, and not all duplicate values in the table range.  Thanks.

 

11 Replies

Hey @MMcDonal 

 

Try This:

 

* Select the range you want to remove the duplicates from, Go into the Data Tab, under the data tools group click Remove Duplicates:

* Check the two column you want to remove the Duplicates from and click OK

 

1.PNG

 

 

@Excel_World_Champion 

 

Thanks EWC.  I believe your method would remove all duplicate values irrespective of the duplicate values based on different sources.

 

See screenshot below of a larger spreadsheet where there are duplicate values.  My goal is to remove only duplicate values by source only.  So in this case, removal of all duplicate values by source (name), including the $3,000 duplicate value for GenD, but not for Henry S. and HewP since they are separate sources/names. 

 

The $3,000 values for GenD and HenryS should remain intact/not removed, since they are separate sources of the duplicate values within the range (spreadsheet).

 

Removal of Duplicate Values per Each Source.png

 

Thanks!

 

MMcDonal

@MMcDonal 

To remove by source select all columns in your range and check only one with source

image.png

Result is

image.png

@Sergei Baklan 

 

Thank you Sergei, particularly since it is a function I will need to use repeatedly on numerous similar projects going forward.  

@MMcDonal , Power Query could be a variant

@Sergei Baklan

 

Hi Sergei.  Coming back to you after 4 months. Hope all's well with you. 

 

You were most helpful to me in July.  Was hoping you could you help me out with the following?

 

I have a large spreadsheet with multiple duplicate values from the same client source. However, there are different policies assigned to the client source that I do not wish to remove when applying the  "remove duplicates" function.

 

The attached sample shows duplicate values for different policies for the same client name. The source of the duplicate values (losses) is the client name and one or several of the policy (numbers) for that client name. For my needs, it does not matter which of the policies make up the duplicate value. 

 

I would like to retain all policies relating to the same client source but reflect the duplicate value only once, as shown in the attached sample. 

 

Thanks. 

 

 

@MMcDonal 

In your latest sample it's not removing of duplicates for which you remove entire rows based on some criteria. Here you'd like to empty repeating values in first and second columns keeping the rest untouched.

Better to do that with formulas in these first columns. Logic of formula depends on logic of the data structuring - will it be sorted/filtered or not; is it necessary to treat differently zero values and empty cells; shall we empty these values only for sequentially repeating values in 4th column or that doesn't matter. For example, in your sample you empty values for GGG Demolition, Inc. but not for Bairos Recycling Inc, which also is repeating with non-zero values, but not one by one.

 

For example, in A3 it could be

=IF( (D3=D2)*(A2>0),"", SUMPRODUCT(('N:\Audit Database\2019 Audits\Los Angeles\November 2019\Environmental\Losses\[Environmental Loss Run 10.14.19.xlsx]Sheet1'!$A:$B=D3)*1))

, similar in B3 and we may drag them down.

 

Perhaps above formulas shall be more complex, but again, it depends logic of data itself.

Thanks Sergei. I will take a look at your formula and see if it works for my purpose.
You raised some good questions: (a) The data does not have to be filtered/sorted; (b) there is no difference between cells with zero values and empty cells. (I removed the repeated (non-zero) values for GGG Demolition under to "To This" data set and left those cells blank/empty, and (c) I overlooked the second/lower $810 entry for Bairos Recycling, which I should have cleared/deleted under the "To This" data group, same as I did with GGG Demolition.
Again, thanks.
best response confirmed by MMcDonal (Copper Contributor)
Solution

@MMcDonal 

Thank you for the clarification. With this I'd use the formula as

=IF( COUNTIF($D$2:$D2,$D3),0, SUMPRODUCT(('N:\Audit Database\2019 Audits\Los Angeles\November 2019\Environmental\Losses\[Environmental Loss Run 10.14.19.xlsx]Sheet1'!$A:$B=D3)*1))

and apply to column custom number format as

General;General;"-"??;@

to hide zeros. 

@Sergei Baklan 

 

This is [very] good.   Thanks Sergei for your time, effort, and expertise.

 

Michael

@MMcDonal 

Michael, you are welcome

1 best response

Accepted Solutions
best response confirmed by MMcDonal (Copper Contributor)
Solution

@MMcDonal 

Thank you for the clarification. With this I'd use the formula as

=IF( COUNTIF($D$2:$D2,$D3),0, SUMPRODUCT(('N:\Audit Database\2019 Audits\Los Angeles\November 2019\Environmental\Losses\[Environmental Loss Run 10.14.19.xlsx]Sheet1'!$A:$B=D3)*1))

and apply to column custom number format as

General;General;"-"??;@

to hide zeros. 

View solution in original post