SOLVED

EXCEL - Remove 1st record of Duplicates and retain the remaining records

New Contributor

Hi There.  I have an excel table with 3 fields: Email, Date, TOTAL.

 

There are many repeating email addresses, for example abc@gmail.com may have multiple purchases. I would like to remove the first record for each duplicate email and retain every other record. So for example, if abc@gmail.com made 4 purchases, I want to remove the first instance record and then retain the last 3 instances.  I want to remove the record where Total is €100 and keep the next 3 records so that I can do a calculation to sum totals for remaining records, in this example add 50+80+200.  In summary, I want to retain all records, except the first record where there is a duplicate. Many Thanks

4 Replies

@eithneYOT 

Add a column to the right and name it Remove.

Enter the following formula in D2:

=AND(A2=A3,A2<>A1)

Excel should automatically propagate it to the rows below.

S0906.png

Filter the Remove column for TRUE.

S0907.png

Select the rows except for the header row, and delete them.

You can now delete the Remove column; this will automatically remove the filter.

S0908.png

@Hans Vogelaar, Thank you so much.  I am very grateful for you prompt response.  This is so simple and effective.  Hans, They have now changed their criteria, would you mind looking at question below?

 

In the example attached for abc@gmail.com, there are 4 records, with 2 different Transaction No's.

abc@gmail.com has 2 separate transactions, first transaction no is 12345 with 2 purchases ~(€100 & €50).  I want to remove all records for the first transaction no. and retain all subsequent records for that email address.  There can be multiple records per Transaction No..  For 123@hotmail.com, there are 3 transactions.  I want to remove the first record (i.e. for transaction 23232) and retain subsequent records. 

@eithneYOT 

You can use the same idea, but with a different formula in the Remove column.

If you have Microsoft 365 or Office 2021:

=SUM(--(LEN(UNIQUE(FILTER(B$5:B5,C$5:C5=C5,"")))>0))=1

S0909.png

If you have an older version, use the following array formula, confirmed with Ctrl+Shift+Enter:

=SUM(--(FREQUENCY(IF(C$5:C5=C5,MATCH(B$5:B5,B$5:B5,0)),ROW(B$5:B5)-ROW(B$5)+1)>0))=1

Again, filter for TRUE, remove the remaining rows, then delete the extra column.

best response confirmed by eithneYOT (New Contributor)
Solution

@Hans Vogelaar THANK YOU - You are a star