Forum Discussion
SkinnyMuldoon
Jun 28, 2022Copper Contributor
Editing database of 25000 Names and Addresses
Good morning,
I am wondering if anyone could point me in the direction of where to start in removing (nearly) duplicate rows from a names and addresses database?
Customers have registered with different email addresses but the same Name and Address but may have entered the details with spelling errors, or in different fields, for instance, so although they are the same person with the same postal address, we may have multiple records for them.
Having deleted all the exact duplicates, I now find myself trying to remove the non-exact duplicates; I have been attempting to find a way of comparing the First Name, Last Name and postcode of one row to any other row but can't find a formula to do this and I am not sure this is the most efficient method.
Am I making this more complicated than it should be? I am not a big EXCEL user and struggling to work through all the formulae etc, so any top tips would be so much appreciated?
Thank you 😄
Yes it can be done. In the attached file you can see the steps i've applied from sheet1 to sheet8.
5 Replies
Sort By
- OliverScheurichGold Contributor
=COUNTIFS($A$2:$A$19,A2,$B$2:$B$19,B2,$C$2:$C$19,C2)
You can try this formula for the attached example. If the result is greater than 1 the dataset for first name, last name and postal code appears more than once. You can adapt the ranges of the formula according to your requirement.
- SkinnyMuldoonCopper ContributorHi,
Thank you for taking the time to reply.
That would work to identify the duplicates - thank you - but then I will need to delete the duplicates/triplicates etc. Do you think this is possible?- OliverScheurichGold Contributor
Yes it can be done. In the attached file you can see the steps i've applied from sheet1 to sheet8.