Jun 28 2022 04:53 AM
Good morning,
I am wondering if anyone could point me in the direction of where to start in removing (nearly) duplicate rows from a names and addresses database?
Customers have registered with different email addresses but the same Name and Address but may have entered the details with spelling errors, or in different fields, for instance, so although they are the same person with the same postal address, we may have multiple records for them.
Having deleted all the exact duplicates, I now find myself trying to remove the non-exact duplicates; I have been attempting to find a way of comparing the First Name, Last Name and postcode of one row to any other row but can't find a formula to do this and I am not sure this is the most efficient method.
Am I making this more complicated than it should be? I am not a big EXCEL user and struggling to work through all the formulae etc, so any top tips would be so much appreciated?
Thank you :D
Jun 28 2022 05:12 AM
=COUNTIFS($A$2:$A$19,A2,$B$2:$B$19,B2,$C$2:$C$19,C2)
You can try this formula for the attached example. If the result is greater than 1 the dataset for first name, last name and postal code appears more than once. You can adapt the ranges of the formula according to your requirement.
Jun 28 2022 05:26 AM
Jun 28 2022 05:57 AM
SolutionYes it can be done. In the attached file you can see the steps i've applied from sheet1 to sheet8.
Jun 28 2022 08:50 AM
Jun 28 2022 11:02 AM
Jun 28 2022 05:57 AM
SolutionYes it can be done. In the attached file you can see the steps i've applied from sheet1 to sheet8.