Forum Discussion
tamlynn
Jun 26, 2023Copper Contributor
Unique row values
Hi there! I need some help with a large excel doc in office 365. It has 50,000 rows and 10 columns (the first two columns are first name and last name). The issue: names are appearing in duplicate, t...
Lorenzo
Jun 27, 2023Silver Contributor
Hi tamlynn
Your link doesn't work and IMHO mock data are useless when it comes to identifying duplicates, especially with text values where the unexpected can be expected :))
With 50k rows attached is a Power Query option that does - on the first 2 columns:
- Clean as much as possible invisible/unprintable chars
- Remove extra spaces everywhere
- Capitalize 2nd column (if you want Last Name this can be changed easily)
NB
- If you have exceptions like "DA - VINCI", "DA-VINCI" and "DA - VINCI" you'll get 2 distints
- If your data result of a copy/paste from the Web or another application it remains possible that some invisible/unprintable chars won't be removed and consequently impact the result
How to use:
- Download the attached file
- Copy/Paste as values your data in TableSource (sheet Data Source)
(you can rename the table column names the query will auto. adjust)
- Go to Excel Data tab > Refresh All
You get 2 results: Distincts & Duplicates
(In TableSource examples all colored cells have extra spaces here & there)