Feb 01 2024 04:58 AM
Hi
I have a sheet with 100,000+ rows and some of them are duplicates that I need to identify and delete from the system
how would i do it?
Feb 01 2024 05:47 AM - edited Feb 01 2024 05:48 AM
If you want to see all the lines that are duplicate you can use:
=FILTER(range, ISERROR(XMATCH(range, UNIQUE(range,,1))),"")
note that this will show ALL the duplicates so if you delete all the lines it found you wont have it in the list at all. So if "John" is listed 2x it will show "John" 2x. This could be useful if this is part of a bigger table and you need to decide which line(s) to delete (replace 1st 'range' with the table range to see more columns).
If you are just trying to get a 'clean' list without duplicates it would be much easier to just get the UNIQUE() list. Then if you want to delete the list with duplicates, copy the resulting UNIQUE list and Paste As Values and then you no longer need the original.
Feb 01 2024 06:29 AM
=IF(COUNTIF($A$2:$A$23,A2)>1,A2,"")
If the result has to be the same size of 100000+ rows you can apply this formula to remove the unique values or
=IF(COUNTIF($A$2:$A$23,A2)=1,A2,"")
to remove the duplicates.
=BYROW(A2:A23,LAMBDA(x,IF(COUNTIF(A2:A23,x)>1,x,"")))
This formula would spill the result but is only available in Office 365 and Excel for the web.
Spilled result with LAMBDA in Excel for the web or Office 365:
Result in legacy Excel e.g. Excel 2013:
Feb 01 2024 06:38 AM
Hi, @Eli_Cohen830,
1. The easiest way is to use the "Remove duplicates" button in the "Data" menu.
2. To view duplicate values, you can use the dedicated "Show duplicates" button in the "Conditional formatting" menu.
I recommend - above buttons add to the Quick Access Toolbar, e.g. like mine:
Regards,
Sergiusz