Forum Discussion
Power Query - Large Data Set Question
Hello - I combined a handful of .xlsx files, about 5 million rows in total, and I'm trying to identify duplicate "project IDs" (the long strings below in rows 22-29). In Power Query I did a Group By on the project ID with a count of matching rows, so the number in this filter is the row count per ID: 2 means the ID has 1 duplicate, 3 means it has 2 duplicates, and so on.

What the filter is showing right now is every project ID with a count of 3, 4, 5, 6, etc., all the way through 14. That's exactly what I'm looking for, and I was able to get about 180k project IDs that way. The problem is the project IDs with a count of 2, i.e. exactly 1 duplicate. When I filter on that value, the result runs over Excel's limit of roughly 1 million rows per worksheet (1,048,576). Does anybody have an idea of how I could get around this problem?
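
To make the setup concrete, here is a minimal sketch of the group-by-and-count logic described above. It is written in Python rather than Power Query and runs on made-up IDs, so it only illustrates the idea and is not the actual query. The sample IDs and the names `count_by_id` and `ids_with_count` are hypothetical.

```python
from collections import Counter

# Made-up sample data standing in for the combined rows (one project ID per row).
rows = ["P-001", "P-002", "P-002", "P-003", "P-003", "P-003", "P-004", "P-004"]


def count_by_id(project_ids):
    """Count how many rows share each project ID (the 'group by' step)."""
    return Counter(project_ids)


def ids_with_count(counts, n):
    """Return the project IDs that appear exactly n times, in first-seen order."""
    return [pid for pid, c in counts.items() if c == n]


counts = count_by_id(rows)

# A count of 2 means the ID appears twice, i.e. it has exactly one duplicate.
one_duplicate = ids_with_count(counts, 2)
print(one_duplicate)
```

Note that the filtered result here has one entry per project ID, not one entry per original row, which is the same shape the Group By step produces before any rows are expanded back out.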