Forum Discussion
How to count duplicates ... sort of ...
Greetings!
Here is part of a table that I am working on:
| RepairID | ModelID | Serial Number | Date Started | Date Tested | ReportedSymptom |
| SPEA-1010 | PT206 | 8FA552 | 10/22/2025 | 10/28/2025 | No Boot - White Screen |
| SPEA-1024 | PT206 | 8FA552 | 10/22/2025 | 11/4/2025 | No Boot - White Screen |
| SPEA-1037 | PT206 | 7C99BD | 10/22/2025 | 11/10/2025 | EGM No Comms |
| SPEA-1038 | PT206 | 715473 | 8/21/2025 | EGM No Comms | |
| SPEA-1039 | PT206 | 715473 | 11/10/2025 | 11/11/2025 | Failed Battery Test |
| SPEA-1056 | PT206 | 7142AE | 10/20/2025 | 11/12/2025 | Black Screen |
| SPEA-1057 | PT206 | 71584F | 10/20/2025 | 11/12/2025 | Black Screen |
| SPEA-1144 | PT206 | 7142AE | 11/24/2025 | 11/24/2025 | No Boot - Black Screen |
I am trying to count the number of repairs based on serial number in this table. But there are some duplicates. However, if you look at the rows with red font, you will see that the serial numbers, though they are the same, were not being repaired for the same reason. So, these need to be counted as 2 separate repairs. If the repairs to the same serial number are for the SAME symptom, we only count it once.
What would be a good approach to creating a function that will count this into a cell on a separate table?
Any assistance would be greatly appreciated.
Here is another way you may try-
=ROWS(UNIQUE(HSTACK(C.:.C,F.:.F)))-1
4 Replies
- m_tarlerBronze Contributor
So I think the other answers so far don't do what you want because I think you want the formula to be smart enough to see "Black Screen" as the SAME as "No Boot - Black Screen" which makes this problem expodentially harder! My first try was to make small tweak to Oliver's solution to allow wild card charaters on each side of the search but given multiple entries with partial matches like that make it hard to get a real value (i.e. each partial match is a fraction of a normal match but those fractions could add up to more than 1 so unclear how many partial matches there are. In the attached I show my new formula followed by all the 'other' formulas including the last one being this modification of Oliver's. I say this in case one of my colleagues here can find a more clever way to use that to get the correct answer.
That all said my solution is much more complicated:
=LET(in, CHOOSECOLS(A2:F10,3,6), uniqStar, LAMBDA(arr,LET(g,--ISNUMBER(SEARCH("*"&arr&"*",TRANSPOSE(arr))),corr,--(g+TRANSPOSE(g)>0),counts,MMULT(corr,EXPAND(1,ROWS(corr),1,1)),SUMPRODUCT(1/counts))), REDUCE(0,UNIQUE(TAKE(in,,1)),LAMBDA(p,q,p+uniqStar(FILTER(TAKE(in,,-1),TAKE(in,,1)=q)))))so line 1 just pulls the 2 columns of interest
line 2 creates a LAMBDA function that will compare each value in the array to each other value in the array and find partial matches and return a result of unique matches based on including partial matches. This is still not perfect (see example below)
line 3 then takes each Unique Serial Number and feed it into the new 'uniqStar' function and sums the result
in case the file doesn't attach right here is a screenshot:
so in the first example I added an extra line SPEA-1040 to further test the examples. in col G I have quick manual check on what I considered unique or if not unique how many dups.
In col I you can see the results of this new formula correctly finds 6 truely unique cases. The other formulas find 8 strictly unique cases and the last column M found 7 because of the 2 cases with partial matches each being 0.5 adding to 1 more.
In the last example you can see the partial matches are "Black Screen", "No Boot" , and "No Boot - Black Screen" and because the partial matches aren't 1 for 1 (i.e. "Black Screen" and "No Boot" both match "No Boot - Black Screen" but not each other while "No Boot - Black Screen" finds a partial match to each) you get a fraction added.
- Harun24HRBronze Contributor
- IlirUBrass Contributor
Apply this formula:
=COUNTA(UNIQUE(TRIMRANGE(CHOOSECOLS(A2:F10, {3,6})))) / 2HTH
Regards,
IlirU
- OliverScheurichGold Contributor
If i correctly understand 7 is the result for the sample dataset.
=SUMPRODUCT(1/COUNTIFS(C2:C9,C2:C9,F2:F9,F2:F9))