Forum Discussion

dolfinmagikpro's avatar
dolfinmagikpro
Copper Contributor
Nov 24, 2025
Solved

How to count duplicates ... sort of ...

Greetings!

Here is part of a table that I am working on:

RepairIDModelIDSerial NumberDate StartedDate TestedReportedSymptom
SPEA-1010PT2068FA55210/22/202510/28/2025No Boot - White Screen
SPEA-1024PT2068FA55210/22/202511/4/2025No Boot - White Screen
SPEA-1037PT2067C99BD10/22/202511/10/2025EGM No Comms
SPEA-1038PT2067154738/21/2025 EGM No Comms
SPEA-1039PT20671547311/10/202511/11/2025Failed Battery Test
SPEA-1056PT2067142AE10/20/202511/12/2025Black Screen
SPEA-1057PT20671584F10/20/202511/12/2025Black Screen
SPEA-1144PT2067142AE11/24/202511/24/2025No Boot - Black Screen
      
 

I am trying to count the number of repairs based on serial number in this table.  But there are some duplicates.  However, if you look at the rows with red font, you will see that the serial numbers, though they are the same, were not being repaired for the same reason.  So, these need to be counted as 2 separate repairs.  If the repairs to the same serial number are for the SAME symptom, we only count it once.

What would be a good approach to creating a function that will count this into a cell on a separate table?

Any assistance would be greatly appreciated.

4 Replies

  • m_tarler's avatar
    m_tarler
    Bronze Contributor

    So I think the other answers so far don't do what you want because I think you want the formula to be smart enough to see "Black Screen" as the SAME as "No Boot - Black Screen" which makes this problem expodentially harder!  My first try was to make small tweak to Oliver's solution to allow wild card charaters on each side of the search but given multiple entries with partial matches like that make it hard to get a real value (i.e. each partial match is a fraction of a normal match but those fractions could add up to more than 1 so unclear how many partial matches there are.  In the attached I show my new formula followed by all the 'other' formulas including the last one being this modification of Oliver's.  I say this in case one of my colleagues here can find a more clever way to use that to get the correct answer.

    That all said my solution is much more complicated:

    =LET(in, CHOOSECOLS(A2:F10,3,6), 
    uniqStar, LAMBDA(arr,LET(g,--ISNUMBER(SEARCH("*"&arr&"*",TRANSPOSE(arr))),corr,--(g+TRANSPOSE(g)>0),counts,MMULT(corr,EXPAND(1,ROWS(corr),1,1)),SUMPRODUCT(1/counts))),
    REDUCE(0,UNIQUE(TAKE(in,,1)),LAMBDA(p,q,p+uniqStar(FILTER(TAKE(in,,-1),TAKE(in,,1)=q)))))

    so line 1 just pulls the 2 columns of interest

    line 2 creates a LAMBDA function that will compare each value in the array to each other value in the array and find partial matches and return a result of unique matches based on including partial matches.  This is still not perfect (see example below)

    line 3 then takes each Unique Serial Number and feed it into the new 'uniqStar' function and sums the result

    in case the file doesn't attach right here is a screenshot:

    so in the first example I added an extra line SPEA-1040 to further test the examples.  in col G I have quick manual check on what I considered unique or if not unique how many dups.

    In col I you can see the results of this new formula correctly finds 6 truely unique cases.  The other formulas find 8 strictly unique cases and the last column M found 7 because of the 2 cases with partial matches each being 0.5 adding to 1 more.

    In the last example you can see the partial matches are "Black Screen", "No Boot" , and "No Boot - Black Screen" and because the partial matches aren't 1 for 1 (i.e. "Black Screen" and "No Boot" both match "No Boot - Black Screen" but not each other while "No Boot - Black Screen" finds a partial match to each) you get a fraction added.

  • IlirU's avatar
    IlirU
    Brass Contributor

     

    Apply this formula:

     

    =COUNTA(UNIQUE(TRIMRANGE(CHOOSECOLS(A2:F10, {3,6})))) / 2

     

    HTH

     

    Regards,

    IlirU

  • If i correctly understand 7 is the result for the sample dataset.

    =SUMPRODUCT(1/COUNTIFS(C2:C9,C2:C9,F2:F9,F2:F9))

     

Resources