Forum Discussion

BF_UK's avatar
BF_UK
Copper Contributor
Mar 15, 2023
Solved

Top n most common entries in an array of numbers and how often they occur

I have a 3 column x 1000 row array that contains a variety of numbers in one or more of the columns i.e. for any given row it might have a different entry in each of the 3 columns or it may simply have an entry in 1 column with the other 2 columns being blank. The numbers range from 1111 to 9999 and can be any mix. I want to extract the 10 numbers that occur the most often irrespective of what row or column they're in, put them in a separate column showing the 10 numbers and showing the number of times each number occurs and then sort descending from most common to least common. Sounds simple but I can't work out how to do it. Can anyone help please? Thanks (Office 365)

  • BF_UK 

    =TAKE(SORT(HSTACK(UNIQUE(TOCOL(B1:D1000)),BYROW(UNIQUE(TOCOL(B1:D1000)),LAMBDA(row,COUNTIF(B1:D1000,row)))),2,-1),10)

    You can try this formula which works in my sheet.

15 Replies

  • JosWoolley's avatar
    JosWoolley
    Iron Contributor

    BF_UK 

     

    This will also discount any null strings ("") present within the range:

     

    =LET(
        ζ,B1:D1000,
        TAKE(SORT(IF({0,1},FREQUENCY(ζ,0+(0&ζ)),TOCOL(ζ)),2,-1),10)
    )
    • BF_UK's avatar
      BF_UK
      Copper Contributor
      Thank you very much for posting a solution to my problem. In the end I used the solution given by  because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again. 
    • PeterBartholomew1's avatar
      PeterBartholomew1
      Silver Contributor
      Nice. You make better use of FREQUENCY than I did. My sorting and removing duplicates for the bins was unnecessary. Since my array was generated using RANDARRAY blanks were not an issue.
      • JosWoolley's avatar
        JosWoolley
        Iron Contributor

        PeterBartholomew1 

         

        The OP did state "for any given row it might have a different entry in each of the 3 columns or it may simply have an entry in 1 column with the other 2 columns being blank". I just assumed that some or all of those "blanks" could be null strings.

  • BF_UK 

    My solution is similar to the others

    = LET(
        occurrences, TOCOL(COUNTIFS(array#,array#)),
        valuesList,  TOCOL(array#),
        sorted,      SORT(HSTACK(valuesList, occurrences),{2,1},-1),
        TAKE(UNIQUE(sorted),10)
      )

    The most significant differences are that I applied COUNTIFS to the 2D array whilst is was still a Range and only then reduced the results to a column.  I used LET to avoid excessive function nesting but did not place the formula within a LAMBDA despite the fact that I am moving in that direction.

     

    I also noticed the use of FREQUENCY as an alternative to COUNTIFS.

    = LET(
        list,        TOCOL(array#),
        distinct,    SORT(UNIQUE(list)),
        occurrences, DROP(FREQUENCY(list, distinct),-1),
        sorted,      SORT(HSTACK(distinct, occurrences),{2,1},-1),
        TAKE(sorted, 10)
      )

    That also seems to work well.

    • BF_UK's avatar
      BF_UK
      Copper Contributor
      Thank you very much for posting a solution to my problem. In the end I used the solution given by  because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again. 
  • Patrick2788's avatar
    Patrick2788
    Silver Contributor

    BF_UK 

     

    My solution:

     

    matrix is the 1000x3 array.

     

    'Vector
    =TOCOL(matrix)
    
    'Total' Lambda - run each of the numbers through a filter to obtain the counts. Stack, sort, and take top 10.
    
    =LAMBDA(a,v,LET(
        filtered, COUNT(FILTER(vector, vector = v)),
        stack, VSTACK(a, HSTACK(v, filtered)),
        TAKE(SORT(stack, 2, -1), 10)
    ))

     

    Sheet level formula:

    =REDUCE({"Number","Count"},UNIQUE(vector),Total)
    • BF_UK's avatar
      BF_UK
      Copper Contributor
      Thank you very much for posting a solution to my problem. In the end I used the solution given by  Quadruple_Pawn because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again. 
  • BF_UK 

    =TAKE(SORT(HSTACK(UNIQUE(TOCOL(B1:D1000)),BYROW(UNIQUE(TOCOL(B1:D1000)),LAMBDA(row,COUNTIF(B1:D1000,row)))),2,-1),10)

    You can try this formula which works in my sheet.

  • Hecatonchire's avatar
    Hecatonchire
    Iron Contributor

    Try

    =LET(Liste,SEQUENCE(MAX(A2:C1001),,MIN(A2:C1001)),TAKE(SORT(VSTACK(Liste,FREQUENCY(A2:C1001,Liste)),2,-1),10))

     

    BF_UK 

    • BF_UK's avatar
      BF_UK
      Copper Contributor
      Thank you very much for posting a solution to my problem. In the end I used the solution given by Quadruple_Pawn because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again. 

Resources