Forum Discussion

BF_UK's avatar
BF_UK
Copper Contributor
Mar 15, 2023
Solved

Top n most common entries in an array of numbers and how often they occur

I have a 3 column x 1000 row array that contains a variety of numbers in one or more of the columns i.e. for any given row it might have a different entry in each of the 3 columns or it may simply have an entry in 1 column with the other 2 columns being blank. The numbers range from 1111 to 9999 and can be any mix. I want to extract the 10 numbers that occur the most often irrespective of what row or column they're in, put them in a separate column showing the 10 numbers and showing the number of times each number occurs and then sort descending from most common to least common. Sounds simple but I can't work out how to do it. Can anyone help please? Thanks (Office 365)

  • BF_UK 

    =TAKE(SORT(HSTACK(UNIQUE(TOCOL(B1:D1000)),BYROW(UNIQUE(TOCOL(B1:D1000)),LAMBDA(row,COUNTIF(B1:D1000,row)))),2,-1),10)

    You can try this formula which works in my sheet.

15 Replies

  • JosWoolley's avatar
    JosWoolley
    Iron Contributor

    BF_UK 

     

    This will also discount any null strings ("") present within the range:

     

    =LET(
        ζ,B1:D1000,
        TAKE(SORT(IF({0,1},FREQUENCY(ζ,0+(0&ζ)),TOCOL(ζ)),2,-1),10)
    )
    • BF_UK's avatar
      BF_UK
      Copper Contributor
      Thank you very much for posting a solution to my problem. In the end I used the solution given by  because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again. 
    • PeterBartholomew1's avatar
      PeterBartholomew1
      Silver Contributor
      Nice. You make better use of FREQUENCY than I did. My sorting and removing duplicates for the bins was unnecessary. Since my array was generated using RANDARRAY blanks were not an issue.
      • JosWoolley's avatar
        JosWoolley
        Iron Contributor

        PeterBartholomew1 

         

        The OP did state "for any given row it might have a different entry in each of the 3 columns or it may simply have an entry in 1 column with the other 2 columns being blank". I just assumed that some or all of those "blanks" could be null strings.

  • BF_UK 

    My solution is similar to the others

    = LET(
        occurrences, TOCOL(COUNTIFS(array#,array#)),
        valuesList,  TOCOL(array#),
        sorted,      SORT(HSTACK(valuesList, occurrences),{2,1},-1),
        TAKE(UNIQUE(sorted),10)
      )

    The most significant differences are that I applied COUNTIFS to the 2D array whilst is was still a Range and only then reduced the results to a column.  I used LET to avoid excessive function nesting but did not place the formula within a LAMBDA despite the fact that I am moving in that direction.

     

    I also noticed the use of FREQUENCY as an alternative to COUNTIFS.

    = LET(
        list,        TOCOL(array#),
        distinct,    SORT(UNIQUE(list)),
        occurrences, DROP(FREQUENCY(list, distinct),-1),
        sorted,      SORT(HSTACK(distinct, occurrences),{2,1},-1),
        TAKE(sorted, 10)
      )

    That also seems to work well.

    • BF_UK's avatar
      BF_UK
      Copper Contributor
      Thank you very much for posting a solution to my problem. In the end I used the solution given by  because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again. 
  • Patrick2788's avatar
    Patrick2788
    Silver Contributor

    BF_UK 

     

    My solution:

     

    matrix is the 1000x3 array.

     

    'Vector
    =TOCOL(matrix)
    
    'Total' Lambda - run each of the numbers through a filter to obtain the counts. Stack, sort, and take top 10.
    
    =LAMBDA(a,v,LET(
        filtered, COUNT(FILTER(vector, vector = v)),
        stack, VSTACK(a, HSTACK(v, filtered)),
        TAKE(SORT(stack, 2, -1), 10)
    ))

     

    Sheet level formula:

    =REDUCE({"Number","Count"},UNIQUE(vector),Total)
    • BF_UK's avatar
      BF_UK
      Copper Contributor
      Thank you very much for posting a solution to my problem. In the end I used the solution given by  Quadruple_Pawn because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again. 
  • BF_UK 

    =TAKE(SORT(HSTACK(UNIQUE(TOCOL(B1:D1000)),BYROW(UNIQUE(TOCOL(B1:D1000)),LAMBDA(row,COUNTIF(B1:D1000,row)))),2,-1),10)

    You can try this formula which works in my sheet.

  • Hecatonchire's avatar
    Hecatonchire
    Iron Contributor

    Try

    =LET(Liste,SEQUENCE(MAX(A2:C1001),,MIN(A2:C1001)),TAKE(SORT(VSTACK(Liste,FREQUENCY(A2:C1001,Liste)),2,-1),10))

     

    BF_UK 

    • BF_UK's avatar
      BF_UK
      Copper Contributor
      Thank you very much for posting a solution to my problem. In the end I used the solution given by Quadruple_Pawn because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again.