Forum Discussion
Top n most common entries in an array of numbers and how often they occur
I have a 3 column x 1000 row array that contains a variety of numbers in one or more of the columns i.e. for any given row it might have a different entry in each of the 3 columns or it may simply have an entry in 1 column with the other 2 columns being blank. The numbers range from 1111 to 9999 and can be any mix. I want to extract the 10 numbers that occur the most often irrespective of what row or column they're in, put them in a separate column showing the 10 numbers and showing the number of times each number occurs and then sort descending from most common to least common. Sounds simple but I can't work out how to do it. Can anyone help please? Thanks (Office 365)
=TAKE(SORT(HSTACK(UNIQUE(TOCOL(B1:D1000)),BYROW(UNIQUE(TOCOL(B1:D1000)),LAMBDA(row,COUNTIF(B1:D1000,row)))),2,-1),10)
You can try this formula which works in my sheet.
15 Replies
- JosWoolleyIron Contributor
This will also discount any null strings ("") present within the range:
=LET( ζ,B1:D1000, TAKE(SORT(IF({0,1},FREQUENCY(ζ,0+(0&ζ)),TOCOL(ζ)),2,-1),10) )
- BF_UKCopper ContributorThank you very much for posting a solution to my problem. In the end I used the solution given by because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again.
- PeterBartholomew1Silver ContributorNice. You make better use of FREQUENCY than I did. My sorting and removing duplicates for the bins was unnecessary. Since my array was generated using RANDARRAY blanks were not an issue.
- JosWoolleyIron Contributor
The OP did state "for any given row it might have a different entry in each of the 3 columns or it may simply have an entry in 1 column with the other 2 columns being blank". I just assumed that some or all of those "blanks" could be null strings.
- PeterBartholomew1Silver Contributor
My solution is similar to the others
= LET( occurrences, TOCOL(COUNTIFS(array#,array#)), valuesList, TOCOL(array#), sorted, SORT(HSTACK(valuesList, occurrences),{2,1},-1), TAKE(UNIQUE(sorted),10) )
The most significant differences are that I applied COUNTIFS to the 2D array whilst is was still a Range and only then reduced the results to a column. I used LET to avoid excessive function nesting but did not place the formula within a LAMBDA despite the fact that I am moving in that direction.
I also noticed the use of FREQUENCY as an alternative to COUNTIFS.
= LET( list, TOCOL(array#), distinct, SORT(UNIQUE(list)), occurrences, DROP(FREQUENCY(list, distinct),-1), sorted, SORT(HSTACK(distinct, occurrences),{2,1},-1), TAKE(sorted, 10) )
That also seems to work well.
- BF_UKCopper ContributorThank you very much for posting a solution to my problem. In the end I used the solution given by because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again.
- Patrick2788Silver Contributor
My solution:
matrix is the 1000x3 array.
'Vector =TOCOL(matrix) 'Total' Lambda - run each of the numbers through a filter to obtain the counts. Stack, sort, and take top 10. =LAMBDA(a,v,LET( filtered, COUNT(FILTER(vector, vector = v)), stack, VSTACK(a, HSTACK(v, filtered)), TAKE(SORT(stack, 2, -1), 10) ))
Sheet level formula:
=REDUCE({"Number","Count"},UNIQUE(vector),Total)
- BF_UKCopper ContributorThank you very much for posting a solution to my problem. In the end I used the solution given by Quadruple_Pawn because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again.
- Patrick2788Silver Contributor
- OliverScheurichGold Contributor
=TAKE(SORT(HSTACK(UNIQUE(TOCOL(B1:D1000)),BYROW(UNIQUE(TOCOL(B1:D1000)),LAMBDA(row,COUNTIF(B1:D1000,row)))),2,-1),10)
You can try this formula which works in my sheet.
- HecatonchireIron Contributor
Try
=LET(Liste,SEQUENCE(MAX(A2:C1001),,MIN(A2:C1001)),TAKE(SORT(VSTACK(Liste,FREQUENCY(A2:C1001,Liste)),2,-1),10))
- BF_UKCopper ContributorThank you very much for posting a solution to my problem. In the end I used the solution given by Quadruple_Pawn because they showed a table and a result. No other reason than that, It worked perfectly. I could not have done this on my own so thank you so much for answering my question. It's fantastic to see the Excel community in action. Thank you once again.