Mar 15 2023 09:23 AM
I have a 3 column x 1000 row array that contains a variety of numbers in one or more of the columns i.e. for any given row it might have a different entry in each of the 3 columns or it may simply have an entry in 1 column with the other 2 columns being blank. The numbers range from 1111 to 9999 and can be any mix. I want to extract the 10 numbers that occur the most often irrespective of what row or column they're in, put them in a separate column showing the 10 numbers and showing the number of times each number occurs and then sort descending from most common to least common. Sounds simple but I can't work out how to do it. Can anyone help please? Thanks (Office 365)
Mar 15 2023 09:50 AM
Try
=LET(Liste,SEQUENCE(MAX(A2:C1001),,MIN(A2:C1001)),TAKE(SORT(VSTACK(Liste,FREQUENCY(A2:C1001,Liste)),2,-1),10))
Mar 15 2023 09:51 AM
Solution=TAKE(SORT(HSTACK(UNIQUE(TOCOL(B1:D1000)),BYROW(UNIQUE(TOCOL(B1:D1000)),LAMBDA(row,COUNTIF(B1:D1000,row)))),2,-1),10)
You can try this formula which works in my sheet.
Mar 15 2023 10:19 AM
My solution:
matrix is the 1000x3 array.
'Vector
=TOCOL(matrix)
'Total' Lambda - run each of the numbers through a filter to obtain the counts. Stack, sort, and take top 10.
=LAMBDA(a,v,LET(
filtered, COUNT(FILTER(vector, vector = v)),
stack, VSTACK(a, HSTACK(v, filtered)),
TAKE(SORT(stack, 2, -1), 10)
))
Sheet level formula:
=REDUCE({"Number","Count"},UNIQUE(vector),Total)
Mar 15 2023 01:09 PM
My solution is similar to the others
= LET(
occurrences, TOCOL(COUNTIFS(array#,array#)),
valuesList, TOCOL(array#),
sorted, SORT(HSTACK(valuesList, occurrences),{2,1},-1),
TAKE(UNIQUE(sorted),10)
)
The most significant differences are that I applied COUNTIFS to the 2D array whilst is was still a Range and only then reduced the results to a column. I used LET to avoid excessive function nesting but did not place the formula within a LAMBDA despite the fact that I am moving in that direction.
I also noticed the use of FREQUENCY as an alternative to COUNTIFS.
= LET(
list, TOCOL(array#),
distinct, SORT(UNIQUE(list)),
occurrences, DROP(FREQUENCY(list, distinct),-1),
sorted, SORT(HSTACK(distinct, occurrences),{2,1},-1),
TAKE(sorted, 10)
)
That also seems to work well.
Mar 15 2023 01:17 PM - edited Mar 15 2023 01:19 PM
This will also discount any null strings ("") present within the range:
=LET(
ζ,B1:D1000,
TAKE(SORT(IF({0,1},FREQUENCY(ζ,0+(0&ζ)),TOCOL(ζ)),2,-1),10)
)
Mar 15 2023 03:08 PM
Mar 15 2023 10:46 PM
The OP did state "for any given row it might have a different entry in each of the 3 columns or it may simply have an entry in 1 column with the other 2 columns being blank". I just assumed that some or all of those "blanks" could be null strings.
Mar 16 2023 02:46 AM
Mar 16 2023 01:52 PM
Another possibility is to use the TOCOL function to remove the blanks
= LET(
ζ, TOCOL(data,1),
TAKE(SORT(HSTACK(ζ,FREQUENCY(ζ,ζ)),2,-1),10)
)
Mar 16 2023 10:59 PM
Unfortunately that only works with 'genuine' blanks. Any null strings present within the range will cause the second column in the output to contain one or more #N/A returns.
Regards
Mar 19 2023 02:18 PM
Mar 19 2023 04:07 PM
Mar 20 2023 01:49 PM
Mar 20 2023 01:53 PM - edited Mar 20 2023 01:54 PM
Mar 21 2023 04:19 PM