Forum Discussion
Help with MMULT for a range containing 100,000+ values
I have been trying to solve a problem for some time now but unable to find an acceptable solution and hence requesting help from you. The attached has a snapshot of what I am trying to achieve. The problem I am facing is when the list goes beyond 100,000 rows, Excel runs out of resource.
The list contains duplicates as well as non-duplicates. I believe, if we restrict MMULT only for the duplicates, the "Out of Resource" error won't show up. I want the formula to do the following:
1. If unique (non-duplicate), simply consider the given value
2. If duplicate, then do MMULT (sum) for only the duplicate at hand and move to the next in sequence.
In the attached, "a", "b" and "d" are duplicated while "c" is unique. Want the formula to consider only "a" for MMULT and then separately for "b" and so on. Doing it together for "a", "b", "c" and "d" gives the "Out of Resource" error. The formula identifies "c" as unique and copies the corresponding value.
Please let me know if further clarification is required.
I wanted to upload the file but unfortunately couldn't find a link to do so.
22 Replies
- PeterBartholomew1Silver Contributor
SUMIFS is very fast in execution and tends to beat array methods. However, provided your data is strictly sorted, it is possible to search for the first and last entries of each contiguous range and sum that.
= LET( distinct, SORT(UNIQUE(letter)), total, MAP(distinct, LAMBDA(chr, LET( firstCell, XLOOKUP(chr,letter,number,,,1), finalCell, XLOOKUP(chr,letter,number,,,-1), SUM(firstCell:finalCell) ) ) ), HSTACK(distinct, total) )I have beaten SUMIFS with formulas of this kind in the past in situations in which each summation is only working off a small proportion of the values.
- dipankar_deCopper ContributorHello Peter, if, instead of one "number" column, we have, say 5 "number" columns, how would the formula change to accommodate summing up the 5 column values independently for each unique "letter"?
- dipankar_deCopper ContributorHello Peter, Many thanks for the help you extended. Unfortunately, I don't have the version that supports LAMBDA, HSTACK and others that are still in Beta. Hence, am limited to functions that are available in the Released O365.
- PeterBartholomew1Silver Contributor
Sorry, the HSTACK was little more than cosmetic tidying up. This is the same function stripped right back to a relative reference.
= SUM(XLOOKUP(@distinct,letter,number,,, 1):XLOOKUP(@distinct,letter,number,,,-1))
- dipankar_deCopper ContributorThe file can be accessed through the following link:
https://1drv.ms/x/s!Aqyz05eCPz3-rTQNZBIvJOYSwdGJ?e=b2onEj- Riny_van_EekelenPlatinum Contributor
dipankar_de Thanks! Not sure what I'm looking at but why not simply use this in C20:
=SUMIF(B2#,UNIQUE(B2#),C2#)- dipankar_deCopper ContributorThe actual list has 100,000+ rows and there are 6 columns that need to be individually summed when a duplicate is encountered.
I have tried SUMIFS but that's taking around 14 minutes to process the whole list. That's the reason I am trying to solve this with MMULT and in doing so since the list is long, MMULT is ending up consuming the whole of the memory since it's generating a 100,000+ x 100,000+ matrix.
If somehow, the matrix can be made limited to the number of duplicates for each element (in my example a, b, c, d), possibly the result will be achieved in less than a minute.
- Riny_van_EekelenPlatinum Contributor
dipankar_de You can share your file via Onedrive, Google Docs, Dropbox or similar.