Forum Discussion

dipankar_de's avatar
dipankar_de
Copper Contributor
Jun 02, 2022

Help with MMULT for a range containing 100,000+ values

 I have been trying to solve a problem for some time now but unable to find an acceptable solution and hence requesting help from you. The attached has a snapshot of what I am trying to achieve. The problem I am facing is when the list goes beyond 100,000 rows, Excel runs out of resource.

 

The list contains duplicates as well as non-duplicates. I believe, if we restrict MMULT only for the duplicates, the "Out of Resource" error won't show up. I want the formula to do the following:

1. If unique (non-duplicate), simply consider the given value

2. If duplicate, then do MMULT (sum) for only the duplicate at hand and move to the next in sequence.

 

In the attached, "a", "b" and "d" are duplicated while "c" is unique. Want the formula to consider only "a" for MMULT and then separately for "b" and so on. Doing it together for "a", "b", "c" and "d" gives the "Out of Resource" error. The formula identifies "c" as unique and copies the corresponding value.

 

Please let me know if further clarification is required.

 

I wanted to upload the file but unfortunately couldn't find a link to do so.

22 Replies

  • dipankar_de 

    SUMIFS is very fast in execution and tends to beat array methods.  However, provided your data is strictly sorted, it is possible to search for the first and last entries of each contiguous range and sum that.

    = LET(
          distinct, SORT(UNIQUE(letter)),
             total, MAP(distinct, LAMBDA(chr,
                LET(
                   firstCell, XLOOKUP(chr,letter,number,,,1),
                   finalCell, XLOOKUP(chr,letter,number,,,-1),
                   SUM(firstCell:finalCell)
                )
             )
          ),
          HSTACK(distinct, total)
       )

    I have beaten SUMIFS with formulas of this kind in the past in situations in which each summation is only working off a small proportion of the values.

     

    • dipankar_de's avatar
      dipankar_de
      Copper Contributor
      Hello Peter, if, instead of one "number" column, we have, say 5 "number" columns, how would the formula change to accommodate summing up the 5 column values independently for each unique "letter"?

    • dipankar_de's avatar
      dipankar_de
      Copper Contributor
      Hello Peter, Many thanks for the help you extended. Unfortunately, I don't have the version that supports LAMBDA, HSTACK and others that are still in Beta. Hence, am limited to functions that are available in the Released O365.
      • PeterBartholomew1's avatar
        PeterBartholomew1
        Silver Contributor

        dipankar_de 

        Sorry, the HSTACK was little more than cosmetic tidying up.  This is the same function stripped right back to a relative reference.

        = SUM(XLOOKUP(@distinct,letter,number,,, 1):XLOOKUP(@distinct,letter,number,,,-1))

         

  • dipankar_de's avatar
    dipankar_de
    Copper Contributor
    The file can be accessed through the following link:

    https://1drv.ms/x/s!Aqyz05eCPz3-rTQNZBIvJOYSwdGJ?e=b2onEj
      • dipankar_de's avatar
        dipankar_de
        Copper Contributor
        The actual list has 100,000+ rows and there are 6 columns that need to be individually summed when a duplicate is encountered.

        I have tried SUMIFS but that's taking around 14 minutes to process the whole list. That's the reason I am trying to solve this with MMULT and in doing so since the list is long, MMULT is ending up consuming the whole of the memory since it's generating a 100,000+ x 100,000+ matrix.

        If somehow, the matrix can be made limited to the number of duplicates for each element (in my example a, b, c, d), possibly the result will be achieved in less than a minute.

Resources