Forum Discussion
Help with MMULT for a range containing 100,000+ values
Sorry, the HSTACK was little more than cosmetic tidying up. This is the same function stripped right back to a relative reference.
= SUM(XLOOKUP(@distinct,letter,number,,, 1):XLOOKUP(@distinct,letter,number,,,-1))
Is there a way to implement the solution without MAP / LAMBDA.
- PeterBartholomew1Jun 02, 2022Silver Contributor
So you still have something to look forward to! I couldn't wait, so put my semi-annual update aside and licenced a personal version (insider beta) in order to get to grips with the tumultuous changes.
If you are prepared to accept relative referencing and copy-fill-down then the formula
= SUM( XLOOKUP(@distinct,letter,number,,, 1) : XLOOKUP(@distinct,letter,number,,,-1) )will do the same job (green header).
- dipankar_deJun 02, 2022Copper ContributorHello Peter, I am not really keen to implement copy-fill-down formulas. Reason being sheer size of everchanging data which might run way past the copy-filled formula range.
Since this tool will be run by junior / inexperienced resources, even if I incorporate certain checks to validate correctness / completeness of data, they can very well overlook and ignore such checks and send out an incorrect utilization report. Hence, summarily trying to avoid copy-fill-down formulas. - SergeiBaklanJun 02, 2022Diamond Contributor
It all depends on concrete data. I assume we shall sumif 100000 range with > 7500 criteria elements. On my relatively slow PC
=SUMIFS( <100000 rows sum range>, <100000 rows names range>, <10000 rows unique names range> )
took about 15 sec.
I don't think that any formula with dynamic arrays, in general any other formula, will be working faster.
To compare, refresh of Power Query aggregation for the same dataset takes about 4 sec and 2-3 sec more if aggregate 8 columns at once.
- dipankar_deJun 02, 2022Copper ContributorHello Sergei, the actual count I have is 182,052 of which 94,318 are unique which is ~52% of the total records/rows. This is just representative data and is expected to increase over time. the processing is happening on a Dell Latitude 3400 (Intel i5 processor with 16GB RAM) laptop and my experience with SUMIFS on this laptop on the aforesaid data is rather slow compared to dynamic array formulas.
I am trying to avoid PowerQuery since it will require refresh every time data gets changed. This tool will be run periodically by junior / inexperienced resources who might miss out to refresh after data change and in turn generate an incorrect report / dashboard.
The tool being currently developed is for analyzing and reporting weekly Time-Sheet data (resource utilization) to the leadership and is expected to get extended to include invoicing to clients for T&M contracts which will comprise reconciliation of Time-Sheet data with client captured utilization information.
The whole process will contain a series of Excel workbooks, each performing a specific task, in a predefined sequence.