Jul 27 2024 04:15 PM
I have a 2D array of costs by month and want a sumif equivalent based on another dynamic array of Divisions see below. i.e. Jan Div 1 is in M2 so I can refer to this as M2# to get the full table Jan to May for all the data. The divisions for this table are in K2 so K2# gets those
I also have the unique divisions Div 1 to 3 in an array in K10 so K10# gets that. I want to build dynamically the totals by division table. I can easily do this row by row using
=BYCOL(M$2#, LAMBDA(c,SUMPRODUCT(($K$2#=$K10)*c))) but this just gives me the first row. I can copy down for the rest. But I want them in a single 2D array and just cannot get my head around how to do this. I tried embedding in a ToRow on list of divisions. I tried dynamic VStacking but can't figure out how to get this to work.
Each table will be fully dynamic so I need the output to be fully dynamic too
I know I can't use Sumifs but opted for sumproduct over filter. I don't mind what is used ultimately but need the simplest possible solution that others later may understand. Its always a trade off between getting the coolest solution and one that is most easy to follow
So many wiser people than me out there and this would save the day for me. Many thanks
Jul 27 2024 07:19 PM
@James_Buist You can do it by MAKEARRAY() or REDUCE() function. You can do all calculation by a single cell formula. Try the following formula. You can also download the attached file.
=LET(div,UNIQUE(A2:A7),
result,MAKEARRAY(ROWS(div),COLUMNS(C1:G1),LAMBDA(r,c,SUM(FILTER(FILTER(C2:G7,A2:A7=INDEX(div,r)),C1:G1=INDEX(C1:G1,1,c))))),
VSTACK(HSTACK(A1,C1:G1),HSTACK(div,result)))
Jul 27 2024 10:01 PM
Solution@James_Buist As a variant, here's a couple of SUMIFS options:
=LET(
arr, LAMBDA(n, CHOOSE(n, K10#, SEQUENCE(, COLUMNS(M2#)))),
MAP(arr({1}), arr({2}), LAMBDA(r,c, SUMIFS(INDEX(M2#,, c), K2#, r)))
)
-OR-
=DROP(REDUCE(0, SEQUENCE(COLUMNS(M2#)), LAMBDA(p,c, HSTACK(p, SUMIFS(INDEX(M2#,, c), K2#, K10#)))),, 1)
The latter should be more efficient because there are fewer iterations, but you probably won't notice a difference on a relatively small dataset.
Having said that, I'd be more interested to see how the actual data source is organized. Obviously, cells K2 and M2 contain formulas that reference another dataset, because you're using the spilled range operator (#). If the data source was arranged vertically in a table with columns for "Division", "Month" and "Amount", you could easily summarize the data using a single SUMIFS formula or a pivot table.
Please see the attached example file, if needed...
Jul 28 2024 02:46 AM
Jul 28 2024 03:15 AM
Thanks for your solution
As I was not looking for a full table and needed to drive it off the list of Divisions and all inputs are dynamic arrays – I had to tweak the solution. My layout means that the list of criteria can’t be generated dynamically from the list at A2# as there may be divisions which have no costs but still need to show. But easy to change. Other than that, it was a good solution and would work well. However, with this solution I also need to include the headers in the function – which is fine – but another input.
For anyone else reading, here is the tweaked result
=LET(div,A10#, MAKEARRAY(ROWS(div),COLUMNS(C2#),LAMBDA(r,c,SUM(FILTER(FILTER(C2#,A2#=INDEX(div,r)),C1#=INDEX(C1#,,c))))))
Many thanks
PS. It may be faster than the other solution I chose.
I will do some performance testing on all three with a much larger dataset then the actual one which is around 200 colums x10 rows. As you stated, I was looking to do this in a single cell which your solution did
Jul 28 2024 04:10 AM
Jul 28 2024 10:05 PM
@James_Buist MMULT is also a possibility:
=MMULT(--(K10# = TOROW(K2#)), M2#)
This is both the shortest and most efficient method for your layout. However, since you "need the simplest possible solution that others later may understand", it may or may not be the way to go, as it's not the easiest method to understand and/or modified to meet new criteria.
Regarding the MAP / CHOOSE method, this was offered as a relatively efficient alternative to the MAKEARRAY / INDEX method, but not necessarily the most efficient method overall. As with most situations in Excel, there are many different ways to get the job done. Ultimately, it comes down to the method you prefer, while maintaining efficiency relative to the size of the dataset you're working with.
To demonstrate the difference in calculation speed between MAP / CHOOSE and MAKEARRAY / INDEX, try the following comparison test:
=LET(
arr, LAMBDA(n, CHOOSE(n, SEQUENCE(1000,,,2), SEQUENCE(,1000,,5))),
MAP(arr({1}), arr({2}), LAMBDA(r,c, r+c))
)
\\compared to:
=LET(
rs, SEQUENCE(1000,,,2),
cs, SEQUENCE(,1000,,5),
MAKEARRAY(1000, 1000, LAMBDA(r,c, INDEX(rs,r)+INDEX(cs,c)))
)
On my system, MAP took only 1 second to return the results, whereas MAKEARRAY took almost 30.
(Obviously =SEQUENCE(1000,,,2)+SEQUENCE(,1000,,5) is all that's needed here, but that would defeat the purpose of the test, lol.)
Cheers!
Jul 27 2024 10:01 PM
Solution@James_Buist As a variant, here's a couple of SUMIFS options:
=LET(
arr, LAMBDA(n, CHOOSE(n, K10#, SEQUENCE(, COLUMNS(M2#)))),
MAP(arr({1}), arr({2}), LAMBDA(r,c, SUMIFS(INDEX(M2#,, c), K2#, r)))
)
-OR-
=DROP(REDUCE(0, SEQUENCE(COLUMNS(M2#)), LAMBDA(p,c, HSTACK(p, SUMIFS(INDEX(M2#,, c), K2#, K10#)))),, 1)
The latter should be more efficient because there are fewer iterations, but you probably won't notice a difference on a relatively small dataset.
Having said that, I'd be more interested to see how the actual data source is organized. Obviously, cells K2 and M2 contain formulas that reference another dataset, because you're using the spilled range operator (#). If the data source was arranged vertically in a table with columns for "Division", "Month" and "Amount", you could easily summarize the data using a single SUMIFS formula or a pivot table.
Please see the attached example file, if needed...