Forum Discussion
Generating a 2D array from sumifs on the columns of another 2D array
- Jul 28, 2024
James_Buist As a variant, here's a couple of SUMIFS options:
=LET( arr, LAMBDA(n, CHOOSE(n, K10#, SEQUENCE(, COLUMNS(M2#)))), MAP(arr({1}), arr({2}), LAMBDA(r,c, SUMIFS(INDEX(M2#,, c), K2#, r))) )
-OR-
=DROP(REDUCE(0, SEQUENCE(COLUMNS(M2#)), LAMBDA(p,c, HSTACK(p, SUMIFS(INDEX(M2#,, c), K2#, K10#)))),, 1)
The latter should be more efficient because there are fewer iterations, but you probably won't notice a difference on a relatively small dataset.
Having said that, I'd be more interested to see how the actual data source is organized. Obviously, cells K2 and M2 contain formulas that reference another dataset, because you're using the spilled range operator (#). If the data source was arranged vertically in a table with columns for "Division", "Month" and "Amount", you could easily summarize the data using a single SUMIFS formula or a pivot table.
Please see the attached example file, if needed...
James_Buist As a variant, here's a couple of SUMIFS options:
=LET(
arr, LAMBDA(n, CHOOSE(n, K10#, SEQUENCE(, COLUMNS(M2#)))),
MAP(arr({1}), arr({2}), LAMBDA(r,c, SUMIFS(INDEX(M2#,, c), K2#, r)))
)
-OR-
=DROP(REDUCE(0, SEQUENCE(COLUMNS(M2#)), LAMBDA(p,c, HSTACK(p, SUMIFS(INDEX(M2#,, c), K2#, K10#)))),, 1)
The latter should be more efficient because there are fewer iterations, but you probably won't notice a difference on a relatively small dataset.
Having said that, I'd be more interested to see how the actual data source is organized. Obviously, cells K2 and M2 contain formulas that reference another dataset, because you're using the spilled range operator (#). If the data source was arranged vertically in a table with columns for "Division", "Month" and "Amount", you could easily summarize the data using a single SUMIFS formula or a pivot table.
Please see the attached example file, if needed...
Many thanks for these. Brilliant
Firstly, the dataset is just a small subset. Should have mentioned that in the post. Actually there will be up to 200 columns and 10 rows so around 2K cells in all, reading off another dataset of up to 200 columns and maybe 100 rows. The main data set to sumif by column.
I actually prefer the first answer just because I do find LET really good and easy to break down what is happening – I change the outputs to subparts to analyse. Also I already tried this solution with MAP but failed so was sort of on the right track with one approach. Love the clever use of Choose as a function within.
I like the REDUCE solution too but I haven’t totally got my head around the more complex Scan and Reduce use cases – but need to. I had also tried a solution with REDUCE but couldn’t get it to work. But I think it is harder to break down and analyse than the LET & MAP option and I need other third parties to be able to follow this. Thus I feel that explaining the Let solution is simpler. But I will also try with a much larger dataset and see what performance differences there are.
Finally, thanks also for the SUMIFS solution. I had also tried expanding out the array as an option but couldn’t figure out that either. The use of TOCOL with the If({1}) is really cool as it coerces the small array into the larger one. I couldn’t figure out how to do this. Again, I could wrap this in a LET function to do it all in one though would have to use SUMPRODUCT instead of SUMIFS as it would be reading from a DA rather than a range
Sorted me out and was also very educational
Many thanks