SOLVED

Formula to aggregate matrix spilling both rows and columns

Copper Contributor

Hi all,

 

I have an array originating in one cell and spilling both rows and colums:

C5={5,2,1;7,3,4;9,10,8}

 

I also have row and column names that describe its contents:

B5={"A";"A";"B"}

C4={1,2,2}

 

Where both vectors attempt to represent categorical data.

 

What I'd like to get is a new, more aggregated array, where the row and column vectors are grouped by their distinct (unique) elements, and the original array elements added.

 

I could achieve this, for example, by using an old-school array formula (only difference with legacy Excel being I can now enter it without CSE) that I can copy across the output range. See example file.

 

But, I'd like the process to be dynamic so if the original matrix or any of its vectors change, the result will adjust accordingly. I could achieve dynamic row and column totals using the MMULT / SEQUENCE combo, but this is turning out to be very challenging.

Any ideas would be appreciated!

 

Thanks in advance

Diegol

11 Replies

BTW for those interested in getting dynamic row and column totals, see the links below:

https://exceljet.net/formula/get-row-totals

https://exceljet.net/formula/get-column-totals

 

Finally, I got LAMBDA and its helper functions!
Howerver after a few days of practicing, I could not pull out this twofold, group by-like aggregation in a dynamic manner.
I'll keep on trying as the LAMBDA theory continues settling in my head. If anyone gets lucky, please share!

Thanks!

@diegol81 

For such result

image.png

it could be

data = Q!$C$5#;
header = Q!$C$4#;
labels = Q!$B$5#;

uHeader = UNIQUE(header, 1);
uLabels = UNIQUE(labels);

rws = ROWS(ulabels) + 1;
clmns = COLUMNS(uHeader) + 1;

sumData = LAMBDA(lbl, hdr,
    SUM(
        FILTER(
            FILTER(data, (labels = lbl)),
            header = hdr
        )
    )
);

result=
MAKEARRAY(
    rws,
    clmns,
    LAMBDA(r, c,
        IF(
            r = 1,
            IF(c = 1, "", INDEX(uHeader, c - 1)),
            IF(
                c = 1,
                INDEX(uLabels, r - 1),
                sumData(
                    INDEX(uLabels, r - 1),
                    INDEX(uHeader, c - 1)
                )
            )
        )
    )
);

@Sergei Baklan 

 

Many thanks for that! After the first look at your solution, I feel relieved that I wouldn't have been successful, if ever, before several days of goes at it.

 

Your solution is almost exactly what I was looking for. I'd like the labels to be produced by UNIQUEs on ungrouped data, so the formula should only generate the inner {12,10;9,18}. After such a robust starting point I hope I can adjust it myself.

 

Have a nice day!

best response confirmed by Hans Vogelaar (MVP)
Solution

@diegol81 , you are welcome.

 

For data only it's simpler

uHeader = UNIQUE(header, 1);
uLabels = UNIQUE(labels);

onlyData=
MAKEARRAY( ROWS(ulabels), COLUMNS(uHeader),
    LAMBDA( r, c,
        SUM(
            FILTER(
                FILTER(
                    data,
                    (labels=INDEX(uLabels, r))  ),
                header=INDEX(uHeader, c)
            )
        )
    )
);

@Sergei Baklan 

That's just perfect.

Thank you!!!!

 

Regards,

Diego

@Sergei Baklan

 

I geared the solution a bit more towards the shape I want it to have. I ended up with a single name definition, the lambda itself (actually a lambda called within another lambda), that acts a formula (SpillSum as I named it) with 3 arguments: Data, Labels, Headers.

 

The intention is that the formula can be used within any workbook, any number of times, without the need of defining each of the components as a name time and again. Only requisite is to define SpillSum once in each workbook.

 

As you'll see, it very much sticks to the general algorithm you designed. Thanks again!

@diegol81 

IMHO, that's no straightforward solution. As variant you may use AFE Announcing LAMBDAs to Production and Advanced Formula Environment, A Microsoft Garage Project, collect your functions in hidden sheet of some file and use it as template. 

@Sergei Baklan 

 

Thanks for that. Unfortunately the new AFE addin (actually, all the addins accessible via the Insert | Addins | Get addins dialog) download and install seems to have been disabled globally in my company. Hopefully I'll get the admins to enable it.

 

Back to my last post: shortly after positing I realised my solution did not fully resemble the functional structure I was seeking, which is akin to that of a SUMIFS: a data array to add-up; a labels vector to evaluate; a labels condition vector; a (transposed) headers vector to evaluate; a (transposed) headers condition vector. I was missing the two condition arguments, which impaired the formula's functionality.

 

As a result of this review, the LET() call was no longer needed and I renamed the formula SpillSumIfs. Here it goes, hopefully for good.

@diegol81 

That's good, thanks for sharing

1 best response

Accepted Solutions
best response confirmed by Hans Vogelaar (MVP)
Solution

@diegol81 , you are welcome.

 

For data only it's simpler

uHeader = UNIQUE(header, 1);
uLabels = UNIQUE(labels);

onlyData=
MAKEARRAY( ROWS(ulabels), COLUMNS(uHeader),
    LAMBDA( r, c,
        SUM(
            FILTER(
                FILTER(
                    data,
                    (labels=INDEX(uLabels, r))  ),
                header=INDEX(uHeader, c)
            )
        )
    )
);

View solution in original post