Forum Discussion
Set theory operations with dynamic arrays
I'm two years late to the party, but in case anyone is still watching, I think there's a much easier way to do this, and it does support n-ary union. I assume that sets are already unique, but I don't expect them to be sorted.
Union is the simplest:
=LET(A, B4:B11, B, D4:D8, UNIQUE(VSTACK(A, B)))
Since VSTACK takes multiple arguments, so does this union operation.
Symmetric Difference is next-easiest.
=LET(A, B4:B11, B, D4:D8, UNIQUE(VSTACK(A, B), ,1))
The "really unique" option turns UNIQUE from a natural union to a natural symmetric difference. Note: The multi-way symmetric difference isn't (in my view) terribly useful, and this function does not produce that. This operation seems to be called the "unique union" by at least some people. Anyway, it's very handy.
Set subtraction is a bit more clever. To get A-B
=LET(A, B4:B11, B, D4:D8, UNIQUE(VSTACK(A, B, B), ,1))
This works because doubling B guarantees that none of B will be in the result, but elements of A not in B are unaffected. Or, having a set twice in the unique union obliterates all of its elements, so what's left is the set subtraction. To subtract more sets, just chain them on the end, two at a time. E.g. A, B, B, C, C.
Intersection is the harder one, but it's not that hard.
=LET(A, B4:B11, B, D4:D8, UNIQUE(VSTACK(UNIQUE(VSTACK(A, B)), UNIQUE(VSTACK(A, B), ,1)), ,1))
It simply does a symmetric difference between the union of two sets and the symmetric difference of the two. I keep thinking there's a way to simplify this, but I'm not seeing it. Unfortunately, this doesn't give the n-ary intersection: It just gives elements that appear in at least two sets. Here's an n-ary intersection that uses GROUPBY. You basically dump all the sets together and let groupby count the unique elements.
=LET(n, 3, A, B4:B11, B, C4:C8, C, D4:D11, stack,VSTACK(A, B, C),
set_cnt, GROUPBY(stack, SEQUENCE(ROWS(stack), 1, 0), COUNT, 0, 0),
set, CHOOSECOLS(set_cnt, 1), counts, CHOOSECOLS(set_cnt,2),
x_cnt, FILTER(set_cnt, counts=n), CHOOSECOLS(x_cnt, 1))
This is so ugly I keep thinking there must be a better way, but I'm not seeing it. It's also annoying to have to pass in 3, but there's no way to tell how many sets there were after you've merged them together.