Forum Discussion
tw1211
Sep 27, 2023 · Copper Contributor
Transposing data related to same ID in row
Hello, I have a table where column A is the ID and column B is the date for the measurements in column C. I would like to transpose rows sharing the same ID into columns, where column A would be the ID,...
PeterBartholomew1
Sep 28, 2023 · Silver Contributor
Reduce, the new workhorse!
Yes, but ...
Although REDUCE does not suffer from the 'array of arrays' problem that cripples SCAN, MAP, BYROW and BYCOL, it does have performance limitations when used with VSTACK to gather prior results. I have recently found that recursive bisection performs better, despite being even more convoluted than usual in terms of its logic!
Worksheet formula
= LET(
uID, SORT(UNIQUE(Table1[ID])),
HSTACK(uID, BMAPλ(uID, Pivotλ))
)
where the Lambda helper function BMAPλ is given by
BMAPλ
= LAMBDA(X, Fnλ,
    LET(
        n, ROWS(X),
        Y, IF(
            n > 1,
            LET(
                ℓ, n - QUOTIENT(n, 2),
                X₁, TAKE(X, ℓ),
                X₂, DROP(X, ℓ),
                Y₁, BMAPλ(X₁, Fnλ),
                Y₂, BMAPλ(X₂, Fnλ),
                IFERROR(VSTACK(Y₁, Y₂), "")
            ),
            Fnλ(X)
        ),
        Y
    )
)
and
Pivotλ
= LAMBDA(v, TOROW(FILTER(Table1[[nscore_date]:[nscore]], Table1[ID] = v)))
What it does is bisect the list of IDs until only one is left, and it returns the result associated with that ID. It then evaluates the result for the other ID of the final pair and stacks the two. It repeats with the next pair of IDs, and so on back up the calling tree.
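To make the recursion concrete, here is a small Python model of the BMAPλ bisection (an illustration I am adding, not part of the thread; the original is an Excel LAMBDA). It maps a function over a list by splitting it in half and recursing, rather than by a left-to-right fold:

```python
def bmap(xs, fn):
    """Map fn over xs by recursive bisection, mirroring BMAPλ."""
    n = len(xs)
    if n == 1:
        return [fn(xs[0])]            # base case: a single ID remains
    half = n - n // 2                 # ℓ = n - QUOTIENT(n, 2), i.e. ceil(n/2)
    left, right = xs[:half], xs[half:]
    # each half is mapped independently, then the two results are stacked
    return bmap(left, fn) + bmap(right, fn)

print(bmap([1, 2, 3, 4], lambda v: v * v))  # → [1, 4, 9, 16]
```

The list concatenation here plays the role of VSTACK: it only ever combines the results of the two halves, never a growing accumulator with one new row.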
Once one is into thousands of rows the performance differences can be large. For example, I had to rework one solution because REDUCE was taking over 3½ minutes to evaluate; with BMAPλ this fell to under a second.
Patrick2788
Sep 28, 2023 · Silver Contributor
This is certainly a creative workaround! I can see where calculation speed is much better than REDUCE/VSTACK. If I'm reading this correctly, you're significantly reducing the number of times VSTACK is called by bisecting the data.
If there are 4 unique IDs like in the example provided, REDUCE/VSTACK is run 4x.
With your workaround, VSTACK is run 2 times?
- PeterBartholomew1 · Sep 28, 2023 · Silver Contributor
I don't think I am winning in terms of the number of VSTACK operations. Where I gain is on the typical size of the operands. For REDUCE, half the final stack (i.e. ½N) is typical for the first operand. For bisection it is log₂N (after all, half the operations only involve two records). I think it is the amount of memory set aside for the result of each VSTACK operation and the time taken to copy records to the fresh memory that brings everything to a standstill.
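That operand-size argument can be checked with a rough copy-cost model (my own sketch, not from the thread): if every VSTACK copies all of its input rows into fresh memory, sequential REDUCE/VSTACK copies the whole accumulator at each step, roughly N²/2 row copies in total, while bisection copies each record only about log₂N times:

```python
def reduce_copies(n):
    """Rows copied by a sequential REDUCE/VSTACK over n records:
    step k stacks a k-row accumulator with 1 new row (k + 1 copies)."""
    return sum(k + 1 for k in range(1, n))

def bisect_copies(n):
    """Rows copied by recursive bisection: the final VSTACK copies all
    n rows, plus the work done inside each half."""
    if n <= 1:
        return 0
    half = n - n // 2
    return n + bisect_copies(half) + bisect_copies(n - half)

print(bisect_copies(8))       # → 24, i.e. n·log₂(n) for a power of two
print(reduce_copies(1000))    # ~N²/2 row copies
print(bisect_copies(1000))    # ~N·log₂(N) row copies
```

For N = 1000 the sequential fold performs on the order of 500,000 row copies against roughly 10,000 for bisection, which is consistent with the minutes-to-under-a-second difference reported above.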
I had been pretty much ready to write recursion off as a novelty, replaced by helper functions that are more efficient as well as easier to use. I think this puts a somewhat esoteric form of recursion back into play!