Forum Discussion
tw1211
Sep 27, 2023 · Copper Contributor
Transposing data related to same ID in row
Hello, I have a table where column A is the ID and column B is the date for the measurements in column C. I would like to transpose rows sharing the same ID into columns, where column A would be the ID,...
PeterBartholomew1
Sep 28, 2023 · Silver Contributor
Reduce, the new workhorse!
Yes, but ...
Although REDUCE does not suffer from the 'array of arrays' problem that cripples SCAN, MAP, BYROW and BYCOL, it does have performance limitations when used with VSTACK to gather prior results. I have recently found that recursive bisection performs better, despite being even more convoluted than usual in terms of its logic!
Worksheet formula
= LET(
uID, SORT(UNIQUE(Table1[ID])),
HSTACK(uID, BMAPλ(uID, Pivotλ))
)
where the Lambda helper function BMAPλ is given by
BMAPλ
= LAMBDA(X, Fnλ,
    LET(
        n, ROWS(X),
        Y, IF(
            n > 1,
            LET(
                ℓ, n - QUOTIENT(n, 2),
                X₁, TAKE(X, ℓ),
                X₂, DROP(X, ℓ),
                Y₁, BMAPλ(X₁, Fnλ),
                Y₂, BMAPλ(X₂, Fnλ),
                IFERROR(VSTACK(Y₁, Y₂), "")
            ),
            Fnλ(X)
        ),
        Y
    )
)
and
Pivotλ
= LAMBDA(v, TOROW(FILTER(Table1[[nscore_date]:[nscore]], Table1[ID] = v)))
What it does is bisect the list of IDs until only one is left, and it returns the result associated with that ID. It then evaluates the result for the other ID of the final pair and stacks the two. It repeats with the next pair of IDs, and so on back up the calling tree.
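To make the recursion concrete, here is a small Python model of the BMAPλ bisection (an illustration I am adding, not part of the thread; the original is an Excel LAMBDA). It maps a function over a list by splitting it in half and recursing, rather than by a left-to-right fold:

```python
def bmap(xs, fn):
    """Map fn over xs by recursive bisection, mirroring BMAPλ."""
    n = len(xs)
    if n == 1:
        return [fn(xs[0])]            # base case: a single ID remains
    half = n - n // 2                 # ℓ = n - QUOTIENT(n, 2), i.e. ceil(n/2)
    left, right = xs[:half], xs[half:]
    # each half is mapped independently, then the two results are stacked
    return bmap(left, fn) + bmap(right, fn)

print(bmap([1, 2, 3, 4], lambda v: v * v))  # → [1, 4, 9, 16]
```

The list concatenation here plays the role of VSTACK: it only ever combines the results of the two halves, never a growing accumulator with one new row.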
Once one is into thousands of rows the performance differences can be large. For example, I had to rework one solution because REDUCE was taking over 3½ minutes to evaluate; with BMAPλ this fell to under a second.
Patrick2788
Sep 28, 2023 · Silver Contributor
This is certainly a creative workaround! I can see where calculation speed is much better than REDUCE/VSTACK. If I'm reading this correctly, you're significantly reducing the number of times VSTACK is called by bisecting the data.
If there are 4 unique IDs like in the example provided, REDUCE/VSTACK is run 4x.
With your workaround, VSTACK is run 2 times?
- PeterBartholomew1 · Sep 28, 2023 · Silver Contributor
I don't think I am winning in terms of the number of VSTACK operations. Where I gain is on the typical size of the operands. For REDUCE, half the final stack (i.e. ½N) is typical for the first operand. For bisection it is log₂N (after all, half the operations only involve two records). I think it is the amount of memory set aside for the result of each VSTACK operation and the time taken to copy records to the fresh memory that brings everything to a standstill.
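That operand-size argument can be checked with a rough copy-cost model (my own sketch, not from the thread): if every VSTACK copies all of its input rows into fresh memory, sequential REDUCE/VSTACK copies the whole accumulator at each step, roughly N²/2 row copies in total, while bisection copies each record only about log₂N times:

```python
def reduce_copies(n):
    """Rows copied by a sequential REDUCE/VSTACK over n records:
    step k stacks a k-row accumulator with 1 new row (k + 1 copies)."""
    return sum(k + 1 for k in range(1, n))

def bisect_copies(n):
    """Rows copied by recursive bisection: the final VSTACK copies all
    n rows, plus the work done inside each half."""
    if n <= 1:
        return 0
    half = n - n // 2
    return n + bisect_copies(half) + bisect_copies(n - half)

print(bisect_copies(8))       # → 24, i.e. n·log₂(n) for a power of two
print(reduce_copies(1000))    # ~N²/2 row copies
print(bisect_copies(1000))    # ~N·log₂(N) row copies
```

For N = 1000 the sequential fold performs on the order of 500,000 row copies against roughly 10,000 for bisection, which is consistent with the minutes-to-under-a-second difference reported above.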
I had been pretty much ready to write recursion off as a novelty, replaced by helper functions that are more efficient as well as easier to use. I think this puts a somewhat esoteric form of recursion back into play!