Forum Discussion
Return of the Corkscrew with Spilled Arrays
- Oct 23, 2024
James_Buist Just for good measure, MMULT can also get the job done with minimal effort.
Closing Balance (cell G22):
=LET( data, HSTACK(F22:F24,G12:M14+G17:M19), cId, SEQUENCE(,COLUMNS(data)), arr, MMULT(--data,--(DROP(cId,,1)>=TOCOL(cId))), VSTACK(arr,BYCOL(arr,SUM)) )Cheers!
The data structure arguments may be a bit more nuanced than you suggest. Certainly SUMIFS and friends are blisteringly fast running over huge datasets and picking out relevant values for aggregation. On the other hand, if you know you want every 12th value from the outset, having the data wrapped so that every value you require is to be found in column 12 is both fast and open to visual inspection.
Possibly
= BYCOL(crosstab, SUM)isn't as fast as
= SUMIFS(tableValues, tableMonths, CalendarMonths)but it is still pretty effective. What it won't do is process irregular lists.
Further musings ...
I suspect that array methods provide the best fit for sorted lists such as the time-series data arising in financial models. One thing I did recently was to define a Lambda helper function that sums and array along reverse diagonals.
The operations were to 1. Form the cartesian product of the two arrays (CAPEX and spread); 2. Pad the right-hand side with blanks to avoid the rightmost terms wrapping to the start; 3. wrap row by row to a grid one cell narrower than the source data, to line up the fields to be summed; 4. sum by column.
Even more complicated is FIFO, where I stack the cumulative inflows and outflows and sort the combined dataset. Each batch with its associated inflow, outflow and price is obtained as a difference and can be grouped by outflow.
- James_BuistOct 25, 2024Brass ContributorI'll need to take a look at this when I get a moment. When I first started looking at Dynamic Arrays, I was helping a friend out on his model and he needed it to be dynamic rolling out up to 120 months with a set of staggered inputs. ie outlets coming online over a period of 24 months but a revenue growth model for the outlets based on their starting period. ie Month 1 has 10 new outlets, month 2 10 more but month 3 would have 20 new ones. But in Month 2, the revenue model starts as month 1 and in month 3 also. So I needed this staggered approach and I couldn't figure out how to do this. Then I came upon one of your posts on depreciation which was just what I needed to solve that conundrum. SO your depreciation examples can do far more than just depreciation!!
- PeterBartholomew1Oct 25, 2024Silver Contributor
Gradually the whole basis of my modelling is changing. For example, the 360 periods of the model, illustrated , were created as one dynamic array formula in cell H108
using a formula
=LET( balances, SCAN2Hλ({0; 0}, SeriesData, Modelλ), opening, HSTACK({0; 0}, DROP(balances, , -1)), result, Modelλ(opening, SeriesData), result )As I remember it, the key steps were to create the function Modelλ that, given opening balances, calculates results at the next timestep. The SCAN2Hλ helper function project the cash and revolver balances forward in time (other quantities were calculated but only the balances were returned by the function as an array of Lambda functions (thunks). Once all the opening balances were known (SCAN actually returns closing balances but a simple step to insert the initial balances and trim the final values convert to opening balances) the Modelλ function is called again but, this time with entire arrays of 360 terms to recover all the quantities shown.
The corkscrews are just a presentational devices that have little to do with the calculation. As long as you remains within the single function environment, you have access to all the intermediate calculations by name, and the display, including blank lines between blocks, is created by stacking results that you determine to be of interest to the user. That does, however, limit the output to a single worksheet.
- James_BuistOct 25, 2024Brass ContributorThat is pretty amazing. But I'm moving in smaller steps and taking something that is totally old school and moving it to be fully dynamic but still in parts. I don't think many users are ready for a single sheet formula. They always end up popping bits in between however much one tries to prevent it and of course anything typed in the middle will cause a spill error.
I'm also trying to get bankers to leave vLookup behind forever - never used it myself - horrible function - and start using simple spill functions just for columnar data so there are no more fixed ranged of 10K rows in case the data happens to occupy that. Getting people to use dynamic ranges is a big step in itself!! I show them the performance increases and ease of application. It will take a while..