Forum Discussion
Return of the Corkscrew with Spilled Arrays
- Oct 23, 2024
James_Buist Just for good measure, MMULT can also get the job done with minimal effort.
Closing Balance (cell G22):
=LET( data, HSTACK(F22:F24,G12:M14+G17:M19), cId, SEQUENCE(,COLUMNS(data)), arr, MMULT(--data,--(DROP(cId,,1)>=TOCOL(cId))), VSTACK(arr,BYCOL(arr,SUM)) )
Cheers!
James_Buist Just for good measure, MMULT can also get the job done with minimal effort.
Closing Balance (cell G22):
=LET(
data, HSTACK(F22:F24,G12:M14+G17:M19),
cId, SEQUENCE(,COLUMNS(data)),
arr, MMULT(--data,--(DROP(cId,,1)>=TOCOL(cId))),
VSTACK(arr,BYCOL(arr,SUM))
)
Cheers!
My recollection is that MMULT used with an upper triangular matrix can be remarkably efficient with SCAN only outperforming it for very large problems.
On a separate issue, I used your IF({1}, ...) broadcast formula recently to build a 2D array of FIFO results to cover multiple companies and products. The alternative would have been to introduce the additional dimensions as separate nested steps but broadcasting allowed me to introduce the 2D array of 2D arrays with a single application of MAP, plus a sprinkling of thunks.
The OP was horrified, but that's life!
'actuals and forecasts'
I suspect that to try to combine 'actuals' (which feed into and sit alongside assumptions) with forecasts is to introduce self-inflicted damage. By all means bring them together at the end by using HSTACK, as part of the presentation layer, but I wouldn't try to combine the calculations.
A further observation on my approach to the 'array of array' problem (hopefully a temporary problem, but who knows). My objective is to create 'universal' approaches that work as the Microsoft helper functions should have worked, and with the computational tricks hidden from the user. I have also tried to avoid methods that are specific to a particular class of problem. As an example, it is easier to find approaches for summation, but if multiplication also enters the picture (as for interest payment or other growth assumptions) the problem becomes harder and it can be a case of 'back to the drawing board'!
Please accept my apologies for inflicting my thoughts and opinions on you somewhat uninvited!
- James_BuistOct 25, 2024Brass ContributorI played around with the options for integrating actuals with forecasts and tweaked my original thought process of having another input to the corkscrew of the actual closing balances. Initially I thought this would be straightforward but of course was wrong. What I ended up doing was to feed in the delta of the closing balances and as actuals would only have values in historic periods and forecasts only have values in forecast periods (due to the flags) it worked out well. This way I can just feed the entire closing balance into the Balance sheet with no further logic.
- James_BuistOct 25, 2024Brass ContributorNo apologies needed. Inflict away. Much better that way. But with regard to combining actual and forecasts, the corkscrew is really only needed for the forecasts other than the most recent closing balance. I haven't fully decided but I may feed the actuals into the middle section just above the closing balance as a line item. Flags control this. Then the CB will be the sum of Inflow, Outflows and Actuals. As there will be no inflows and outflows in the actuals periods that means the CB will be the actuals CB and in the forecast periods, the actuals will be zero so again that will work. So no special calcs needed. With an independent opening balance then the rest of the corkscrew just flows
- djclementsOct 23, 2024Bronze Contributor
Good stuff! I'm glad to hear you found an application for the broadcasting method. Reception so far has been mixed... some love it and some don't. Most users tend to stick with what's familiar and avoid straying outside their comfort zone.
MMULT works well in many different situations, but it's not without its drawbacks. For starters, it can be difficult to get the x and y axis in the right direction. I've also noticed it has a tendency to fail with the 'Excel ran out of resources' error when working with larger datasets, whereas the newer dynamic array helper functions will finish processing the results, even though they are typically slower.
On a side note, data structure can make or break any spreadsheet. A good structure (normalized data in related tables) can be easy to query, summarize and output as needed, whereas a poor structure (e.g. cross-tabulated data) will usually require more advanced methods to manipulate. Many of these questions/scenarios could potentially be avoided simply by organizing your data in a better structure (although that's just speculation based on the skeleton sample data provided at each turn). 😉
- James_BuistOct 24, 2024Brass Contributor
I agree. I am trying to build a new version of an older type of best practices financial model using DAs through out. And they can enable one to skip a few steps and rather than having three sections for each key entry one can reduce it to one. But also, I debated for ages on the handling of multiple entities - for smaller organisations - and decided that there was more flexibility in combining the entities into a single group and doing each operation on the group as a whole rather than having separate sheets for each entity. Often easier to group by part - Revenue, Overheads, Working Capital calcs separated by sheet and do all entities in each group. This way one can just expand the group zones as required. Consolidations by entity have been more of a challenge and those you have hugely helped me with, But input sections and line item processing can be independent of entity and just needs an entity code to enable aggregation or consolidation. As you will have seen from my samples. Clearly I am just giving a small subset of the model to demonstrate the requirements for each post. The flow from inputs to middle tier calcs and the final outputs is one directional unless especially required. ie taking a feed from an output back to a mid tier is avoided even if it may seem more convenient. I hate bi directional references; sheet refs should flow in one direction only between sheets and typically top to bottom on sheets.
However, the grouping of entities has definitely posed a number of additional challenges as often a single row is less complex to calculate using a DA than a full 2D array. At least for me at my level. But I do like the challenge and have and am learning a huge amount from it - PeterBartholomew1Oct 23, 2024Silver Contributor
The data structure arguments may be a bit more nuanced than you suggest. Certainly SUMIFS and friends are blisteringly fast running over huge datasets and picking out relevant values for aggregation. On the other hand, if you know you want every 12th value from the outset, having the data wrapped so that every value you require is to be found in column 12 is both fast and open to visual inspection.
Possibly
= BYCOL(crosstab, SUM)
isn't as fast as
= SUMIFS(tableValues, tableMonths, CalendarMonths)
but it is still pretty effective. What it won't do is process irregular lists.
- djclementsOct 24, 2024Bronze Contributor
Without seeing the source data or knowing the full scope of the project, I can't say for sure how I would go about building the data model. In my experience, though, a normalized data structure is much easier to work with (and can be output in crosstab format when needed). Regarding this particular question, opening and/or closing balances, as well as net transactions for the period, could easily be returned with a simple SUMIFS. Moreover, it would be just as easy to merge actuals with forecasted data using a simple IF statement.
I'm reminded of another thread from a couple months ago, Counting employees during each hour of the day, where data was being entered into a crosstab structure. I had prepared a potential solution at the time, but didn't share it because the OP never responded to my additional inquiries, or to anyone else's suggestions for that matter. While there were many clever and viable solutions presented, they were all far more complex than what would have been needed with a normalized dataset. The "Summary" sheet in the attached file demonstrates how a single COUNTIFS formula was able to get the job done...
Kind regards.