Forum Discussion

James_Buist's avatar
James_Buist
Brass Contributor
Oct 21, 2024

Return of the Corkscrew with Spilled Arrays

 

I'm starting a new topic but do want to refer to PeterBartholomew1 post  on accumulators here

https://techcommunity.microsoft.com/t5/excel/ways-of-performing-accumulation-with-dynamic-arrays/m-p/2329035

This post was in 2021 and things have or may have moved on since then and I'm wondering if there is an easier way to achieve the following.

The Corkscrew. I have seen a number of addins and other 5G functions which do far more than I want. I'm basically looking for a way to break the circularity in a corkscrew and it seems that if I can calculate the top row ie typically the opening balance without referring to the closing balance then all would be good.

 

To add complexity I’m trying to fuse the actuals and forecasts in this single function. I have flags to determine the actuals and can use NOT actuals to get the forecasts (same flags) and can pic up the actual opening balance from the source data easily. As I will calculate all time periods for both I can then just add them (as actuals will be multiplied by the Actual flag and thus zero for forecast periods and vice versa for the forecasts. So I just need to calculate the forecast period which will be the same for the entire array block. The key is. getting the previous value. (for the inflows and outflows as they are already calculated I can just shift their arrays forward I col but adding a column in front to get the previous value so that is easy.

 

I can then get the closing balance by adding the inflows and outflows that would fall in the middle of the corkscrew to this previous value to simulate an opening balance. The inflows and outflows can then be added easily and the closing balance simply summed. So all the clever work would go in the opening balance.

 

The added complexity I have is that I am doing a multi row version. ie there are ‘currently’ 3 entities all with opening balances, all with inflows and outflows and closing balances. I want to calculate these in blocks of these entities – 1 row per entity but in a single dynamic array. But the corkscrew will be made up of 4 dynamic arrays – opening bal, inflows, outflows and closing.

 

If someone has a better logic here then do let me know. I saw Jeff Robsons one from a few years back using sumifs for the middle bit. But again, I want to do these in separate blocks which does add a layer of complexity. The reason is that I can ( have) modelled with multiple entities together rather than separately which means one just needs to extend the blocks (VBA to the rescue as there will be a good number of them) as required and inputs can just be assigned to the respective entity (often locations around the world might just be a tiny rep office with hardly anything going on so no need for a fully blown model but can be detailed separately. I also have the ability to enable or disable entities so they are included or excluded from the calcs.

 

SO back to the problem, Model attached. Blue cells mark the dynamic array function for the block. The inflows and outflows will come from other dynamic arrays. But this idea means that one can expand the middle if the top/bottom are taken care of with regard to the circularity.

 

I’m trying to get something easy to implement ideally in a single function (with or without LET)

  • James_Buist Just for good measure, MMULT can also get the job done with minimal effort.

     

    Closing Balance (cell G22):

    =LET(
        data, HSTACK(F22:F24,G12:M14+G17:M19),
        cId, SEQUENCE(,COLUMNS(data)),
        arr, MMULT(--data,--(DROP(cId,,1)>=TOCOL(cId))),
        VSTACK(arr,BYCOL(arr,SUM))
    )

     

    Cheers!

    • James_Buist's avatar
      James_Buist
      Brass Contributor

      Great. An entire collection. I used the first scan one. I have GroupBy but many don't yet have it but good to start getting some experience with it. The first Scan one is nice and simple to implement and doesn't require any named functions to be created. The second is probably more sophisticated but it can be easier for users not to have named functions unless they are quite simple. I find editing them still a bit of a pain even with the Excel labs editor. SO I tend to write functions using LET and then wrap in lambda and keep an original with indentation etc for easy reading. But there are probably better ways.
      I was also not surprised when you added the MMULT. You are a master with those and they are so slick and fast. I've deployed both in the model and will decide later which to keep.
      Many thanks again for the neat and succinct solution.

       

      PS I meant to add. I just used the Scan function for the Opening Balances as the closing could just be a sum of each part. I deliberately wanted to keep each section separate for added flexibility so I can model in multiple different ways but with the same component parts. Some corkscrews may have 5 or 6 middle sections so this way the function can just be tweaked to accommodate these especially if each block is just a single cell ref with a #.

  • djclements's avatar
    djclements
    Bronze Contributor

    James_Buist Just for good measure, MMULT can also get the job done with minimal effort.

     

    Closing Balance (cell G22):

    =LET(
        data, HSTACK(F22:F24,G12:M14+G17:M19),
        cId, SEQUENCE(,COLUMNS(data)),
        arr, MMULT(--data,--(DROP(cId,,1)>=TOCOL(cId))),
        VSTACK(arr,BYCOL(arr,SUM))
    )

     

    Cheers!

    • PeterBartholomew1's avatar
      PeterBartholomew1
      Silver Contributor

      djclements 

      My recollection is that MMULT used with an upper triangular matrix can be remarkably efficient with SCAN only outperforming it for very large problems.

       

      On a separate issue, I used your IF({1}, ...) broadcast formula recently to build a 2D array of FIFO results to cover multiple companies and products.  The alternative would have been to introduce the additional dimensions as separate nested steps but broadcasting allowed me to introduce the 2D array of 2D arrays with a single application of MAP, plus a sprinkling of thunks.

      FIFO Cost calculation with Let and Lamda Formula. | Chandoo.org Excel Forums - Become Awesome in Excel

      The OP was horrified, but that's life!

       

      James_Buist 

      'actuals and forecasts'

      I suspect that to try to combine 'actuals' (which feed into and sit alongside assumptions) with forecasts is to introduce self-inflicted damage.  By all means bring them together at the end by using HSTACK, as part of the presentation layer, but I wouldn't try to combine the calculations.

       

      A further observation on my approach to the 'array  of array' problem (hopefully a temporary problem, but who knows).  My objective is to create 'universal' approaches that work as the Microsoft helper functions should have worked, and with the computational tricks hidden from the user.  I have also tried to avoid methods that are specific to a particular class of problem.  As an example, it is easier to find approaches for summation, but if multiplication also enters the picture (as for interest payment or other growth assumptions) the problem becomes harder and it can be a case of 'back to the drawing board'!

       

      Please accept my apologies for inflicting my thoughts and opinions on you somewhat uninvited

      • djclements's avatar
        djclements
        Bronze Contributor

        PeterBartholomew1 

        Good stuff! I'm glad to hear you found an application for the broadcasting method. Reception so far has been mixed... some love it and some don't. Most users tend to stick with what's familiar and avoid straying outside their comfort zone.

         

        MMULT works well in many different situations, but it's not without its drawbacks. For starters, it can be difficult to get the x and y axis in the right direction. I've also noticed it has a tendency to fail with the 'Excel ran out of resources' error when working with larger datasets, whereas the newer dynamic array helper functions will finish processing the results, even though they are typically slower.

         

        On a side note, data structure can make or break any spreadsheet. A good structure (normalized data in related tables) can be easy to query, summarize and output as needed, whereas a poor structure (e.g. cross-tabulated data) will usually require more advanced methods to manipulate. Many of these questions/scenarios could potentially be avoided simply by organizing your data in a better structure (although that's just speculation based on the skeleton sample data provided at each turn). 😉

    • James_Buist's avatar
      James_Buist
      Brass Contributor

      djclements 

       

      Just realised I've been a bit of a bozo. I modeled my corkscrew on the calcs for Trade Debtors & Creditors where the middle sections are pre-calculated and nothing to do with the corkscrew. The corkscrew is just to get the output to feed into the balance sheet. BUT..

      There is another type in play whereby the corkscrew is used to get the payment amount. eg VAT

      Usually one would use the opening balance in the month of payment due (using flags) and negate it. So then a mid section becomes dependent on an opening and closing balance. I was a bit taken aback when I hit the first circularity that it blurred my mind to the other situations

       

      Have I missed the point in the two calcs one for opening and one for closing balances? I just used opening as closing could just be a standard addition. Should I be using both as just opening works for  trade debtors and even if I use both, the VAT payment still results in circularity. I'm probably being really dumb but I can't see an easy way around this.

       

      I tried using a variation of the opening balance scan solution but need to reset based on the vat payment period - currently in an array of flags. But I can't integrate that reset into the scan function. I like that concept of the flags especially for things like payment periods. Again not sure how to tackle this. Much prefer to keep the corkscrew in separate components for flexibility but can abandon this if a better way. I think this would then wrap up the corkscrew issue

       

      Sample in 08a attached

      • djclements's avatar
        djclements
        Bronze Contributor

        James_Buist I'm afraid I don't really follow your description. There's a lot of jargon being used that's not part of my vocabulary. Rather than trying to decipher the meaning, I went straight to your sample workbook...

         

        Do the mocked-up totals in both the Opening and Closing Balance sections accurately represent your expected results? If so, the flags don't really seem to be playing a part in any of this... you've got multiple flags with no corresponding VAT payments, and some VAT payments with no flags. There doesn't seem to be any rhyme or reason to it. It would be relatively easy to use the flags as a reset point, but that doesn't seem to be what's actually going on in your sample file. If you could provide some clarity to your mocked-up data, I might have a better chance of determining what needs to be done.

    • James_Buist's avatar
      James_Buist
      Brass Contributor

      djclements 

      Replied to your PM but it said I had used up my quota so it never went through. The teams invite bounced too. I'm only brass so perhaps my PM limit is tiny!! I'm also on imesssage and whatsapp or can send invite to another email address. Not sure when I can pm again but can receive. Zoom is also fine if you have it. time suggested works well

  • James_Buist 

    The basic strategy is to start of by calculating closing balances using SCAN.  The catch here is that you are accumulating arrays rather than scalars, so SCAN needs to be fooled by passing it a thunk vector.  The problem of evaluating the resulting thunk array is not trivial.  Opening balances follow by offsetting the closing balances by one period.  Once you have the opening balances, the rest of the model follows as one simple array calculation.

     

    In some situations, like a revolver, you may know how that you are only accumulating two interacting rows of data and it may be better to use two passes of MAP to evaluate debt and cash rows individually.

     

     

  • joelb95's avatar
    joelb95
    Brass Contributor

    James_Buist 

     

    I really don't understand your question, but I have some sense for the area.

     

    I've made some other complicated posts that touch on this topic, so if you are up for some reading, you can take a look at my serialization to preserve state function which is aimed primarily at reducing the computational demands for independent formulas. If you are happy to have just a single dynamic formula that rolls forward in time, everything comes down to structure.

     

    Basically, you are aggregating data which consists of a few parts - an opening balance, inflows, outflows, and closing balance. What gives you freedom to inject complicated logic for how to change the inflows and outflows is making sure that your data is segregated both from your presentation and your calculations. The primary way that I do that is by having an excel table (ie "list object") that has amounts kept more or less as absolute values in a single column with another column that indicates whether the value is an inflow or outflow for the current purpose. So something like ctrl, date, acct, amt, flow.

     

    I left you with the ability to add an optional filter mask for the transactions so that you can adjust the function relatively easily.

     

    This is the data in "tbl_trans":

     

    ctrldateacctamtflow
    11/1/2021first_account                  100I
    21/15/2021first_account                     22O
    32/2/2021first_account             1,000I
    42/22/2021first_account                     44O
    53/15/2021first_account          10,000I
    63/16/2021first_account                     66O

     

     

     

    This is the output:

     

    accountstart_dateend_date opening_balance  inflows  outflows  ending_balance 
    first_account1/1/20211/31/2021                       -            100.00           22.00             78.00
    first_account2/1/20212/28/2021             78.00      1,000.00           44.00      1,034.00
    first_account3/1/20213/28/2021      1,034.00   10,000.00           66.00   10,968.00

     

     

    getDateMask =
    lambda(
        start_date,
        end_date,
        (tbl_trans[date] <= end_date)*(tbl_trans[date]>=start_date)
    );
    
    getAcctMask = 
    lambda(
        acct_id,
        tbl_trans[acct] = acct_id
    );
    
    totalByflow = 
    lambda(
        flow,
        filter_mask,
        [optional_filter_mask],
        round(
            sum(
                filter(
                    tbl_trans[amt],
                    (
                        (tbl_trans[flow]=flow)*
                        filter_mask*
                        if(isomitted(optional_filter_mask),TRUE,optional_filter_mask)
                    )
                )
            ),
            2
        )
    );
    
    
    sumInflows = 
    lambda(
        date_mask,
        acct_mask,
        [optional_filter_mask], 
        totalByFlow("I", date_mask*acct_mask,optional_filter_mask)
    );
    
    sumOutflows = 
    lambda(
        date_mask,
        acct_mask,
        [optional_filter_mask], 
        totalByFlow("O", date_mask*acct_mask,optional_filter_mask)
    );
    
    getPriorBalance =
    lambda(
        accumulator,
        take(accumulator,-1,-1)
    );
    
    getPriorDate =
    lambda(
        accumulator,
        index(take(accumulator,-1),1,3)
    );
    
    calcPeriod =
    lambda(
        first_date,
        edate(first_date-1,1)
    );
    
    calcClosingBalance = 
    lambda(
        opening_balance,
        inflows,
        outflows,
        [optional_add_outflows_df_FALSE],
        if(
            if(
                isomitted(optional_add_outflows_df_FALSE),
                FALSE,
                optional_add_outflows_df_FALSE
            ),
            opening_balance+outflows-inflows,
            opening_balance+inflows-outflows
        )
    );
    
    calcCurrentPeriodData =
    lambda(
        accumulator,
        acct,
        [optional_start_date],
        [optional_filter_mask],
        [optional_end_date],
        [optional_add_outflows_df_FALSE],
        let(
            opening_balance, getPriorBalance(accumulator),
            start_date, if(isomitted(optional_start_date),getPriorDate(accumulator)+1,optional_start_date),
            end_date, if(isomitted(optional_end_date),calcPeriod(start_date),optional_end_date),
            acct_mask, getAcctMask(acct),
            date_mask, getDateMask(start_date,end_date),
            inflows, sumInflows(date_mask, acct_mask, optional_filter_mask),
            outflows, sumOutflows(date_mask, acct_mask, optional_filter_mask),
            closing_balance, calcClosingBalance(opening_balance, inflows, outflows, optional_add_outflows_df_FALSE),
            result,
                hstack(acct,start_date, end_date, opening_balance, inflows, outflows, closing_balance
            ),
            result
        )
    );
    
    running_balance_heading =
        hstack("account","start_date", "end_date", "opening_balance", "inflows","outflows","ending_balance");
    
    makeAcctActivityReport =
    lambda(
        start_date,
        initial_balance,
        acct,
        [optional_total_periods_df_3],
        let(
            accumulator, hstack(expand(hstack("","",start_date-1),1,columns(running_balance_heading)-1,""),initial_balance),
            periods, if(isomitted(optional_total_periods_df_3),3,optional_total_periods_df_3),
            result,
                reduce(
                    accumulator,
                    sequence(periods),
                    lambda(
                        acc,
                        period,
                        vstack(acc, calcCurrentPeriodData(acc,acct))
                    )
            ),
            vstack(running_balance_heading,drop(result,1))
        )
    );

     

     

    Then you make a few helper formulas depending on your needs.

     

     

Resources