Forum Discussion

Wyn Hopkins's avatar
Sep 13, 2016

Tables are fantastic but cumulative totals are a pain

Cumulative Totals in Tables

Excel Tables have too many benefits to list here, if you've never used them then check out my article here

Tables = Ctrl +T

 

One frustration I do have though is that cumulative totals in a column are not straightforward.

 

In "normal" Excel to get a running total you just add the cell above to the cell on the current row, and that's it.

 

For a table, with it's structured references and headings this proves problematic.

 

Also, ideally you want a running total that works when you insert rows into the Table or add lines of data to the bottom of it.

 

*** Update 6/12/2019***

 

My preferred approach is now shown in a post further down the responses

 

https://techcommunity.microsoft.com/gxcuf89792/board/message?board.id=ExcelGeneral&message.id=45961#M45961

 

and I've attached a demo file here

*******************************************************************

 

 

This was my original post...

 

So here are 2 options, and both make use of the INDEX function.

INDEX has several uses and one of them is to make a dynamic range.

This is a very odd concept when you first see it written

 

e.g. =SUM(A1:A10) can be re-written with an INDEX formula as

 

= SUM( INDEX(A:A , 1) : INDEX(A:A , 10) )

 

Essentially it's saying SUM everything between the 1st cell of A:A and the 10th cell of A:A

 

Stick with me, like I say, I know it's unfamiliar and a little weird.

 

Let's apply this to a Table that has weekly units sold for January and February

To keep things simple I've set it so that we've sold 1 unit a week.

 

So the formula we put in the Cumulative Sum column is

 

=SUM( INDEX( [Units Sold] , 1) : [@[Units Sold]])

 

So it adds up the values from row 1 of the Units Sold column as far as each row (that's where the @ [Units Sold] comes in)

 

I'll agree it's not a very attractive formula but it works and I wish there was a simpler way.

 

Importantly it works if you insert rows or add more data to the bottom of the Table. Brilliant!

 

Option 2 : A little more advanced

 

If you've stuck with me this far then it's worth reading this extra bit...

 

The SUM approach has a drawback.

 

If you filter the table for February you still get the YTD cumulative rather than just the February Cumulative

 

An alternative is the little known AGGREGATE function (you could also use SUBTOTAL).

 

=AGGREGATE( 9, 7, INDEX( [Units Sold] , 1) : [@[Units Sold]] ) )

 

The 9 means SUM

 

The 7 means ignore Hidden Rows AND Errors

 

So when the data is filtered the rows are hidden, and the Aggregate works nicely. Even if one of your values was an Error e.g. #VALUE or DIV/0 the cumulative would ignore it and the total still works.

 

This ignoring of elements can come in really useful, but just be careful that you really do want to do this! Spotting errors are normally a good thing.

 

Any suggestions of alternative approaches would be appreciated.

 

 

Hope you find this useful

 

Wyn

My other Linked In articles

 

AMAZING EXCEL SOLUTIONS
www.accessanalytic.com.au
  • Nice write-up, but it's missing one essential thing:

     

    Readers, be aware that for large data sets, this technique can slow down Excel performance when the sheet is recalculated. In every row, the formula will sum from the first table data row to the current data row. If you only have a few hundred rows, that's still fast. But if the data set goes into the hundreds of thousands of rows, you will notice a performance hit. In each row, Excel has to calculate an ever growing range of cells.

     

    But in each row it also adds numbers that have already been added. This is unnecessary repetition of calculation and very inefficient.

     

    In a table with 500,000 rows there are 499,990 formulas. In row 3 the formula processes 2 cells, in row 4 the formula processes 3 cells, [...], in row 499,999 it processes 499,998 cells, and in row 500,000 the formula processes 499,999 cells. That's a total of 125,000,249,999 to process, and that will take time.

     

    A faster approach can be realised by adding just the previous row's total with the current row's sold units. That way, each of the 499,999 formulas calculates only two cells, 1 Million in total. Or, in a scenario with a debit and credit contributing to the balance, three cells, which would be 1,5 million cells to process.

     

    Unfortunately, structured referencing does not have a handle for "the previous row", so we still need to revert to A1 notation to do this. Consider the following screenshot:

     

    The first scenario calculates the running total with this formula starting in E3:

     

    =SUM(E2,D3)

     

    The Sum()function ignores text values, so in the first instance of the formula, E2 is ignored and D3 is added to the total.

     

    The green table has the typical credit/debit pattern with the current balance. Here the formula in J3 is

     

    =SUM(J2,H3,-I3)

     

    Another way to write this formula would be

     

    =N(J2)+H3-I3

     

    The plus and minus operators don't tolerate text and will return an error if any of the cells contain text. The N() function converts a cell value to a number and will return 0 if it contains text, so the formula will not return an error in the first data row, despite using plus and minus operators and J2 referring to the column title, which is text.

     

    One drawback with this solution is that when a row is inserted in the middle of the table, the formula in the inserted row will have a wrong reference. But Excel picks this up and sets the green warning triangle, which when clicked informs that the formula is inconsistent. A correct formula can quickly be restored by clicking the warning drop-down and selecting "Restore to calculated column formula".

     

    In many situations that approach is faster than waiting for over a billion cells to be processed, but it can be argued that it's easy to miss the warning. 

  • Fair point @Ingeborg. An alternative for bigger data sets that can handle inserted / deleted rows would be this: I selected cell A2 and then defined the name "RowAbove" as = N(A1) Then my cumulative formula in the Table can be set as RowAbove + [@Units Sold]
    • PeterBartholomew1's avatar
      PeterBartholomew1
      Silver Contributor

      You can also do the same thing with an old-style CSE array formula.  By defining an offset range 'PriorBalance' to do the job of your relative reference 'RowAbove', one can use formulas such 

      = IF( Initialise?, CarriedForward, PriorBalance ) + Credit

      for accumulation.  It is robust to any drag-and-drop that may be used to adjust the credit amounts and no one is going to insert a row through it!  Such formulas are out of spec for a modern dynamic arrays though.  Neither exist within the Table though.

       

      p.s. 'Initialise?' simply holds the first row test

      = PriorBalance ="Balance"

  • Wyn Hopkins 

    This is great stuff!

    I have been struggling with references in tables to cells in previous rows. I have always used normal cell reference and that worked because in those particular cases I do not need to insert rows in the middle of the table or even sort the table. This tip of yours will be very useful in the future.

    Thank you for sharing this!

  • Wyn Hopkins

    I guess this article is one that will just run and run despite its original date!

    A solution I adopted was to define a named reference 'prior' that refers to

    =Sheet1!R[-1]

    in R1C1 notation (i.e. the row above).

     

    The accumulation may then be expressed using Range intersection

    = [@amount] + IFERROR(prior [balance], 0)

    where the error traps the NULL reference that results from applying the formula to the first row.

     

    To accumulate over a filtered set, one could introduce a helper column 'selected?' to indicate whether the row is filtered or not (using a form of COUNT)

    =SUBTOTAL(103,[@amount])

    The accumulation then becomes

    = IF([selected?], [@amount], 0) + IFERROR(prior [balance], 0)

     

    There remains the problem that a Table is designed to hold an unsorted list whereas accumulation is essentially an array operation (integration) and requires the terms to appear in the correct order.  To 'harden' the calculation in the presence of Sort is another challenge since 'prior' then has to pick  out the record that immediately precedes the  current record in terms of date, not position.

    • Celia_Alves's avatar
      Celia_Alves
      MVP

      PeterBartholomew1 

      Thanks a lot for sharing this. It will be of great help in the future. Some of the work that I do includes calculators for financial models. I tend to use tables, not because I need to filter or sort, but mostly because formulas will make much more sense. Very often I need to refer to cells in the previous rows of the table. Next time, I will use the tricks that you both shared. ;-)

    • SergeiBaklan's avatar
      SergeiBaklan
      MVP

      PeterBartholomew1 , I didn't catch why do you need helper column. IMHO,

      = SUBTOTAL(4,[@amount]) + IFERROR(prior [balance], 0)

      shall work (or 9, 109, etc)

       

      • PeterBartholomew1's avatar
        PeterBartholomew1
        Silver Contributor

        SergeiBaklan 

        Agreed.  It was just the way I developed the solution.  I should have reduced it to one of your suggested formulas rather just than settling on the first solution that worked. :-(

    • Celia_Alves's avatar
      Celia_Alves
      MVP

      Hi, PeterBartholomew1 and @Wyn Hopkins!

      Could you please explain the steps in more detail?

      When I try to set the name prior as referring to =Sheet1!R[-1], I get an error message saying that the formula is not valid.

      I tried =Sheet1!R[-1]C but I got the same error.

      If I set up the name by selecting A2 and referring to A1 (no dollar sign), the formula is accepted. But I cannot get it to work with the table field names as prior [balance]. It will only work if I use the name alone (no table field associated) to refer to the previous row on the same column.

       

      Any clarification/help would be very much appreciated. 

      Thanks! :)

       

       

      • Wyn Hopkins's avatar
        Wyn Hopkins
        MVP

        Celia_Alves 

         

         

         

        With the R[-1] method I may be wrong but I think you have to go to Options > Formulas and set it to R1C1 method

Resources