Forum Discussion
Tables are fantastic but cumulative totals are a pain
Nice write-up, but it's missing one essential thing:
Readers, be aware that for large data sets, this technique can slow down Excel performance when the sheet is recalculated. In every row, the formula will sum from the first table data row to the current data row. If you only have a few hundred rows, that's still fast. But if the data set goes into the hundreds of thousands of rows, you will notice a performance hit. In each row, Excel has to calculate an ever growing range of cells.
But in each row it also adds numbers that have already been added. This is unnecessary repetition of calculation and very inefficient.
In a table with 500,000 rows there are 499,999 formulas. In row 3 the formula processes 2 cells, in row 4 the formula processes 3 cells, [...], in row 499,999 it processes 499,998 cells, and in row 500,000 the formula processes 499,999 cells. That's a total of 124,999,750,000 cells to process (about 125 billion), and that will take time.
A faster approach is to add the current row's sold units to the previous row's total. That way, each of the 499,999 formulas reads only two cells, about 1 million in total. Or, in a scenario with a debit and a credit contributing to the balance, three cells, which would be about 1.5 million cells to process.
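To make the difference concrete, here is a rough Python sketch of the two approaches. It is not how Excel works internally; it only counts how many cells each style of formula has to read. The function names and the sample data are made up for illustration.

```python
def expanding_range_totals(units):
    """Running total the slow way: re-sum from the first row for every row."""
    totals, reads = [], 0
    for i in range(len(units)):
        window = units[: i + 1]   # first data row through the current row
        reads += len(window)      # every cell in the range is read again
        totals.append(sum(window))
    return totals, reads


def previous_row_totals(units):
    """Running total the fast way: previous total plus the current row."""
    totals, reads = [], 0
    previous = 0                  # the header cell contributes nothing
    for value in units:
        reads += 2                # previous total + current value
        previous += value
        totals.append(previous)
    return totals, reads


units = [5, 3, 8, 2, 7, 4]        # made-up sample data
slow, slow_reads = expanding_range_totals(units)
fast, fast_reads = previous_row_totals(units)
print(slow == fast, slow_reads, fast_reads)

# Cell reads for 499,999 formulas: expanding range vs. previous-row approach
n = 499_999
print(n * (n + 1) // 2, 2 * n)
```

Both functions return the same running totals. Only the number of cells read differs, and it grows with the square of the row count in the first case and linearly in the second.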
Unfortunately, structured referencing does not have a handle for "the previous row", so we still need to fall back on A1 notation to do this. Consider the following screenshot:
The first scenario calculates the running total with this formula starting in E3:
=SUM(E2,D3)
The SUM() function ignores text values in the cells it references, so in the first instance of the formula, E2 (the column title) is ignored and only D3 is added to the total.
The green table has the typical credit/debit pattern with the current balance. Here the formula in J3 is
=SUM(J2,H3,-I3)
Another way to write this formula would be
=N(J2)+H3-I3
The plus and minus operators don't tolerate text and will return an error if any of the cells contain text. The N() function converts a cell value to a number and will return 0 if it contains text, so the formula will not return an error in the first data row, despite using plus and minus operators and J2 referring to the column title, which is text.
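For anyone who wants to see that behaviour spelled out, here is a small Python model of the two formula variants. It is only a sketch under simplifying assumptions: cells hold either numbers or non-numeric text, and `excel_sum`, `excel_n` and `excel_plus` are hypothetical helpers that imitate SUM(), N() and the plus operator for that case only. The amounts are made up.

```python
def excel_sum(*cells):
    """Rough model of SUM() over cell references: text cells are skipped."""
    return sum(c for c in cells if not isinstance(c, str))


def excel_n(cell):
    """Rough model of N(): numbers pass through, text becomes 0."""
    return 0 if isinstance(cell, str) else cell


def excel_plus(a, b):
    """Rough model of the + operator: non-numeric text gives an error."""
    if isinstance(a, str) or isinstance(b, str):
        raise ValueError("#VALUE!")
    return a + b


header = "Balance"                      # text in the cell above the first formula
rows = [(100, 0), (0, 30), (50, 20)]    # (column H, column I), made-up amounts


def balance_with_sum(rows):
    """Model of =SUM(J2,H3,-I3) filled down the column."""
    previous, out = header, []
    for added, subtracted in rows:
        previous = excel_sum(previous, added, -subtracted)
        out.append(previous)
    return out


def balance_with_n(rows):
    """Model of =N(J2)+H3-I3 filled down the column."""
    previous, out = header, []
    for added, subtracted in rows:
        previous = excel_n(previous) + added - subtracted
        out.append(previous)
    return out


print(balance_with_sum(rows))
print(balance_with_n(rows))

# Without N(), the first data row would try to add the header text:
try:
    excel_plus(header, rows[0][0])
except ValueError as err:
    print("error:", err)
```

Both variants produce the same balances. The only row where the choice matters is the first data row, where the cell above holds the column title.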
One drawback of this solution is that when a row is inserted in the middle of the table, the formula in the inserted row will have a wrong reference. Excel picks this up and flags the cell with the green warning triangle, which, when clicked, reports that the formula is inconsistent. The correct formula can quickly be restored by clicking the warning drop-down and selecting "Restore to calculated column formula".
In many situations that approach is faster than waiting for over a hundred billion cells to be processed, but it can be argued that it's easy to miss the warning.