Forum Discussion
Your thoughts on serializing arrays to manage state in Excel (by way of inventory tracking)
Hi Peter,
Thanks for replying. I've attached a primitive example for you - at this point I haven't actually implemented it in a meaningful data set (though I have demonstrated it works across thousands of lines of data).
The motivation is overcoming a limitation of Excel: every cell must return a value, i.e. you can't return a thunk or any other fancy container/wrapper to a cell and then reference that cell for its value. Serializing to a string would, theoretically, allow you to persist intermediate calculations in involved formulas without having to solve the array-of-arrays problem (or resort to the Name Manager to name intermediate calculation results).
Obviously some refinement is needed (such as better row/column delimiters), and if taken seriously, data compression techniques could be applied automatically (removing all but the most essential data and encoding what remains as compactly as is reasonable). Done correctly, with about 30k characters available per cell, you can save quite a lot of information in one of Excel's most efficient data structures (strings are stored as whole, discrete values and retrieved with a single lookup key).
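Outside Excel, the round trip could be sketched in Python along these lines. The delimiters, field layout, and function names are illustrative assumptions, not the actual implementation:

```python
# Sketch: serialize a 2D array (a list of rows) into a single delimited
# string, mimicking packing state into one cell, then parse it back.
# "|" separates fields and ";" separates rows -- illustrative choices only;
# a real implementation would pick delimiters guaranteed not to appear in
# the data, or escape them.

def serialize(rows):
    """Flatten a 2D array into one string (the persisted 'state cell')."""
    return ";".join("|".join(str(v) for v in row) for row in rows)

def deserialize(text):
    """Recover the 2D array of strings from the serialized state."""
    return [row.split("|") for row in text.split(";")]

state = [["IBM", "LOT-1", "100", "1500.00"],
         ["IBM", "LOT-2", "50", "800.00"]]

packed = serialize(state)   # one string, well under a ~30k-char budget
restored = deserialize(packed)
```

In Excel terms this is roughly the job TEXTJOIN does on the way in and TEXTSPLIT does on the way out; the Python version just makes the round trip easy to test.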
A bigger application of the idea would be assigning an individual lot number to every inflow of a security and being able to update that lot's quantity and remaining cost basis according to whatever method the consuming formula wants: specific identification this time, FIFO that time, LIFO another, cost averaging another still. Because the entire lot state is persisted, a single outflow of securities can relatively simply impact all affected lots, whether that be one or a thousand. For instance, if you are using an average method over 15 lots of IBM shares, one formula could search the prior state for IBM lots and adjust each to its new remaining shares/value on a pro-rata basis. Because the state was persisted, updating the lots is essentially a simple filter over one table that consists exclusively of the updated information for each remaining lot.
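The average-method case above can be sketched as a pure function from old lot state to new lot state. The record layout, field names, and `apply_average_outflow` are hypothetical, chosen just to show the pro-rata adjustment:

```python
# Sketch: spread one outflow across every lot of a ticker on a pro-rata
# (average-cost) basis, returning the updated lot table. The dict layout
# ('ticker', 'lot', 'qty', 'basis') is an assumption for illustration.

def apply_average_outflow(lots, ticker, shares_out):
    """Return a new lot list with shares_out removed pro rata from `ticker`."""
    matching = [l for l in lots if l["ticker"] == ticker]
    total = sum(l["qty"] for l in matching)
    if shares_out > total:
        raise ValueError("not enough shares to sell")
    updated = []
    for l in lots:
        if l["ticker"] != ticker:
            updated.append(dict(l))       # other tickers pass through untouched
            continue
        sold = shares_out * l["qty"] / total   # this lot's share of the outflow
        updated.append({
            "ticker": l["ticker"],
            "lot": l["lot"],
            "qty": l["qty"] - sold,
            "basis": l["basis"] * (1 - sold / l["qty"]),  # reduce basis pro rata
        })
    return updated

lots = [
    {"ticker": "IBM", "lot": 1, "qty": 100.0, "basis": 1500.0},
    {"ticker": "IBM", "lot": 2, "qty": 50.0, "basis": 800.0},
]
new_state = apply_average_outflow(lots, "IBM", 30.0)
```

Swapping the allocation rule (FIFO, LIFO, specific ID) would only change how `sold` is computed per lot; the persist-then-filter shape of the update stays the same, which is the point of keeping the whole state around.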
Even if this type of persistence were limited to, say, once every thousand rows, with the in-between steps requiring full recalculation/accumulation of everything since the last persisted state, surely figuring out the state after 1,000 transactions is better than figuring it out after 10,000.
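That checkpoint-and-replay trade-off could be sketched like this. The transition function, the checkpoint interval, and the toy running-total state are all assumptions made for the illustration:

```python
# Sketch: persist full state only every N transactions ("checkpoints"),
# then answer "state after transaction k" by replaying at most N-1 steps
# from the nearest prior checkpoint instead of all k from the start.

CHECKPOINT_EVERY = 1000

def apply(state, txn):
    # Toy transition: state is a running total, txn a signed quantity.
    # In the inventory case this would be the full lot-table update.
    return state + txn

def build_checkpoints(txns, every=CHECKPOINT_EVERY):
    """Map index -> state after that many transactions, every `every` steps."""
    checkpoints, state = {0: 0}, 0
    for i, t in enumerate(txns, start=1):
        state = apply(state, t)
        if i % every == 0:
            checkpoints[i] = state
    return checkpoints

def state_at(txns, k, checkpoints, every=CHECKPOINT_EVERY):
    """State after transaction k, replaying only from the nearest checkpoint."""
    base = (k // every) * every
    state = checkpoints[base]
    for t in txns[base:k]:
        state = apply(state, t)
    return state

txns = [1] * 10_000
cps = build_checkpoints(txns)
answer = state_at(txns, 5_432, cps)   # replays only 432 steps, not 5,432
```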
Your serialization appears to be well implemented and works. The methods we are developing are far removed from one another and the manner in which they scale is likely to be different. I tend to use thunks and perform entire calculations as one. I haven't given much thought to the way in which I could separate out 'historical' data to leave a smaller dynamic formula to bring the calculation up to date. Perhaps I should!
The idea of serialising the data appears to be considerably more complicated than assigning a field to the stats from each lot, but it should be more dynamic when new lots are introduced.
I have included the type of calculations I would turn to if faced with a problem such as the one you describe, but it is so different that I doubt it will be of much value to you.