Jan 16 2022 06:37 AM
I obtained a strange result in a simple Excel worksheet.
Can you explain it?
Jan 16 2022 07:04 AM
This is a result of the way Excel handles numbers - see Floating-point arithmetic may give inaccurate results in Excel
The solution is to use the ROUND function.
The numbers used in your calculation have at most 2 decimal places, so you should round the result of the formula to 2 decimal places too: change
=expression
to
=ROUND(expression,2)
Jan 16 2022 11:44 AM - edited Jan 16 2022 03:54 PM
@Florin2185 .... To expand on Hans's response....
The MSFT document that Hans refers to contains many misleading statements. Chief among them is: Excel stores "only 15 significant digits of precision". Wrong!
After all, if that were true, there should be no problem, because your numbers have at most 10 significant digits.
What is true is: most decimal fractions cannot be represented exactly in binary floating-point, so they must be approximated. Moreover, the approximation of a particular decimal fraction might vary, depending on the magnitude of the value. That is why, for example, 10.01 - 10 = 0.01 returns FALSE(!).
However, often, we cannot see the exact decimal representation of the binary approximations because Excel formats (displays) only up to 15 significant digits, rounded, replacing any digits to the right with zeros.
For example, what appears to be 53660196.7 in A25 is actually 53660196.699999988079071044921875.
But the value in A25 seems to be the result of effectively SUM(A15:A24); and the sum of those approximations might differ from the approximation of the constant 53660196.7.
In fact, the exact decimal representation of the binary approximation of the constant 53660196.7 is 53660196.70000000298023223876953125.
And the difference between SUM(A15:A24) and 53660196.7 is indeed -0.0000000149011611938477, which is the value (rounded) that you display in A26.
Aside....
We can calculate the exact difference with an expression like SUM(A5, -53660196.7). We cannot write simply =A25-53660196.7 because in that context (only), Excel misleadingly replaces the actual arithmetic difference with exactly zero.
Also, A25=53660196.7 misleadingly returns TRUE because for the comparison operators (only), Excel rounds each operand internally to 15 significant digits just for the purpose of the comparison. In contrast, note that A25-53660196.7=0 and ISNUMBER(MATCH(53660196.7,A25,0)) correctly return FALSE.
BTW, you will find similar infinitesimal differences for the sums in A4 and A9, but not for the sum in A13 by coincidence.
And FYI, the sums in A13 and A25 exclude the values in A10 and A14. I don't know if that is intentional or a mistake.
Jan 16 2022 09:32 PM
Jan 17 2022 04:59 PM - edited Jan 18 2022 12:38 AM
@Florin2185 wrote: ``I can send you the Excel file itself.``
Got it in a PM. It confirms what Hans and I had explained. In summary, the formulas should be:
A4: =ROUND(SUM(A1:A3), 2)
A9: =ROUND(SUM(A5:A8), 2)
A13: =ROUND(SUM(A11:A12), 2)
A25: =ROUND(SUM(A16:A24), 2)
A26: =ROUND(A4+A9+A10-A13-A14-A15-A25, 2)
Ironically, the infinitesimal result in A26 is because of the unnecessary parentheses around the original formula. If you had written simply =A4+A9+A10-A13-A14-A15-A25, the result would have been exactly zero. But that is not a reliable solution, because it depends on a quirk of Excel that is unpredictable.
Also note that many of the "constants" appear to be the result of calculations that were copy-and-pasted-value, or the equivalent. Their actual binary value is not the same as the binary approximation of the decimal number rounded to 15 significant digits. For example, what appears to be 308635.86 in A1 is actually 308635.86 - 5.82E-11, an infinitesimal difference.
If you want those constants to be accurate to 2 decimal places, it would be prudent to also explicitly round those calculations to 2 decimal places.
Caveat: Someone might sugggest setting the option "Precision as displayed" in order to avoid explicitly rounding formula results. I discourage that for a lot of reasons. But if you choose to expermient, be sure to make a copy of the Excel file before setting that option. Merely setting the option might have unintended consequences that are irrerversible.
See the attached file for more details. Let me know if you would like further explanation.
Jan 18 2022 08:23 AM
Jan 18 2022 12:22 PM - edited Jan 18 2022 01:46 PM
@Florin2185 wrote: ``how can I be sure the result is accurate when I have thousands of data?``
That issue goes beyond the decimal-to-binary issues that we are discussing here. Even when the raw data is presented in decimal (typically, a CSV file), the precision of the data is an individual judgment call.
And BTW, that is one the key reasons why I discourage the use of "Precision as displayed". It might alter the raw data before I have had a chance to apply my own judgment.
There is no consistency, even within the financial industry. For example, for the same security, finance.yahoo.com usually reports historical prices accurate to the cent when we export the data into Excel, whereas schwab.com might report prices accurate to 4 or 6 decimal places.
Occassionally, I see finance.yahoo.com data with more decimal places. If it is of the form 110.019997 or 110.050003, I assume that reflects binary-to-decimal conversion problems of their own, unrelated to Excel. I assume such data is accurate to 2 decimal places. OTOH, if the data is of the form 108.356598, I assume the data is accurate to 4 or 6 decimal places.
Obviously, I cannot completely automate such decisions. Or when I do, I will choose an arbitrary precision for all data, typically 2 or 4 decimal places.
But that is my own "individual judgment call". You might choose differently. There is no "right" or "wrong" answer, IMHO.
(But when the data includes number of shares and market value, I try to chose a precision for unit price and number of shares so that their product equals the market value, rounded to 2 decimal places. )
-----
@Florin2185wrote: ``When Excel computes the mean or variance or correl or covar [....]``
My rule is: I ensure that the raw data is accurate to an arbitrary precision ("judgment call").
But generally, I have no expectations of the precision of such calculations, unless I am presenting them for instructional purposes, and it is important that readers can reproduce my results.
That is, generally, I use the full binary result from such function calculations, unrounded. Often, that is necessary in order for the Excel calculation to match mathematical expectations, as well as we can.