Forum Discussion
Mysterious bug on excel calculation, on a vlookup
- Jun 03, 2021
Glarrain wrote: ``A7 and A8 looks like a 100% like the one on A11 and A12``
And looks can be deceiving.
The problem has nothing to do with errant characters (which would result in #VALUE errors, anyway), and using the Excel VALUE or VLOOKUP(...,TRUE) function might work-around the problem only by coincidence.
Instead, this is just yet-another example of binary (floating-point) arithmetic anomalies.
The work-around is: whenever you expect a calculation to be accurate to some number of decimal places, explicitly round to that number of decimal places -- and not to arbitrary number, as some people suggest. For whole percentage values, round to 2 decimal places (because 12% is 0.12). For percentage values that should be accurate to 2 percentage decimal places, round to 4 decimal places (because 12.34% is 0.1234).
It appears that the values in A7 and A8 had been calculated (somewhere), then copy-and-pasted-value into A7 and A8. And the calculated values are infinitesimally smaller (-1.11E-16) than how the values appear, even when formatted to display 15 significant digits. We can see such differences with formulas of the form =SUM(A7,-(A7&"")) formatted as Scientific or General.
(That works because the expression A7&"" returns the value formatted with up to 15 significant digits. That is Excel's arbitrary formatting limit. It is not the limit of numeric precision, as many documents state incorrectly.)
Additionally, VLOOKUP is among the many Excel functions that compare exact binary values, not the values arbitrarily rounded to up to 15 significant digits, as compare operators ("=", ">=", etc) and the *IF[S] functions do (COUNTIF, SUMIFS, etc).
Since the value in A7 is infinitesimally less than the value in A4 (exactly 100%), VLOOKUP(...,TRUE) returns the value in column B (B3) that corresponds to the largest value less than 100%, namely 95% in A3.
Since the (same) value in A8 is infinitesimally different from the value in A4, VLOOKUP(...,FALSE) returns #N/A because it fails to find an exactly binary match.
In both cases, one work-around is to use ROUND(A7,4) and ROUND(A8,4) if you want to allow for accuracy to fractional percentages (2 percentage decimal places), as you format them.
Another solution is to explicitly round the calculations that sourced the values in A7 and A8. You do not show us those formulas.
Neither solution is more "right" or "wrong". The choice is yours to make, depending on your intentions.
The real values in A7 and A8 are 0.99999999999999989 and 0.99999999999999989.
The values were probably copied from another source.
Detlef_Lewin and JoeUser2004 , you both indicate that the value is a round off error. how did you determine that? I expanded the number of decimal places and did not see that:
is there some other trick i should be using to check for that?
Furthermore, that doesn't explain why NUMBERVALUE() or --TRIM() applied to that cell would fix that issue since if it really is 0.999999... neither of those functions would change or fix that.
- JoeUser2004Jun 03, 2021Bronze Contributor
(With critical corrections for posterity.)
mtarler wrote: ``that doesn't explain why NUMBERVALUE() or --TRIM() applied to that cell would fix that issue since if it really is 0.999999... neither of those functions would change or fix that``.
Sorry, I overlooked that question.
They work-around the problem because they cause Excel to return the value rounded to 15 significant digits. In this case, 1 - 2^-53 (approx 0.99999999999999989) rounds to exactly 1, which is a binary match for the value in A4 (100%).
But as I said before, that is only by coincidence.
Consider the following example.
Enter =1 + 2^-52 into A4. That might be the result of a calculation that we expected to be exactly 100%.
Then, with =1 - 2^-53 in A7, note that VLOOKUP(NUMBERVALUE(A7),A4,1,FALSE) and VLOOKUP(NUMBERVALUE(A7),A4,1,TRUE) still return #N/A (no match).
The reason is: NUMBERVALUE(A7) rounds A7 to exactly 1 (100%), but it does not match the exact binary value in A4.
In contrast, if we enter exactly 1 into A4, both VLOOKUPs return 1 (match) because the binary values now match.
- mtarlerJun 03, 2021Silver ContributorThank you for your explanations. Yes I finally did go into the XML file and see the actual stored value as 0.9999....98. The round off error was my first thought but naively thought I would be able to see it if i expanded the precision enough. And yes you must be right that both NUBMERVALUE() and --TRIM() must be forcing the value to be read as text and then converted back to a number and result in 1.0 (100%) as the value. That said, VALUE() function does NOT work the same and does not appear to force that conversion/rounding. But I did test and see that even if I type =1.0-1.11E-16 into a cell and then force Excel to show 20+ digits it still displays as 1.00000000... or 100.000000....% but then using those formulas it presents the same problem. I even tried tricks like looking at the cell using =TEXT(1-(1E-20),"0.0000000000000000000000") or =TEXT(1-(1E-20),"0.00000000000000000000000E+00") and excel still refuses to cough up the actual full precision value. I wish there was a way to make excel cough that up without going into the XML...
- JoeUser2004Jun 03, 2021Bronze Contributor
mtarler wrote: ``you must be right [about] NUBMERVALUE() [... but] VALUE() function does NOT work the same and does not appear to force that conversion/rounding``
You are right. I made a lot of typos in that posting: substituting VALUE for NUMBERVALUE; substituting VLOOKUP for MATCH -- or better, using VLOOKUP correctly for comparison; using decimal approximations (1.11E-16) instead of the exact binary values that I intended (2^-53); and intending to use 1+2^-53 ("1+1.11E-16" sic) in A4 instead of 1+2^-52 (approx 1+2.22E-16).
I think you got the gist of it (``you must be right that both NUBMERVALUE() and --TRIM() must be forcing the value to be read as text``). But I might correct my previous posting for posterity.
-----
mtarler wrote: ``I even tried tricks like [...] =TEXT(1-(1E-20),"0.00000000000000000000000E+00") and excel still refuses to cough up the actual full precision value``
What part of ``Excel formats up to 15 significant digits. That is Excel's arbitrary formatting limit`` did you not understand? (wink)
Oh well, with all of my previous typos noted above, perhaps I deserve a little distrust (sigh).
BTW, -1E-20 is too small to make a difference, anyway. Note that MATCH(1-1E-20,{1},0) returns 1, indicating a binary match.
In contrast, MATCH(1-2^-53,{1},0) returns #N/A, indicating no binary match. IOW, -2^-53 makes the intended difference. (But +2^-53 does not.)
- JoeUser2004Jun 03, 2021Bronze Contributor
mtarler wrote: ``you both indicate that the value is a round off error. how did you determine that? I expanded the number of decimal places and did not see that``
As I explained: ``We can see such differences with formulas of the form =SUM(A7,-(A7&"")) formatted as Scientific or General``.
FYI, we cannot always use simply =A7-(A7&"") because of other anomalies that are specific to Excel arithmetic. In particular, sometimes Excel replaces the exact infinitesimal difference of a cell formula with exactly zero, if Excel (arbitrarily) deems that the two operands of the last subtraction are "close enough".
And as I explained, arguably not clearly: Excel formats ``up to 15 significant digits. That is Excel's arbitrary formatting limit. It is not the limit of numeric precision, as many documents state incorrectly.``
That is why we cannot see the infinitesimal differences in some cases, as you demonstrated.
Finally, as I explained (7 min before your posting, so you might have missed it), we can see values formatted with up to __17__ significant digits by looking at the XML worksheet in the "xlsx" file, among other methods.
The "xlsx" and "xlsm" files are zip archives. Copy foobar.xlsx to foobar.zip, then double click foobar.zip in Windows, find the XML worksheet, and open it in Notepad. You will see entries like 0.99999999999999989 surrounded by "v" tags.
The 17-digit decimal value is just yet-another approximation of the binary value. But the IEEE 754 standard specifies that 17 significant digits (rounded) is sufficient to convert between decimal and binary representations with no loss of precision.
(BTW, the IEEE 754 standard __never__ mentions anything about 15 significant digits, contrary to most online documents, from MSFT and others.)