Forum Discussion
JennyLMace
Mar 31, 2024 | Copper Contributor
#NUM! error with chi-squared test statistic
Dear All, I have been following the instructions on this YouTube video for executing the chi-squared test in Excel. It has been working well the first few times I've used it; however, now I ha...
JennyLMace
Mar 31, 2024 | Copper Contributor
Thanks, Joe! But this doesn't seem to be an option available to me - neither in this reply nor if I 'edit' the original message... Many thanks 🙂
JoeUser2004
Mar 31, 2024 | Bronze Contributor
I still think it would be prudent for you to provide the Excel file, working around the "stoopid" forum limitation as I explained previously.
But I might have an answer for you without it. You might confirm it on your own.
Previously, I wrote: "format G20 as Scientific to see the [actual] p-value. 0.00E+00 is exactly zero. CHISQ.INV.RT [with 8 degrees of freedom] can work with p-values as low as 1E-60. But your p-value shouldn't be that low".
Apparently, that last assertion is wrong. But the key is indeed to format the p-value as Scientific, not Number, in order to see its actual value.
Using the video that you cited (which is very good, BTW), I generated random data that actually has a strong dependency on having a college education. My p-value is about 3E-110 (!).
So the p-value certainly can be "that low".
Nevertheless, CHISQ.INV.RT works in my case, probably because the degrees of freedom is 2, not 8.
But my point is: in your case, if there is a strong dependency among the data, perhaps the p-value is indeed so small that CHISQ.INV.RT will return #NUM.
Off-hand, I cannot say whether that should be the case. Perhaps not mathematically. But perhaps it is a numerical computational limitation.
Again, without the Excel file to look at, I can only speculate.
- JennyLMace | Mar 31, 2024 | Copper Contributor
Thanks so much for your enthusiasm to help, Joe! Here is a copy of the content in a file: https://www.dropbox.com/scl/fi/6g7d2fhu2lw8aiznyhfef/Copy-of-NUM-problem.xlsx?rlkey=31le8cnmar61hav2v9l9u4n0l&dl=0
Let me know if you have any trouble accessing this; I've shared on Dropbox before, so I'm hoping you'll access it smoothly. In any case, I can indeed confirm that the p-value equates to zero when switching to the scientific format!
I look forward to receiving your insights. I'm only on the second of many such tests, so if I'm experiencing this problem now, I fear it is likely to happen again. Maybe in SPSS, I wouldn't have this issue?
Many thanks indeed (I'll pick up any replies tomorrow now),
Jenny
- JoeUser2004 | Apr 01, 2024 | Bronze Contributor
I was right: chisq.test returns exactly zero for the p-value because it encounters a numerical computation limitation. But that should not affect your interpretation with respect to the null hypothesis.
And I suspect that SPSS would encounter the same numerical limitation, since most applications use the same internal numerical representation (64-bit binary floating-point).
On the other hand, the #NUM error from chisq.inv.rt seems to be due to an arbitrary limitation of the internal algorithm. SPSS might behave differently.
But since we have the actual and expected data, we can calculate the chisq statistic directly with the following formula in G21:
=SUMPRODUCT( (G4:I8 - G12:I16)^2 / G12:I16 )
That returns 1453.99071.
Aside.... You can replace SUMPRODUCT with SUM in "dynamic-array aware" versions of Excel. That might be clearer.
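For anyone who wants to cross-check the SUMPRODUCT result outside Excel, here is a minimal Python sketch of the same sum of (actual - expected)^2 / expected. The 5x3 table below is made-up illustration data, not the contents of G4:I8, and the function names are my own. The expected counts are assumed to come from the usual row total x column total / grand total rule.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over every cell."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 5x3 table of counts (8 degrees of freedom), for illustration only.
observed = [
    [120, 30, 10],
    [80, 60, 20],
    [40, 90, 30],
    [20, 70, 70],
    [10, 30, 120],
]
expected = expected_counts(observed)
print(chi_square_statistic(observed, expected))
```

With the real data in place of the made-up table, the printed value should agree with the G21 formula to within rounding.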
-----
TMI.... With the chisq statistic in G21, we could calculate the p-value in G20 with the formula =CHISQ.DIST.RT(G21, 8). Of course, we might as well use CHISQ.TEST. But the CHISQ.DIST.RT formula gives us insight into the numerical limitation.
As we arbitrarily increase the chisq statistic from 1452, we see that the p-value gets progressively smaller until it reaches 2.22508E-308 when the chisq statistic is 1452.74596. 2.22507E-308 (2^-1022) is the smallest positive normalized number that we can represent with 64-bit binary floating-point. (But Excel does not allow us to enter 2.22507E-308 manually, an arbitrary limitation.)
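The underflow point can be checked independently of Excel. For an even number of degrees of freedom, the right-tail chi-squared probability has a standard closed form: exp(-x/2) times the sum of (x/2)^i / i! for i from 0 to df/2 - 1. The Python sketch below evaluates its logarithm so that nothing underflows. This is only an illustration of the arithmetic, not a claim about how CHISQ.DIST.RT is implemented; the function name is my own.

```python
import math
import sys


def log_chisq_sf_even_df(x, df):
    """Natural log of the right-tail chi-squared probability, even df only.

    Uses the closed form exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    if x < 0:
        raise ValueError("x must be non-negative")
    half = x / 2.0
    term = 1.0   # (x/2)^i / i!, starting at i = 0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return -half + math.log(total)


# Smallest positive normalized 64-bit float, 2**-1022.
log_smallest_normal = math.log(sys.float_info.min)

for stat in (1452.0, 1452.74596, 1453.99071):
    log_p = log_chisq_sf_even_df(stat, 8)
    # Print the statistic, log10 of the p-value, and whether the p-value
    # falls below the normalized range.
    print(stat, log_p / math.log(10), log_p < log_smallest_normal)
```

Working in logs shows that the p-value for 1453.99071 is mathematically positive but lies below 2^-1022, which is consistent with CHISQ.TEST displaying exactly zero.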
Likewise, we find that 4.24870E-63 is the smallest p-value that works with =CHISQ.INV.RT(G20, 8). With that arbitrary p-value in G20, CHISQ.INV.RT returns the chisq statistic 314.0252679. Obviously, that is significantly less than the numerically-limited value of 1452.74596.
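To see that nothing mathematical stops the inversion at such small p-values, here is a rough Python sketch that inverts the right-tail probability by bisection in log space. It relies on the same closed form for even degrees of freedom, and both function names are my own. It is not Excel's algorithm, which I can only guess at; it just shows that a chisq statistic exists for p-values far below 4.24870E-63.

```python
import math


def log_chisq_sf_even_df(x, df):
    """Natural log of the right-tail chi-squared probability, even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return -half + math.log(total)


def chisq_inv_rt_even_df(p, df, iterations=200):
    """Find x whose right-tail probability is p, by bisection on log(p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    target = math.log(p)
    lo, hi = 0.0, 1.0
    # Grow the upper bound until its tail probability drops below p.
    while log_chisq_sf_even_df(hi, df) > target:
        hi *= 2.0
    # The log tail probability decreases as x grows, so bisection applies.
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if log_chisq_sf_even_df(mid, df) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(chisq_inv_rt_even_df(4.24870e-63, 8))  # compare with 314.0252679 above
print(chisq_inv_rt_even_df(1e-300, 8))       # a p-value far below that limit
```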
- JennyLMace | Apr 01, 2024 | Copper Contributor
Hi Joe,
Many thanks once again!
I realised one can run chi-squared tests in SPSS on uncoded 'word' data, i.e. string data as SPSS calls it. So, I thought I'd quickly import my data to SPSS to see if a similar error message occurred. I'm pleased to report it did not, so I guess I'll continue on SPSS. Interestingly, the chi-squared statistic that SPSS returns is 1462.463 (vs 1453.99071 as per your equation above). I thought you might be interested to know this. Obviously they are quite similar but nevertheless different...
Best wishes,
Jenny