How to calculate Pearson correlation coefficient with 2-property arrays?

Copper Contributor

I have the following data:

        A         B           C
1               Group 1     Group 2
2   Property 1     56         651
3   Property 2     97       1,380

According to the Chi-Square Calculator at Math Is Fun, these data have a Pearson correlation coefficient ("p") of 0.2463. But when I use Excel's function pearson() on them:

=pearson(B2:B3, C2:C3)

I get the result 1. I found an explanation of why this is happening in a StackOverflow answer about a calculation in Python. But that answer seems to be saying that there is no way to get a Pearson of other than 1 with just two property values. But that doesn't seem to be correct, since the calculator linked above is doing it. So it may be that for some reason the function pearson() can't do it, but is there another way in Excel, perhaps using VBA?

6 Replies

@anMSuser 

But I found no one word on Chi-Square that p-value is Pearson correlation calculator. Correlation on this site is calculated here https://www.mathsisfun.com/data/correlation-calculator.html, explanation of it is at https://www.mathsisfun.com/data/correlation-calculator.html

@Sergei Baklan 

The two links in your reply are the same. What was the second one supposed to be?

 

And it appears that I was mistaken that the calculator I was using was doing the Pearson coefficient. But it is doing some other kind of correlation coefficient that works on 2-property arrays, whereas apparently Pearson is always 1 or -1 for 2-property arrays. So my question becomes, is there a way to calculate that other correlation coefficient in Excel?

@anMSuser I get the same value for the chi-sq using the proper methodology and function, attached.

@anMSuser 

Sorry, the second link is https://www.mathsisfun.com/data/correlation.html, it just explains the formula.

 

I'm not a specialists in stats. You may check available functions here https://www.excelfunctions.net/excel-statistical-functions.html and here https://support.office.com/en-us/article/statistical-functions-reference-624dac86-a375-4435-bc25-76d... if something works with you.

Okay, I figured it out. @Sergei Baklan was helpful in showing me that the calculation I need is not the same as the Pearson coefficient calculated by pearson(). @Savia was helpful with his attached spreadsheet that showed that the calculation I need is performed by the function ChiSq.test(). But that function is not helpful because it requires me to have a separate array of expected values.

 

I had already written a VBA function that calculates chi-squared by the method shown at Math is Fun. Now I've found that the calculation I need, the one done by the calculator at Math is Fun, is performed by the Excel function ChiSq.dist.RT(). For the two arguments of that function, the first one is the output of my VBA function (using the method at the first link above) and the second argument is 1 (the degrees of freedom = (rows - 1) * (columns - 1) = 1).

 

I have now confirmed that ChiSq.dist.RT() is producing the correct results. Thank you guys for leading me in a direction that allowed me to figure that out.

@anMSuser 

Good to know you found the solution, thank you for the update