Forum Discussion
Excel 2007 average discrepency
Re: ``All the cells were filled with numbers & when I did the above all the results were the same. But when I started deleting values in some of the cells the 3 results began to differ``
That is right. And the issue has nothing to do with Excel. It has to do with mathematics.
In general, the average of averages of the parts is not the same as the average of the whole. And the correct answer is the average of the whole.
Alternatively, we can calculate a weighted average, which is the sum of each partial average times a weighting factor. Each weighting factor is the number of values in the part divided by the number of values in the whole.
The exception is: when each part has the same number of values, then the average of the partial averages equals the average of the whole.
Math and algebra might not be your thing. But perhaps the following will make sense to you, anyway.
Consider two partial averages: (10+20+30)/3 and (40+50)/2.
The average of the two partial averages is: ( (10+20+30)/3 + (40+50)/2 ) / 2
That reduces to ( 2*(10+20+30) + 3*(40+50) ) / 12, where 12 = (3*2)*2.
In contrast, the average of the whole is (10+20+30+40+50)/5.
If the difference is not apparent, calculate these two values:
=AVERAGE(AVERAGE(10,20,30), AVERAGE(40,50))
which is the same as
=SUM( SUM(10,20,30)/3, SUM(40,50)/2 ) / 2
versus
=AVERAGE(10,20,30,40,50)
which is the same as
=SUM(10,20,30,40,50)/5
- BettypinApr 13, 2021Copper Contributor
JoeUser2004 Thanks for the reply. Even though it's been 50 years since my last algebra class your answer was very clear & understandable. I Googled "average of averages" & found some fascinating websites that dealt with this subject & learned about Simpson's Paradox. I was pretty sure it wasn't a problem with Excel but because of another problem with blank cells I wasn't positive. When I first d/l the data in a spreadsheet & sort from high to low by any column that has blank cells it always puts the blank cells in the top so I have to block mark the empty cells & hit delete. Then when I re-sort the high row will be at the top & the blank cells at the bottom as should be. Try as I might I can't find a way to figure out what character is in the cells so I can remove them from the entire sheet all at once.
Thank you once again
- JoeUser2004Apr 13, 2021Bronze Contributor
Re: ``When I first d/l the data in a spreadsheet & sort from high to low by any column that has blank cells it always puts the blank cells in the top [....] can't find a way to figure out what character is in the cells so I can remove them``
I suspect the "blank" (i.e. blank-appearing) cells contain null strings or non-breaking spaces (ASCII 160; HTML nbsp) that you might have copy-and-pasted-value.
Confirm that formulas of the form =ISTEXT(A1) return TRUE.
If those are the only text in the range, you can delete them with the following procedure.
1. Select the range.
2. Press function key f5 (Go To), and click Special.
3. Select Constants, and deselect Numbers, Logicals and Errors. That is, select only Text.
4. Click OK.
5. Confirm that only the blank-appearing cells are selected. If so, press Delete.Caveat: This runs the risk of deleting legitimate text inadvertently. It would be prudent to make a backup copy of the file first, so that you have something to fall back to if and when you discover any mistaken deletions when it is too late to undo (press ctrl+z).