Why does R use population variance (n alone in the denominator) and not sample variance (n-1 in the denominator)? I just completed the “Variance in R” section in the Learn R course. Is there a command I can use that will take the sample variance rather than the population variance?
According to the R documentation of the var() function:
"The denominator $n - 1$ is used which gives an unbiased estimator of the (co)variance for i.i.d. observations. [These functions return] NA when there is only one observation […]."
I am not sure what it says exactly in the course, but it seems like R uses the sample variance by default.
To get the population variance, you would then just multiply the result of the var() function by $\frac{n-1}{n}$.
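As a rough sketch of that conversion (the vector x below is made-up example data, not something from the course), you can compare the two values directly. The factor used is $(n-1)/n$, where n is the number of observations:

```r
# Hypothetical example data, not taken from the lesson
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# Built-in var() divides by n - 1, i.e. it returns the sample variance
sample_var <- var(x)

# Rescaling by (n - 1) / n gives the population variance (denominator n)
pop_var <- sample_var * (n - 1) / n

print(sample_var)
print(pop_var)

# With a single observation the denominator n - 1 is zero, so var() returns NA
single_obs <- var(5)
print(single_obs)
```

For this vector the mean is 5 and the squared deviations sum to 32, so the sample variance is 32/7 and the population variance is 32/8 = 4.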
I hope this helped you, and all the best,
Thank you very much. I see that the var() function gives the sample variance, so the lesson is misleading in saying that the variance() function gives the population variance. Is the variance() function identical to var()?
This is the lesson.
Thank you for sharing the link. I can see that the variance() function is a self-defined function by Codecademy (check line 7), and uses $n$ in the denominator to get the variance of a sample, which is indeed the population variance (note that I use $\bar{x}$ to denote the mean), i.e.
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
You can then get the sample variance from this by multiplying by $\frac{n}{n-1}$ (the inverse of the previous operation).
So the variance() function by Codecademy (using the population variance) and the built-in var() (using the sample variance) are not the same, and if you are in a standard R session, you will only have the latter function available.
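To make the difference concrete, here is a minimal sketch. The variance() defined below is only my own stand-in that divides by n; I am assuming it behaves like the lesson's helper, and the actual Codecademy code may be written differently. The data vector is again made up:

```r
# Hypothetical stand-in for the lesson's helper (an assumption, not Codecademy's code):
# population variance, i.e. the mean of the squared deviations (denominator n)
variance <- function(x) {
  mean((x - mean(x))^2)
}

# Hypothetical example data
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

pop_var <- variance(x)   # denominator n
sample_var <- var(x)     # built-in, denominator n - 1

# Multiplying the population variance by n / (n - 1) recovers the built-in result
recovered <- pop_var * n / (n - 1)
print(isTRUE(all.equal(recovered, sample_var)))
```

In a fresh R session only var() exists, so a helper like this has to be defined before it can be called.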
I hope this helped, and all the best,
Thank you so much! I love R and I find learning it is a great way to prepare for my AP Stats exam! I still wish Codecademy was a bit clearer on this though.
I’m happy that I could help!
I can only agree that R is a great language, and it has great capabilities, so much more than what is shown here on Codecademy (in comparison to a certain competitor)…