What is variance, anyway? In simple terms it is a sort of resolution of two vectors, one pointing away from the mean in the negative direction and the other pointing away from the mean in the positive direction. It gives us an idea of how far things spread out around the mean. Variance is the average of the squared deviations. It is, itself, an area.
Variance is never negative (it is zero only when every value equals the mean), since it is a sum of n squares divided by n. In program terms this means we must accumulate a sum, then divide the total by the number of terms.
The distance of each value from the mean is central to this calculation. In the end it gives us a measure of spread, the square of sigma. Once sigma is evaluated, we can mark off intervals of that width both left and right of the mean.
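Written out as a formula, the description above looks like this. The symbols are chosen here for illustration: x_i is one value, x-bar is the mean, and n is the number of values.

```latex
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2,
\qquad
\sigma = \sqrt{\sigma^2}
```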
Z-scores are calculated using sigma (the standard deviation): a z-score is a value's distance from the mean, measured in sigmas. It is these scores we use to calculate the area under the normal curve.
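As a rough sketch of how a z-score leads to an area under the normal curve, something like the following should work. The function names and the numbers (a value of 60, a mean of 50, a sigma of 10) are made up for illustration, and the area is computed from the standard library's math.erf rather than looked up in a table.

```python
import math

def z_score(value, mean, sigma):
    """Distance of value from the mean, measured in standard deviations."""
    return (value - mean) / sigma

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Illustrative numbers only: a value of 60 with mean 50 and sigma 10.
z = z_score(60, 50, 10)
area_left = normal_cdf(z)

print(round(z, 2), round(area_left, 4))  # prints: 1.0 0.8413
```

The area to the right of the value is simply 1 minus the area to the left.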
Say we have a mean of 66 and a sigma of 11 (meaning variance is 121).
-3σ  -2σ  -1σ  mean  +1σ  +2σ  +3σ
 33   44   55   66    77   88   99
If the grades are normally distributed, this tells us that roughly 93% of the class passed the exam (if 50 is a passing grade). In a class of 200, we would expect at most 1 person to score close to 100% (and at most 1 person to score less than 33%). Roughly 30 got better than 77, and about half that many, roughly 15, failed.
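These figures can be checked with a short script. This is only a sketch: it assumes the grades follow a normal distribution exactly, and it uses statistics.NormalDist from the standard library (Python 3.8 or later). The variable names are made up for illustration.

```python
from statistics import NormalDist

# Assumption: grades are normally distributed with mean 66 and sigma 11.
grades = NormalDist(mu=66, sigma=11)
class_size = 200

passed = 1 - grades.cdf(50)     # fraction scoring 50 or better
above_77 = 1 - grades.cdf(77)   # fraction more than one sigma above the mean
above_99 = 1 - grades.cdf(99)   # fraction more than three sigmas above the mean

print(f"passed:   {passed:.1%}")
print(f"above 77: {above_77 * class_size:.1f} students")
print(f"failed:   {(1 - passed) * class_size:.1f} students")
print(f"above 99: {above_99 * class_size:.2f} students")
```

The student counts are expected values, so fractions of a person are normal here. A real class of 200 will scatter around them.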
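Given a sample space N (a list of grades):
    # placeholder function name; grades_average is assumed to return the mean of N
    def grades_variance(N):
        n = len(N)
        x = grades_average(N)
        s = 0
        for h in N:
            # accumulate the squared distance of each value from the mean
            s += float(x - h) ** 2
        # divide the total by the number of terms
        return s / n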
See if you can find your errors by comparing your own code with the above.
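Another way to check a hand-written variance is to compare it against the standard library. The sketch below is self-contained, so it computes the mean inline instead of calling a helper. The function name and the grades are made up for illustration.

```python
import statistics

def population_variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((mean - v) ** 2 for v in values) / n

# Made-up grades, for illustration only.
grades = [55.0, 66.0, 77.0, 60.0, 72.0]

print(population_variance(grades))    # our own calculation
print(statistics.pvariance(grades))   # library version, also divides by n
print(statistics.variance(grades))    # sample variance, divides by n - 1
```

The first two results should agree. The third is larger because statistics.variance divides by n - 1, which is a common source of mismatches when comparing answers.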