What is a Standard Deviation?


#1

Question

What is a Standard Deviation?

Answer

Standard Deviation is a term used in Statistics to tell how much a set of values is dispersed.

The value of the Standard Deviation is equal to the square root of the Variance, which is used to tell how much a set of numbers are spread out from the mean value.

To obtain these values, let’s take a step-by-step example. Let’s say we had some values representing heights of objects in centimeters.
heights = [120, 130, 180, 120, 100]

First, we calculate the mean, or average, of this data set.

mean = sum(heights) / 5
# 650 / 5
# The mean is 130 

Next, we must get the “difference of each element from this mean, square each difference, and obtain the sum”.

# First, get difference of each value from the mean
# 120-130=-10, 130-130=0, ...
differences = [-10, 0, 50, -10, -30]

# Next, square each difference 
squared_differences = [100, 0, 2500, 100, 900]

# Finally, get the sum of these squared differences
sum_of_squared_differences = sum(squared_differences) # 3600

Finally, we divide this sum of squared differences by the number of elements, to get the Variance. The Standard Deviation is then the square root of this value.

variance = sum_of_squared_differences / 5 
# The variance is 720

standard_deviation = math.sqrt(variance)
# 26.83

What the Standard Deviation also can help us understand is how much data can be seen within ranges of values. In a normal distribution:

68% of the data lies within 1 Standard Deviation of the mean,
(130 - 26.83, 130 + 26.83)

95% of the data lies within 2 Standard Deviations of the mean,
(130 - (26.83 * 2), 130 + (26.83 * 2))

and approximately
99.7% of the data lies within 3 Standard Deviations of the mean.
(130 - (26.83 * 3), 130 + (26.83 * 3))


#2

Beg to question the terminology. In programming a set is discrete. A sample space on the other hand need not be discrete at all, or sorted. It’s from this we get frequency tables, medians (sorted), modes, mins, maxs and means.


#3

Thanks for pointing that out! Didn’t mean to specify it as an actual set of discrete items as used in mathematics, so it was a miswording on my part.


#4

That’s cool!. Hope I haven’t trampled…


#5

Not at all! Any feedback and insight on the posts are much appreciated.


#6

To really get a picture of what standard deviation is, we need to examine the curve that it is directly related to, the standard normal curve. We know that any function or relations can be graphed, and this one looks like a bell; ergo, it is commonly known as the bell curve.

The x-axis is arbitrarily broken into eight segments, but as this is theoretical, the last segment is a limit. It can never equal or be less than negative four, and it can never equal of be greater than four. If we call this arbitary value z we can write,

{z | -4 < z < 4; z is Real}

When we line up our sorted sample in row, they all correspond with a z-score which directly corresponds with a point on the curve (either side of the mean).

The bell curve is how a lot of universities grade. Pity the ones who end up ranked less than the peers they are equals of.