What is a Standard Deviation?


Standard Deviation is a term used in Statistics to tell how much a set of values is dispersed.

The Standard Deviation is the square root of the Variance, which measures how far a set of numbers is spread out from the mean value.

To obtain these values, let’s take a step-by-step example. Let’s say we had some values representing heights of objects in centimeters.
heights = [120, 130, 180, 120, 100]

First, we calculate the mean, or average, of this data set.

mean = sum(heights) / len(heights)
# 650 / 5
# The mean is 130

Next, we take the difference of each element from this mean, square each difference, and add up the squares.

# First, get difference of each value from the mean
# 120-130=-10, 130-130=0, ...
differences = [-10, 0, 50, -10, -30]

# Next, square each difference 
squared_differences = [100, 0, 2500, 100, 900]

# Finally, get the sum of these squared differences
sum_of_squared_differences = sum(squared_differences) # 3600
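The lists above were written out by hand. As a rough sketch, the same three steps could be computed directly from the data with list comprehensions (the variable names simply mirror the ones used above):

```python
heights = [120, 130, 180, 120, 100]
mean = sum(heights) / len(heights)

# Difference of each value from the mean
differences = [h - mean for h in heights]

# Square each difference
squared_differences = [d ** 2 for d in differences]

# Sum of the squared differences
sum_of_squared_differences = sum(squared_differences)

print(differences)
print(squared_differences)
print(sum_of_squared_differences)
```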

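Finally, we divide this sum of squared differences by the number of elements to get the Variance (strictly speaking, the population variance; the sample variance divides by one less than the number of elements). The Standard Deviation is then the square root of this value.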

import math

variance = sum_of_squared_differences / len(heights)
# The variance is 720

standard_deviation = math.sqrt(variance)
# 26.83
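As a cross-check, Python's standard `statistics` module can do the same calculation. This is only a sketch for comparison: `pvariance` and `pstdev` divide by the number of elements (as in the steps above), while `variance` and `stdev` divide by one less than that, which is the usual choice when the data is a sample from a larger population.

```python
import statistics

heights = [120, 130, 180, 120, 100]

# Population versions: divide by n, matching the walkthrough above
population_variance = statistics.pvariance(heights)
population_sd = statistics.pstdev(heights)

# Sample versions: divide by n - 1
sample_variance = statistics.variance(heights)
sample_sd = statistics.stdev(heights)

print(population_variance, round(population_sd, 2))
print(sample_variance, round(sample_sd, 2))
```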

The Standard Deviation also helps us understand how much of the data falls within given ranges of values. In a normal distribution:

68% of the data lies within 1 Standard Deviation of the mean,
(130 - 26.83, 130 + 26.83)

95% of the data lies within 2 Standard Deviations of the mean,
(130 - (26.83 * 2), 130 + (26.83 * 2))

and approximately 99.7% of the data lies within 3 Standard Deviations of the mean,
(130 - (26.83 * 3), 130 + (26.83 * 3))
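To make those ranges concrete, here is a small sketch that computes each interval for the heights example and counts how many of the five values land inside it. The helper names `interval` and `count_inside` are made up for this illustration.

```python
import math

heights = [120, 130, 180, 120, 100]
mean = sum(heights) / len(heights)
sd = math.sqrt(sum((h - mean) ** 2 for h in heights) / len(heights))

def interval(k):
    """Return the (low, high) range within k standard deviations of the mean."""
    return (mean - k * sd, mean + k * sd)

def count_inside(k):
    """Count how many heights fall inside the k-standard-deviation range."""
    low, high = interval(k)
    return sum(low <= h <= high for h in heights)

for k in (1, 2, 3):
    low, high = interval(k)
    print(f"within {k} SD: ({low:.2f}, {high:.2f}) "
          f"holds {count_inside(k)} of {len(heights)} heights")
```

With only five values, 3 of 5 (60%) fall within 1 Standard Deviation, so the 68% figure should be read as a property of the normal distribution rather than of this tiny data set.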


Beg to question the terminology. In programming, a set is discrete. A sample space, on the other hand, need not be discrete at all, or sorted. It’s from this that we get frequency tables, medians (sorted), modes, mins, maxes and means.

Thanks for pointing that out! Didn’t mean to specify it as an actual set of discrete items as used in mathematics, so it was a miswording on my part.


That’s cool! Hope I haven’t trampled…

Not at all! Any feedback and insight on the posts are much appreciated.


To really get a picture of what standard deviation is, we need to examine the curve that it is directly related to, the standard normal curve. We know that any function or relation can be graphed, and this one looks like a bell; ergo, it is commonly known as the bell curve.

The x-axis is arbitrarily broken into eight segments, but as this is theoretical, the last segment is a limit. It can never equal or be less than negative four, and it can never equal or be greater than four. If we call this arbitrary value z, we can write,

{z | -4 < z < 4; z is Real}

When we line up our sorted sample in a row, each value corresponds to a z-score, which in turn corresponds to a point on the curve (on either side of the mean).
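For anyone following along with the heights example from the original post, a z-score is just a value's distance from the mean measured in standard deviations. A minimal sketch, assuming the population standard deviation is used:

```python
import math

heights = [120, 130, 180, 120, 100]
mean = sum(heights) / len(heights)
sd = math.sqrt(sum((h - mean) ** 2 for h in heights) / len(heights))

# z-score: how many standard deviations a value sits from the mean
sorted_heights = sorted(heights)
z_scores = [(h - mean) / sd for h in sorted_heights]

for h, z in zip(sorted_heights, z_scores):
    print(f"{h}: z = {z:+.2f}")
```

Negative scores sit to the left of the mean on the curve, positive scores to the right, and the value equal to the mean gets a z-score of zero.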

The bell curve is how a lot of universities grade. Pity the ones who end up ranked less than the peers they are equals of.


How exactly were the percentages 68, 95, and 99.7 calculated? Are those specific to this example, or do they apply to all cases in general? In other words… does 68% of the data lie within 1 Standard Deviation of the mean no matter what set of data you use?

Under the standard normal curve, we compute the area of each segment, given that the total area is 100%. There are four segments of equal width on each side of the mean. The leftmost and rightmost segments are very small (have very low y values), so together they make up about 0.3% of the total area. Moving toward the mean, the next two segments combined make up about 4.7% of the area. The next two segments combined account for about 27%. The two segments on either side of the mean account for about 68% of the area. These percentages hold for any normally distributed data, not just this example, but they do not carry over to data that follows some other distribution.
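If you would like to see those areas come out of a calculation, here is a rough sketch using `NormalDist` from Python's standard `statistics` module (available since Python 3.8). The helper name `area_within` is made up for this illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1
snc = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the curve between -k and +k standard deviations."""
    return snc.cdf(k) - snc.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.2%}")

# Area of the pair of segments between 1 and 2 SD, and between 2 and 3 SD
print(f"1 to 2 SD (both sides): {area_within(2) - area_within(1):.2%}")
print(f"2 to 3 SD (both sides): {area_within(3) - area_within(2):.2%}")
```

The 68, 95 and 99.7 figures are rounded versions of these areas, which is why the segment percentages quoted in this thread are approximate.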

Illustrations of the SNC will never have the number 4 since it is a limit and therefore asymptotic. The normal curve cannot technically cross the x-axis. And since the net total area under the sections of the curve between ~-4 and -3, and between 3 and ~4, is so small, it can be disregarded as far as statistics is concerned. We’re talking about a probability curve, not a train schedule.
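A quick numeric check of how small those tails are, again only as a sketch with `NormalDist` from the standard `statistics` module. It also shows the curve's height at z = 4 is tiny but still above zero, so it approaches the x-axis without touching it.

```python
from statistics import NormalDist

snc = NormalDist()  # defaults to mean 0, standard deviation 1

# Combined area in both tails beyond 3 and beyond 4 standard deviations
tail_beyond_3 = 2 * snc.cdf(-3)
tail_beyond_4 = 2 * snc.cdf(-4)

# Height of the curve at z = 4
height_at_4 = snc.pdf(4)

print(f"area beyond 3 SD: {tail_beyond_3:.4%}")
print(f"area beyond 4 SD: {tail_beyond_4:.4%}")
print(f"curve height at z = 4: {height_at_4:.6f}")
```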