What is a Standard Deviation?
Standard Deviation is a term used in Statistics to tell how much a set of values is dispersed.
The Standard Deviation is the square root of the Variance, which measures how far a set of numbers is spread out from its mean value.
To obtain these values, let’s take a step-by-step example. Let’s say we had some values representing heights of objects in centimeters.
heights = [120, 130, 180, 120, 100]
First, we calculate the mean, or average, of this data set.
mean = sum(heights) / len(heights)  # 650 / 5, so the mean is 130
Next, we take the difference of each element from this mean, square each difference, and sum the results.
# First, get the difference of each value from the mean
# 120-130=-10, 130-130=0, ...
differences = [-10, 0, 50, -10, -30]
# Next, square each difference
squared_differences = [100, 0, 2500, 100, 900]
# Finally, get the sum of these squared differences
sum_of_squared_differences = sum(squared_differences)  # 3600
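Finally, we divide this sum of squared differences by the number of elements to get the Variance. (This is the population variance; the sample variance divides by one less than the number of elements instead.) The Standard Deviation is then the square root of this value.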
import math

variance = sum_of_squared_differences / 5  # The variance is 720
standard_deviation = math.sqrt(variance)  # 26.83
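The steps above can be collected into one small function. This is a minimal sketch rather than a definitive implementation: the name population_std_dev is made up for illustration, and it assumes the population form of the calculation (dividing by the number of elements), as in the walkthrough. The result is compared against statistics.pstdev from Python's standard library, which computes the same quantity.

```python
import math
import statistics

heights = [120, 130, 180, 120, 100]


def population_std_dev(values):
    """Population standard deviation: divide by N, not N - 1.

    Hypothetical helper name, used here only for illustration.
    """
    mean = sum(values) / len(values)
    # Squared distance of every value from the mean
    squared_differences = [(v - mean) ** 2 for v in values]
    variance = sum(squared_differences) / len(values)
    return math.sqrt(variance)


std_dev = population_std_dev(heights)
print(round(std_dev, 2))  # 26.83

# Cross-check against the standard library's population standard deviation
print(math.isclose(std_dev, statistics.pstdev(heights)))  # True
```

If the heights were a sample drawn from a larger group, statistics.stdev (which divides by N - 1) would be the usual choice instead.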
The Standard Deviation also helps us understand how much of the data falls within given ranges around the mean. In a normal distribution, approximately:
68% of the data lies within 1 Standard Deviation of the mean,
(130 - 26.83, 130 + 26.83)
95% of the data lies within 2 Standard Deviations of the mean,
(130 - (26.83 * 2), 130 + (26.83 * 2))
99.7% of the data lies within 3 Standard Deviations of the mean.
(130 - (26.83 * 3), 130 + (26.83 * 3))
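To see the actual bounds of these three ranges, the arithmetic can be sketched in a few lines. The helper name sd_range is hypothetical, and the snippet assumes the rounded values from the example (a mean of 130 and a Standard Deviation of 26.83).

```python
mean = 130
standard_deviation = 26.83  # rounded value from the example above


def sd_range(mean, sd, k):
    """Return the (lower, upper) bounds lying k standard deviations from the mean."""
    return (mean - k * sd, mean + k * sd)


for k in (1, 2, 3):
    lower, upper = sd_range(mean, standard_deviation, k)
    print(f"within {k} SD: ({lower:.2f}, {upper:.2f})")

# within 1 SD: (103.17, 156.83)
# within 2 SD: (76.34, 183.66)
# within 3 SD: (49.51, 210.49)
```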