How does the Central Limit Theorem help solve problems related to skewed sample means?


How does taking a larger number of samples solve the issue of a skewed sample mean? How does the Central Limit Theorem help here?


The Central Limit Theorem (CLT) is, roughly, the following statement:

Regardless of the distribution of our data, if we take a large number of samples of a fixed (sufficiently large) size and plot the sample statistic we care about (e.g. the mean), the distribution of the results will be roughly normal, i.e. a bell curve.

Note: The distribution that we get from plotting our sample statistics is called a sampling distribution.

We can verify this claim experimentally by playing with the applet in this lesson.
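We can also check the claim without the applet. Here is a quick sketch using Python's standard library, with an exponential population as an assumed stand-in for skewed data: the sample means pile up symmetrically around the population mean, even though the population itself is strongly skewed.

```python
import random
import statistics

random.seed(0)

# A skewed population: exponential with rate 1, whose true mean is 1.0.
population_mean = 1.0

# Take 2000 samples of size 50 and record each sample's mean.
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(50))
    for _ in range(2000)
]

center = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)

# If the sampling distribution is roughly normal, it is centered near the
# population mean and about 68% of sample means fall within one standard
# deviation of the center.
within_one_sd = sum(abs(m - center) < spread for m in sample_means) / len(sample_means)
print(round(center, 2), round(within_one_sd, 2))
```

Despite the skew of the underlying data, the fraction within one standard deviation comes out close to the 68% you would expect from a bell curve.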

Okay. So how does this help us solve the issue of a skewed sample? The CLT helps because the mean of our sampling distribution gets arbitrarily close to the mean of our original distribution as we take more and more samples. This is great because we usually don't know the mean of the original distribution. The CLT gives us a mathematical assurance that we can estimate it from samples (whose means we can compute directly).

In conclusion, if we have a skewed sample mean, by

  1. taking a larger number of samples,
  2. plotting the mean of each sample, and
  3. taking the mean, call it M, of the resulting distribution

M is likely to be close to the population mean by the Central Limit Theorem.
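The three steps above can be sketched as a short simulation. This is an illustration only: it assumes a skewed exponential population whose true mean (2.0) we happen to know, so we can check how close M lands.

```python
import random
import statistics

random.seed(1)

# Assumed example population: exponential with mean 2.0, standing in for
# data whose true mean we normally would not know.
true_mean = 2.0

# Steps 1-3: take many samples, record each sample's mean,
# then take the mean M of the resulting sampling distribution.
sample_means = [
    statistics.mean(random.expovariate(1 / true_mean) for _ in range(30))
    for _ in range(5000)
]
M = statistics.mean(sample_means)

print(round(M, 3))  # M lands close to the population mean of 2.0
```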


Will the sampling distribution of the resulting plot be roughly normal even if the population distribution is not normal? If not, then what does the theorem's "roughly normal" mean?


Regardless of the distribution of the population or of any single sample, we are concerned here with the distribution of a statistic, which can be the mean, standard deviation, etc. If we take sufficiently large samples, each sample's statistic will be close to the value of that statistic computed over the entire population: sometimes a bit less, sometimes a bit more, sometimes approximately equal. Therefore, if we plot a histogram of these sample statistics, it will be centered at the population statistic, with a small standard deviation around it.


With the Central Limit Theorem you group samples and calculate a sum or a mean. The larger the group, the more “normal” the data will look.

The classic example is the dice throw:

  • 1 die: uniform distribution.

  • 2 dice: a peak begins to appear in the middle. If you are adding the results, the most likely sum is 7.

  • 1000 dice: the shape of the distribution is essentially normal.
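A quick simulation of the dice example, sketched with Python's standard library: rolling two dice many times and counting the sums shows the peak in the middle.

```python
import random
from collections import Counter

random.seed(3)

# Roll n dice and sum the results.
def roll_sum(n_dice):
    return sum(random.randint(1, 6) for _ in range(n_dice))

# Tally 20,000 rolls of two dice.
two_dice = Counter(roll_sum(2) for _ in range(20000))

# 7 should come out on top: 6 of the 36 equally likely combinations sum to 7.
most_common_sum, _ = two_dice.most_common(1)[0]
print(most_common_sum)
```

Increasing `n_dice` makes the histogram of sums look more and more like a bell curve, which is exactly the CLT at work.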