# Question

Would you please provide a visual explanation of what the 1 sample t-test, `ttest_1samp(exampleDistribution, expectedMean)`, is doing with its arguments?

Relating the behavior of our 1 sample t-test back to this applet, think of `ttest_1samp` as first taking the list of values, `exampleDistribution`, and turning that into a distribution. Once we have that distribution, we can compute a mean. You can think of this as one of the two distributions in the applet. Next, form a perfect normal distribution with mean `expectedMean` and a fixed standard deviation equal to that of the first distribution; this is your second distribution in the applet. The 1 sample t-test is then computing the exact p-value, or something similar, that we see at the bottom of the applet.

Unfortunately, the applet is not working on my computer: changing values does nothing, and I'm not sure why. (I'm running an up-to-date Chrome browser on a 2018 MacBook Pro.)

Anyway, I can't seem to understand how it is possible to provide a p-value without having a parameter that specifies a population size.

As I had understood it, the confidence in a null-hypothesis for a given sample depends on the population size considered, right? So if my sample is n=5 but my population is n=5, the probability of a null-hypothesis happening should be 1, right? If the population goes up, the null-hypothesis goes down.

In the function `ttest_1samp`, there is no parameter for population size, so I can't see how it works.

Any help would be appreciated.

As far as I understand, scipy handles this for you "in the background". It takes your list of values (observations) and derives a t-distribution (Student's distribution) from it. For instance, the number of observations is needed to determine the degrees of freedom (the shape) of the t-distribution. Once you have the shape of the t-distribution, you can compare the sample mean with your expected mean (the null value). Knowing the "location" of both means, a confidence interval can be derived. When your expected mean (null value) falls within the 95% confidence interval (which corresponds to a p-value greater than 5%), you have no grounds to reject your null hypothesis.
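To make the point about sample size concrete, here is a minimal sketch (the sample values and `expectedMean` are made up for illustration). It suggests that only the sample itself enters the calculation, through its mean, standard deviation, and size `n`; no population size is needed. It also checks the link described above: the expected mean lies inside the 95% confidence interval when the two-sided p-value is above 0.05.

```python
import numpy as np
from scipy import stats

# Hypothetical observations and null value, chosen only for illustration
exampleDistribution = np.array([4.8, 5.1, 5.3, 4.9, 5.6, 5.2, 5.0, 5.4])
expectedMean = 5.0

n = len(exampleDistribution)   # sample size (NOT a population size)
df = n - 1                     # degrees of freedom of the t-distribution
sample_mean = exampleDistribution.mean()
# Standard error of the mean, using the sample standard deviation (ddof=1)
sem = exampleDistribution.std(ddof=1) / np.sqrt(n)

# 95% confidence interval for the population mean
t_crit = stats.t.ppf(0.975, df)
ci_low = sample_mean - t_crit * sem
ci_high = sample_mean + t_crit * sem

# Two-sided p-value from scipy's one-sample t-test
_, p_value = stats.ttest_1samp(exampleDistribution, expectedMean)

inside_ci = ci_low <= expectedMean <= ci_high
print("95% CI:", (ci_low, ci_high))
print("expectedMean inside CI:", inside_ci)
print("p-value above 0.05:", p_value > 0.05)
```

The last two printed values should always agree, whatever sample you plug in (apart from the edge case where the expected mean sits exactly on the interval boundary).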

I know, this is probably not the best explanation (and potentially flawed), but I am still trying to wrap my head around this as well.

I still can't understand this. A suggestion to Codecademy: please provide us with more practice and real-life examples.

I've learned a bit about hypothesis testing in the last few days, so I would like to share what I understand about the t-test. If I make any mistakes, please let me know.

The 1-sample t-test `ttest_1samp(exampleDistribution, expectedMean)` is used to test whether the null hypothesis

The population mean equals `expectedMean`.

is rejected or not. To that end, this function calculates the following statistic, called the t-statistic:

```
T = (X - expectedMean) / (S / np.sqrt(n))
```

Here, `X`, `S`, and `n` represent the sample mean, the sample standard deviation, and the sample size, respectively.

If the null hypothesis is true (and the population is approximately normal), `T` follows a probability distribution called the t-distribution, with `n - 1` degrees of freedom.

Now consider the actual sample `exampleDistribution` we observed. Calculate the sample mean `x = np.mean(exampleDistribution)`, the sample standard deviation `s = np.std(exampleDistribution, ddof=1)`, and the sample size `n = len(exampleDistribution)` of `exampleDistribution`. So we get the t-statistic `t` of this sample:

```
t = (x - expectedMean) / (s / np.sqrt(n))
```
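As a quick check of the calculation above, here is a small sketch that computes `t` by hand and compares it with the statistic returned by `scipy.stats.ttest_1samp`. The sample values and `expectedMean` are hypothetical, picked only so the example runs.

```python
import numpy as np
from scipy import stats

# Hypothetical sample and null-hypothesis mean, for illustration only
exampleDistribution = [4.8, 5.1, 5.3, 4.9, 5.6, 5.2, 5.0, 5.4]
expectedMean = 5.0

x = np.mean(exampleDistribution)          # sample mean
s = np.std(exampleDistribution, ddof=1)   # sample standard deviation (divides by n - 1)
n = len(exampleDistribution)              # sample size

# t-statistic computed by hand, following the formula above
t_manual = (x - expectedMean) / (s / np.sqrt(n))

# t-statistic computed by scipy
t_scipy, _ = stats.ttest_1samp(exampleDistribution, expectedMean)

print("manual:", t_manual)
print("scipy: ", t_scipy)
```

The two printed values should match up to floating-point rounding. Note that `ddof=1` matters here: with NumPy's default `ddof=0`, the hand-computed value would not agree with scipy's.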

The p-value is the conditional probability that `T` takes the value `t` or a more extreme one, under the assumption that the null hypothesis is true. (Note that it is NOT the "probability that the null hypothesis is true.") For a two-sided test, the p-value is the area under the t-distribution curve in both tails, beyond `-|t|` and `|t|`.
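The tail-area description can be sketched in code as follows. This assumes a two-sided test and uses a made-up sample; `stats.t.sf` is scipy's survival function (one minus the CDF), which gives the area in the upper tail.

```python
import numpy as np
from scipy import stats

# Hypothetical sample and null-hypothesis mean, for illustration only
exampleDistribution = [4.8, 5.1, 5.3, 4.9, 5.6, 5.2, 5.0, 5.4]
expectedMean = 5.0

n = len(exampleDistribution)
x = np.mean(exampleDistribution)
s = np.std(exampleDistribution, ddof=1)
t = (x - expectedMean) / (s / np.sqrt(n))

# Area in the upper tail beyond |t|, doubled to cover both tails
# (the t-distribution is symmetric around zero)
df = n - 1
p_manual = 2 * stats.t.sf(abs(t), df)

# p-value reported by scipy's two-sided one-sample t-test
_, p_scipy = stats.ttest_1samp(exampleDistribution, expectedMean)

print("manual:", p_manual)
print("scipy: ", p_scipy)
```

Both values should agree up to rounding, which suggests that the p-value depends only on `t` and the degrees of freedom `n - 1`.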

If the p-value is lower than the significance level, which is 0.05 here, it means that "such a result would be observed with a probability of less than 5% if the null hypothesis were true." This is reason enough to doubt the null hypothesis, so we reject it. However, the p-value is not zero, so there is a nonzero probability that such a rare observation occurred by chance this time. There is always a risk of a Type I error.

If the p-value is greater than the significance level, we cannot reject the null hypothesis. Note that this does not necessarily mean that the null hypothesis is true. It means, "no evidence has been found to reject the null hypothesis; it may be true or false, we don't know." There is always a risk of a Type II error.
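The risk of a Type I error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal, the null hypothesis is true by construction, and the sample size, standard deviation, and number of trials are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable

expectedMean = 5.0   # the null hypothesis is TRUE in this simulation
n = 20               # hypothetical sample size
trials = 10_000      # number of simulated experiments
alpha = 0.05         # significance level

# Each row is one simulated sample from a normal population whose mean
# really is expectedMean (the standard deviation 2.0 is arbitrary).
samples = rng.normal(loc=expectedMean, scale=2.0, size=(trials, n))

# Run one t-test per row
_, p_values = stats.ttest_1samp(samples, expectedMean, axis=1)

# Fraction of experiments in which a true null hypothesis was rejected
type_1_rate = np.mean(p_values < alpha)
print("Type I error rate:", type_1_rate)
```

The printed rate should come out close to 0.05: even when the null hypothesis holds, roughly one experiment in twenty rejects it at this significance level.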

I encourage you to read @object2161442840's answer because it's well fleshed out, but here's a TL;DR:
The test assumes that the total population is distributed around the expected mean in the same way that the observed sample is.

Once it has assumed the shape and standard deviation of the total population, it can compute the probability of a sample mean falling at least as far from the expected mean as the observed one did.

If you'd like a little more color around it, I recommend the book Naked Statistics; it's a surprisingly pleasant read!
