I’ve learned a bit about hypothesis testing over the last few days, so I would like to share what I understand about the t-test. If I have made any mistakes, please let me know.
The one-sample t-test, ttest_1samp(exampleDistribution, expectedMean), is used to decide whether the null hypothesis “the population mean equals expectedMean” should be rejected. To that end, the function calculates the following value, called the t-statistic:
T = (X - expectedMean) / (S / np.sqrt(n))
Here, X, S, and n represent the sample mean, the sample standard deviation, and the sample size, respectively.
If the null hypothesis is true (and the data come from an approximately normal population), T follows a probability distribution called the t-distribution, with n - 1 degrees of freedom.
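To convince myself of this, I found a small simulation helpful. The following is only a sketch: the population (normal, mean 5.0, standard deviation 2.0), the sample size, the number of trials, and the seed are all values I made up for illustration. It draws many samples for which the null hypothesis is true, computes T for each, and checks how often T lands beyond the two-sided 5% critical value of the t-distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Assumed values, chosen only for illustration.
expectedMean = 5.0   # the null hypothesis is true: this is the real population mean
n = 10               # size of each simulated sample
n_trials = 20_000    # number of simulated samples

# Each row is one sample drawn from a normal population with mean expectedMean.
samples = rng.normal(loc=expectedMean, scale=2.0, size=(n_trials, n))

# t-statistic of every simulated sample, using the formula above.
X = samples.mean(axis=1)
S = samples.std(axis=1, ddof=1)
T = (X - expectedMean) / (S / np.sqrt(n))

# Two-sided 5% critical value of the t-distribution with n - 1 degrees of freedom.
critical = stats.t.ppf(0.975, df=n - 1)

# If T follows that t-distribution, roughly 5% of the values should fall beyond it.
tail_fraction = np.mean(np.abs(T) > critical)
print(f"fraction of |T| beyond {critical:.3f}: {tail_fraction:.4f}")
```

The printed fraction should come out close to 0.05, which is what we would expect if T really follows the t-distribution with n - 1 degrees of freedom.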
Now consider the actual sample exampleDistribution that we observed. Calculate its sample mean x = np.mean(exampleDistribution), its sample standard deviation s = np.std(exampleDistribution, ddof=1), and its sample size n = len(exampleDistribution). This gives the t-statistic t of this sample:
t = (x - expectedMean) / (s / np.sqrt(n))
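As a sanity check, the hand-computed t can be compared with the statistic that scipy.stats.ttest_1samp reports. This is a minimal sketch; the numbers in exampleDistribution and the value of expectedMean are hypothetical, made up just to have something to run.

```python
import numpy as np
from scipy import stats

# Hypothetical sample and hypothesized mean, made up for illustration.
exampleDistribution = np.array([5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 4.7, 5.9])
expectedMean = 5.0

# Hand computation, following the formula above.
x = np.mean(exampleDistribution)
s = np.std(exampleDistribution, ddof=1)  # ddof=1 gives the sample standard deviation
n = len(exampleDistribution)
t = (x - expectedMean) / (s / np.sqrt(n))

# The same statistic as reported by SciPy.
result = stats.ttest_1samp(exampleDistribution, expectedMean)

print("manual t:", t)
print("scipy  t:", result.statistic)
```

The two printed values should agree up to floating-point rounding.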
The p-value is the probability that T takes a value equal to t or more extreme, computed under the assumption that the null hypothesis is true. (Note that it is NOT the “probability that the null hypothesis is true.”) For a two-sided test, the p-value is the total area under the t-distribution in both tails, beyond -|t| and +|t| (the colored parts in the accompanying image).
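The two-tail area can also be computed directly from the t-distribution and compared with the p-value that ttest_1samp returns. Again this is just a sketch on a hypothetical sample (the data and expectedMean are my own made-up values); it uses scipy.stats.t.sf, the survival function, to get the area of one tail and doubles it.

```python
import numpy as np
from scipy import stats

# Hypothetical sample and hypothesized mean, made up for illustration.
exampleDistribution = np.array([5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 4.7, 5.9])
expectedMean = 5.0

n = len(exampleDistribution)
x = np.mean(exampleDistribution)
s = np.std(exampleDistribution, ddof=1)
t = (x - expectedMean) / (s / np.sqrt(n))

# Area of one tail beyond |t|, doubled because the test is two-sided.
p_manual = 2 * stats.t.sf(abs(t), df=n - 1)

# p-value reported by SciPy (two-sided by default).
result = stats.ttest_1samp(exampleDistribution, expectedMean)

print("manual p-value:", p_manual)
print("scipy  p-value:", result.pvalue)
```

Both values should match, which confirms that the reported p-value is simply the tail area described above.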
If the p-value is lower than the significance level, which is 0.05 here, it means that “a result at least this extreme would be observed with a probability of less than 5% if the null hypothesis were true.” This is reason enough to doubt the null hypothesis, so we reject it. However, the p-value is not zero, so there is still a nonzero probability that the null hypothesis is true and such a rare observation simply occurred by chance this time. There is always a risk of a Type I error.
If the p-value is greater than the significance level, we cannot reject the null hypothesis. Note that this does not necessarily mean that the null hypothesis is true. It only means “no sufficient evidence has been found to reject the null hypothesis; it may be true or false, we don’t know.” There is always a risk of a Type II error.
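To see why “cannot reject” is not the same as “true”, here is a rough simulation sketch. All the numbers are assumptions of mine: the real population mean is 5.3, the tested expectedMean is 5.0 (so the null hypothesis is false), and each sample has only 10 observations. It counts how often the test still fails to reject at the 0.05 level.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable

# Assumed values, chosen only for illustration.
expectedMean = 5.0   # mean stated by the null hypothesis
trueMean = 5.3       # actual population mean, so the null hypothesis is false
alpha = 0.05         # significance level
n = 10               # size of each simulated sample
n_trials = 5_000     # number of simulated samples

# Each row is one sample from the real population.
samples = rng.normal(loc=trueMean, scale=1.0, size=(n_trials, n))

# One two-sided t-test per row.
pvalues = stats.ttest_1samp(samples, expectedMean, axis=1).pvalue

# Fraction of tests that fail to reject a false null hypothesis (Type II errors).
miss_rate = np.mean(pvalues >= alpha)
print(f"failed to reject a false null in {miss_rate:.1%} of trials")
```

With a small effect and a small sample like this, the test misses the difference most of the time, so a large p-value on its own says little about whether the null hypothesis is actually true.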