In this exercise, we expect the population mean to be 30 but the mean of our sample is 31. So wouldn’t our null hypothesis be: the sample represents a population of mean 31?
We then go on and use ttest_1samp(ages, 30) to find the p-value. I am unclear about these steps.
What are we checking for in this hypothesis test? Are we checking whether the sample represents a population with mean 30? If so, and the p-value is less than 0.05, does that mean the sample represents a population with mean 30?
First, let’s note that the null hypothesis is usually the status quo. If we expect the population mean to be 30, that is the status quo, which is why our null hypothesis is:
The set of samples belongs to a population with the target mean of 30.
When we run ttest_1samp(ages, 30), we are testing how plausible it is that the samples in ages were drawn from a distribution with mean 30. More precisely, the p-value is the probability of seeing a sample mean at least this far from 30 if the population mean really were 30. A sample mean of 31 does not by itself contradict that: we could simply have gotten somewhat unlucky with our sampling, especially since the number of samples in ages is small. If the resulting p-value is less than 0.05, we reject the null hypothesis, meaning that we’re saying it is unlikely that the sample was drawn from a distribution with mean 30. So a small p-value is evidence against a population mean of 30, not for it. A p-value greater than or equal to 0.05 means that we fail to reject the null hypothesis, meaning that we cannot be confident that the samples were not drawn from a distribution with mean 30.
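To make the decision rule concrete, here is a minimal sketch using SciPy’s scipy.stats.ttest_1samp, which is presumably the function the post refers to. The ages values below are invented for illustration (chosen so that their mean is 31) and are an assumption, not the exercise’s actual data; the names alpha and reject_null are also just illustrative.

```python
from statistics import mean

from scipy.stats import ttest_1samp

# Hypothetical ages, invented for illustration; their mean is 31.
ages = [32, 34, 29, 29, 22, 39, 38, 37, 38, 36, 30, 26, 22, 22]

sample_mean = mean(ages)

# Null hypothesis: the sample was drawn from a population with mean 30.
# ttest_1samp returns the t statistic and the two-sided p-value.
t_stat, p_value = ttest_1samp(ages, 30)

alpha = 0.05  # significance threshold used in the discussion above
reject_null = p_value < alpha

print(f"sample mean = {sample_mean}")
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
print("reject the null" if reject_null else "fail to reject the null")
```

With these made-up numbers the p-value comes out well above 0.05, so we fail to reject the null hypothesis: a sample mean of 31 from only 14 values is quite compatible with a population mean of 30.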