Doubt about the two-sided t-test in a project

https://www.codecademy.com/paths/data-science/tracks/scipy/modules/dspath-hypothesis-testing/projects/familiar

During this project, I found myself questioning one of the task statements.
First, we use the 1-Sample T-Test to compare ‘vein_pack_lifespans’ to the average life expectancy of 71.
Then we check the p-value of ‘vein_pack_test’. After that, it proceeds to the following task:

  • We want to present this information to the CEO, Vlad, of this incredible finding. Let’s print some information out! If the test’s p-value is less than 0.05, print “The Vein Pack Is Proven To Make You Live Longer!”. Otherwise print “The Vein Pack Is Probably Good For You Somehow!”

But the documentation of the scipy function ttest_1samp says: "This is a two-sided test for the null hypothesis that the expected value (mean) of a sample of independent observations a is equal to the given population mean, popmean."

From my understanding, if we do a two-sided test and reject the null hypothesis, we can’t be sure which direction the difference goes: is the average life expectancy greater than 71 or less than 71?


@kamel, welcome to the forums!

I took a look at the documentation and you are right that ttest_1samp returns a two-tailed p-value and as such, we can’t confidently say that the “vein pack makes you live longer” just because our p-value is less than 0.05. We can reject our null hypothesis, but we need to do a little math to see whether the life expectancy average is greater than or less than 71.

Luckily, it is fairly straightforward to compute the one-sided p-values from the info that ttest_1samp provides us.

If our alternative hypothesis is that the mean is greater than 71, then we reject the null hypothesis when p/2 < alpha and t > 0.

If our alternative hypothesis is that the mean is less than 71, then we reject the null hypothesis when p/2 < alpha and t < 0.

Let’s check this by running the following code:

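from scipy.stats import ttest_1samp

# vein_pack_lifespans is the sample loaded earlier in the project
vein_pack_test = ttest_1samp(vein_pack_lifespans, 71)
print(vein_pack_test)  # t statistic and two-sided p-value
print(vein_pack_test.pvalue / 2)  # halved p-value for the one-sided check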

Output:
[image: the printed test result (t statistic and two-sided p-value), followed by the halved p-value]

As we can see, our t-stat is greater than 0 and p/2 is less than our alpha (0.05).
So we can reject the null hypothesis in favor of the alternative that the average life expectancy with the vein pack is greater than 71.
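
The rule above (halve the two-sided p-value, then check the sign of t) can be sketched as a small helper. This is only an illustration: `one_sided_pvalue` is a hypothetical function, not part of scipy, and the numbers passed to it are made up rather than taken from the project. It assumes the test statistic is symmetric around zero, which holds for the t-test.

```python
def one_sided_pvalue(t_stat, two_sided_p, alternative="greater"):
    """Convert a two-sided t-test p-value into a one-sided one.

    Hypothetical helper for illustration; not a scipy function.
    """
    half = two_sided_p / 2
    if alternative == "greater":
        # Only a positive t supports "mean > popmean"; otherwise the
        # one-sided p-value is the complement of the halved value.
        return half if t_stat > 0 else 1 - half
    if alternative == "less":
        return half if t_stat < 0 else 1 - half
    raise ValueError("alternative must be 'greater' or 'less'")


# Made-up inputs, not the project's actual output.
p_greater = one_sided_pvalue(t_stat=2.5, two_sided_p=0.03, alternative="greater")
print(p_greater < 0.05)  # True: reject the null in favor of "mean > popmean"
```

With a real result you would pass `vein_pack_test.statistic` and `vein_pack_test.pvalue`. When t has the wrong sign, the helper returns a value of at least 0.5, so the test can never reject in that direction.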

For more info on these calculations, check out the following:
Article from UCLA
Stack Overflow answers here and here

Happy coding!


Thanks for the answer, it’s very clarifying! Can I ask another question based on this context?

I noticed that scipy.stats.binom_test is also two-sided, but it doesn’t return a statistic, only a p-value. What should we do in the same situation?

If you want to do a one-sided test with binom_test, it’s even easier. Just set the keyword argument alternative to 'greater' or 'less', depending on what your alternative hypothesis is. This is similar to how a one sample ttest is handled in R and I wish they had included this option in scipy.stats.ttest_1samp to make things easier.

If you look at the documentation, it gives a nice example of how you would use this.
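
For intuition about what a one-sided alternative computes in the binomial case, here is a minimal sketch using only the standard library. `binom_one_sided_pvalue` is a hypothetical helper, not a scipy function, and the counts are made up. It sums the exact binomial probabilities in the tail that the alternative hypothesis points to.

```python
from math import comb


def binom_one_sided_pvalue(successes, n, p=0.5, alternative="greater"):
    """Exact one-sided binomial p-value (hypothetical helper, not scipy)."""

    def pmf(k):
        # Probability of exactly k successes in n trials.
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    if alternative == "greater":
        tail = range(successes, n + 1)  # P(X >= successes)
    elif alternative == "less":
        tail = range(0, successes + 1)  # P(X <= successes)
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return sum(pmf(k) for k in tail)


# Made-up example: 8 successes in 10 trials against p = 0.5.
print(binom_one_sided_pvalue(8, 10, 0.5, "greater"))  # 0.0546875
```

The scipy call `binom_test(8, n=10, p=0.5, alternative='greater')` should report the same tail probability. As far as I know, newer SciPy releases (1.6 and later) also accept the `alternative` keyword in `ttest_1samp`, and `binom_test` has since been replaced by `binomtest`, so check the documentation for the version you have installed.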


Oh, my bad, I didn’t read the binom_test documentation carefully. Thank you very much!
And yeah, I also think scipy.stats.ttest_1samp should have the same option to make things easier.

