Data Science Path: Hypothesis Testing. Concerns over 'Familiar: A Study In Data Analysis'

So I am a little concerned with how Codecademy approached this specific problem, for a number of reasons:

  1. I believe the thought process behind parts 10 & 11 is incorrect. Don’t p-values measure whether the groups differ, without indicating any directionality? A value below 0.05 would say they are sufficiently different, but not in which way. If you actually look at the mean for the artery pack, you will see it is lower than that of the vein pack. Making a statement/decision about the efficacy of something based solely on the p-value feels reckless to me.

  2. Furthermore, I think saying that something isn’t significantly different based on a p-value that is very close to .05 is misleading. In fact, the two samples from parts 10 & 11 have a high chance of being different. If you print the p-value you will see it’s roughly .06, which indicates a 94% likelihood that this difference is not due to pure chance. Whether or not they are considered different by you and your employer should depend on how much risk you are willing to take. Is 94% confidence really that different from 95%? I believe this sends the wrong message: that the p-value is a magical statistic that important decisions can be made from, without even considering the circumstantial evidence.

  3. Lastly, I believe part 16 confuses correlation and causation. The statement ‘the artery package is proven to make you healthier!’ sounds like you are drawing a causal connection between the artery pack and healthier outcomes. In fact, all we’ve shown is that people who use the artery pack are associated with higher iron levels, not that the artery pack actually caused them. To determine that (and make the claim above) you’d have to run a randomized controlled trial.
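To make point 1 concrete, here is a minimal sketch of the directionality issue. The lifespans are simulated, since the `familiar` module isn’t available outside the exercise: a two-sample t-test’s p-value says only whether the groups differ, and you have to compare the means (or check the sign of the t-statistic) to learn which way.

```python
import numpy as np
from scipy.stats import ttest_ind

# Hypothetical lifespans; the real `familiar` data isn't available here.
rng = np.random.default_rng(0)
vein = rng.normal(73, 5, size=200)    # simulated vein-pack lifespans
artery = rng.normal(71, 5, size=200)  # simulated artery-pack lifespans

tstat, pval = ttest_ind(vein, artery)
print(f"p = {pval:.4f}")  # tells you the groups differ, not which is higher

# Directionality comes from the means (equivalently, the sign of tstat):
print(f"vein mean = {vein.mean():.1f}, artery mean = {artery.mean():.1f}")
print("vein higher" if vein.mean() > artery.mean() else "artery higher")
```
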
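And for point 3, a sketch of why the chi-square test only establishes association. The contingency counts below are made up for illustration, since the real `familiar.iron_counts_for_package()` output isn’t available here:

```python
from scipy.stats import chi2_contingency

# Made-up contingency table of iron levels by package:
#           low  normal  high
table = [[140,     29,    29],   # vein pack (hypothetical counts)
         [ 59,     35,   107]]   # artery pack (hypothetical counts)

chi2, p, dof, expected = chi2_contingency(table)
print(f"p = {p:.3g}")
# A small p says package choice and iron level are *associated*; it says
# nothing about whether the artery pack caused the higher iron counts.
```
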

I am not really sure if this is the place to voice my concerns over the problem, but I wanted to run it by people on here before I actually bring it to anyone’s attention at Codecademy. Am I right here, or am I crazy? I know it’s very intro-level stuff, but in the help video for the problem the instructor didn’t even mention all the assumptions we were making throughout.



It would be helpful to post the link to the problem you’re referencing. I’ll generalize:
p-values are expected to address a particular hypothesis (the null), and the person or team running the test sets the significance level – the risk acceptance you imply in paragraph 2. You have to draw the line somewhere, and if you research p-values and null hypotheses, you’ll find that .05 is a rule of thumb, not an authoritative cutoff.
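The “draw the line somewhere” point is easy to demonstrate: the same p-value (0.06 here, roughly what the poster reports for parts 10 & 11) flips between verdicts depending on the alpha chosen up front.

```python
# The same p-value leads to opposite verdicts under different alphas.
pval = 0.06  # roughly the value the poster reports for parts 10 & 11

for alpha in (0.05, 0.10):
    verdict = "reject the null" if pval < alpha else "fail to reject the null"
    print(f"alpha = {alpha}: {verdict}")
# prints: alpha = 0.05: fail to reject the null
#         alpha = 0.1: reject the null
```
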

Here is my code.

import familiar
from scipy.stats import ttest_1samp, ttest_ind, chi2_contingency

vein_pack_lifespans = familiar.lifespans(package='vein')

vein_pack_test, pval = ttest_1samp(vein_pack_lifespans, 71)
print(pval)

if pval < 0.05:
    print("The Vein Pack Is Proven To Make You Live Longer!")
else:
    print("The Vein Pack Is Probably Good For You Somehow!")

artery_pack_lifespans = familiar.lifespans(package='artery')

package_comparison_results = ttest_ind(vein_pack_lifespans, artery_pack_lifespans)

if package_comparison_results.pvalue < 0.05:
    print("the Artery Package guarantees even stronger results!")
else:
    print("the Artery Package is also a great product!")

iron_contingency_table = familiar.iron_counts_for_package()

chi2, iron_pvalue, dof, expected = chi2_contingency(iron_contingency_table)
print(chi2, iron_pvalue, dof, expected)

if iron_pvalue < 0.05:
    print("The Artery Package Is Proven To Make You Healthier!")
else:
    print("While We Can't Say The Artery Package Will Help You, I Bet It's Nice!")