FAQ: Hypothesis Testing - Tukey's Range Test

This community-built FAQ covers the “Tukey’s Range Test” exercise from the lesson “Hypothesis Testing”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Data Science

How could I use Tukey’s range test on higher-dimensional datasets (such as 4-dimensional ones)?

How should I read this output?

Multiple Comparison of Means - Tukey HSD,FWER=0.05
=============================================
group1 group2 meandiff  lower   upper  reject
---------------------------------------------
  a      b     7.2767   3.2266 11.3267  True 
  a      c     4.0115  -0.0385  8.0616 False 
  b      c    -3.2651  -7.3152  0.7849 False 
---------------------------------------------

Do we reject the null hypothesis for a? Or does it just mean the mean for a is very different from b and c?

“Tukey’s Range Test will return a table of information, telling you whether or not to reject the null hypothesis for each pair of datasets.” Do we reject the null hypothesis when the reject column is False or when it is True?

Does meandiff mean the difference between the two group means? Also, after printing out the Tukey HSD table, what does the long number (0.00015…) at the top of the table represent? Is it a probability?

It is the p-value (pval) returned by the ANOVA test.

I was unsure about this as well. After poking around on Google, I think it means there is a significant difference between a and b, but not between the other two pairs of datasets.

I believe the results of the Tukey test mean you’d reject the null hypothesis for the a,b pair but not for the others, because there isn’t a large enough difference between them.
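
As a rough sketch of that reading: the reject column is True exactly when the interval from lower to upper does not contain zero. The snippet below re-derives the column from the numbers in the table posted earlier in this thread. The helper name excludes_zero is made up for illustration and is not part of statsmodels.

```python
# Rows copied from the Tukey HSD table posted earlier in this thread:
# (group1, group2, meandiff, lower, upper)
rows = [
    ("a", "b", 7.2767, 3.2266, 11.3267),
    ("a", "c", 4.0115, -0.0385, 8.0616),
    ("b", "c", -3.2651, -7.3152, 0.7849),
]


def excludes_zero(lower, upper):
    """Hypothetical helper: True when the interval does not contain 0."""
    return lower > 0 or upper < 0


# Map each pair to the decision implied by its confidence interval.
decisions = {(g1, g2): excludes_zero(lo, hi) for g1, g2, _, lo, hi in rows}

for (g1, g2), reject in decisions.items():
    verdict = "reject the null hypothesis" if reject else "fail to reject"
    print(f"{g1} vs {g2}: {verdict}")
```

Only the a,b interval sits entirely on one side of zero, which matches the single True in the table. The a,c interval only just includes zero (its lower bound is -0.0385), so that pair is a near miss rather than evidence that the means are equal.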

My problem with understanding here is that the null hypothesis we’re testing is “these data sets have the same mean”, however there is still a difference in means between A and C so that null hypothesis should be rejected as well.

The means of A and C are 58 and 62, respectively. When you run a two-sample t-test between A and C you get a p-value of 0.021, which means the null hypothesis should be rejected. Although Tukey’s Range Test does accurately point out that B is the most significantly different (mean of 65) compared to A, I don’t understand why the null hypothesis isn’t rejected in all cases.

Why is it necessary to use the len() function multiplied by the label string? (labels = ['a'] * len(a) + ['b'] * len(b) + ['c'] * len(c)) This is not addressed in the lesson, as is usual for Codecademy.

Also, are other post-hoc tests available? Like a Bonferroni?
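
On the Bonferroni question, a minimal sketch of the idea, not a full post-hoc procedure: Bonferroni divides the significance level by the number of comparisons and then compares each pairwise p-value against that stricter threshold. The p-value of 0.021 is the one reported earlier in this thread for the two-sample t-test between a and c. The variable names are only illustrative.

```python
alpha = 0.05
num_comparisons = 3  # the three pairs: a-b, a-c, b-c

# Bonferroni: each individual test must clear alpha / (number of tests).
bonferroni_alpha = alpha / num_comparisons

# p-value reported earlier in this thread for the t-test of a vs c
p_a_vs_c = 0.021

reject_uncorrected = p_a_vs_c < alpha
reject_bonferroni = p_a_vs_c < bonferroni_alpha

print(f"Bonferroni threshold: {bonferroni_alpha:.4f}")
print(f"Reject a vs c without correction: {reject_uncorrected}")
print(f"Reject a vs c with Bonferroni:    {reject_bonferroni}")
```

So with this p-value, Bonferroni would agree with Tukey and not reject for a vs c. If you want a library version, statsmodels has multipletests in statsmodels.stats.multitest, which accepts method='bonferroni'.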

Question: Why does changing b to b_new change the evaluation of the null hypothesis for a vs c?

Background:
I ran Tukey’s range test for (a, b, c) AND for the previously mentioned (a, b_new, c).
The mean values for each of the data sets are as follows.
a = 58.3
b = 65.6
b_new = 148.4
c = 62.4

I get the following results for each test.

(a, b, c)

(a, b_new, c)

Question, repeated: Why does changing b to b_new change the evaluation of the null hypothesis for a vs c?
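
One possible explanation, offered as a sketch: Tukey’s test does not judge a vs c using only a and c. It uses a pooled within-group variance computed from all the groups, so swapping b for b_new changes the yardstick used for every pair. The data below is made up (the real b_new values are not shown in this thread), and pooled_variance is a hypothetical helper written for this illustration.

```python
from statistics import variance


def pooled_variance(*groups):
    """Within-group mean square: summed squared deviations over (N - k)."""
    total_n = sum(len(g) for g in groups)
    ss_within = sum(variance(g) * (len(g) - 1) for g in groups)
    return ss_within / (total_n - len(groups))


# Made-up samples, NOT the exercise data.
a = [56, 58, 59, 60]
b = [64, 65, 66, 67]
b_new = [100, 140, 160, 190]  # assumed here to be much more spread out
c = [61, 62, 63, 64]

mse_original = pooled_variance(a, b, c)
mse_new = pooled_variance(a, b_new, c)

print(f"pooled variance with b:     {mse_original:.2f}")
print(f"pooled variance with b_new: {mse_new:.2f}")
```

Because the a vs c interval is built from this pooled value, it can widen or narrow when b changes even though a and c are untouched. Which direction it moves in the exercise depends on the actual b_new data.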

Hi,
I have the same problem understanding why Tukey’s range test does not reject the null hypothesis for a and c, while a two-sample t-test gives a p-value of 0.0210, which is low enough to reject the null hypothesis for A and C.

Wow, this is more food for thought. It is becoming more confusing.

The video for the Fetchmaker project in Hypothesis Testing seems to be wrong or confusing at this moment -> https://youtu.be/vHjJGTSVIFQ?t=558
Matt seems to mistake the combination that cannot reject the null hypothesis (he says pitbull and terriers have similar weights, whereas it seems to me that it should be pitbull and whippets).

Can somebody clarify how to read a Tukey HSD result? The cheatsheet and lesson are not very clear about this, and neither is the statsmodels documentation, it seems.

It’s addressed later in the Fetchmaker project, or maybe in the video. The labels are there to tag the concatenated values so that values from one group are not mistaken for another group. However, I believe it would be much safer to be able to pass zipped data or dictionaries.
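
To make that concrete, here is a small sketch with made-up numbers (not the exercise data). Multiplying a one-element list by len(a) repeats the label once per value, so the labels list lines up position by position with the concatenated values. The second half shows one way to start from a dictionary instead, which is only an illustration and not something the lesson does.

```python
# Made-up samples of different lengths, NOT the exercise data.
a = [58, 60, 57]
b = [65, 66]
c = [62, 63, 61, 64]

# All values in one flat list, group after group.
values = a + b + c

# ['a'] * 3 gives ['a', 'a', 'a']: one label per value in that group.
labels = ["a"] * len(a) + ["b"] * len(b) + ["c"] * len(c)

for value, label in zip(values, labels):
    print(label, value)

# Alternative: keep the groups in a dict and derive both lists from it.
groups = {"a": a, "b": b, "c": c}
values_from_dict = [x for samples in groups.values() for x in samples]
labels_from_dict = [name for name, samples in groups.items() for _ in samples]
```

Both approaches give the same two parallel lists, which is the shape pairwise_tukeyhsd expects in the exercise: one sequence of values and one sequence of group labels of the same length.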

Did you ever get an explanation for your question?

Sorry, but I didn’t get any explanation

I had the same question. I believe that Matt has made a mistake. Pitbulls and Whippets are similar in weight, i.e. the last column is set to False (p-value greater than 0.05), whereas the other two combinations are True. So pitbulls and terriers are significantly different in weight and therefore are not similar. HOWEVER, Matt incorrectly suggests in the video that they are close in weight (slow down Matt, you’re too quick!!).

I have the exact same issue. Based on the two-sample t-test result between A and C, the null hypothesis for these two should ideally be rejected. Maybe there are other measures taken into account? Has anyone found any answers?
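
One likely piece of the answer, sketched with simple arithmetic: a single t-test at 0.05 only controls the error rate for that one comparison. With three pairwise comparisons, the chance of at least one false positive across the whole family is higher. Tukey’s test is designed to keep that family-wise error rate at the stated level (the FWER=0.05 in the header of the output), so it is stricter for each individual pair. The calculation below assumes the three tests are independent, which is only an approximation.

```python
alpha = 0.05
num_comparisons = 3  # a-b, a-c, b-c

# Probability of at least one false positive across all comparisons,
# assuming (as a simplification) that the tests are independent.
fwer_uncorrected = 1 - (1 - alpha) ** num_comparisons

print(f"Family-wise error rate for three uncorrected tests: {fwer_uncorrected:.4f}")
```

This prints roughly 0.1426, so running three separate t-tests at 0.05 is closer to testing at 14% overall. A p-value of 0.021 can pass a lone t-test and still fail once the comparison is treated as one of three.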

Could somebody please explain the output of Tukey’s Range Test? On what basis does it reject or fail to reject? @mtf @thepitycoder Please enlighten us.

Perhaps at a significance level of 0.05, the only conclusion we can draw is “there is a significant difference between a and b”. Running an additional two-sample t-test on top of Tukey’s Range Test may be equivalent to raising the overall significance level (increasing the risk of a Type I error and reducing the risk of a Type II error), I think. So it may be equivalent to running Tukey’s Range Test with a higher significance level.

For example, if we raise the significance level from 0.05 to 0.06:

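from statsmodels.stats.multicomp import pairwise_tukeyhsd
# v and labels are the values and group labels built in the exercise
tukey_results = pairwise_tukeyhsd(v, labels, alpha=0.06)
print(tukey_results)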

we have the following result:

Multiple Comparison of Means - Tukey HSD,FWER=0.06
=============================================
group1 group2 meandiff  lower   upper  reject
---------------------------------------------
  a      b     7.2767   3.3534 11.1999  True 
  a      c     4.0115   0.0883  7.9347  True 
  b      c    -3.2651  -7.1883  0.6581 False 
---------------------------------------------

In this case, we can conclude that a is significantly different from b and c.

We can choose whether we dislike the risk of Type I errors and accept a weak conclusion (there is a risk of Type II errors), or we want a stronger conclusion even if the risk of Type I errors is increased. I think it is up to the person doing the test. There is a trade-off between the risk of Type I error and the risk of Type II error.

However, I’m not an expert, so I don’t know whether raising the significance level like this is an accepted practice in science. Perhaps there is a better way. Comments from everyone are welcome.