FAQ: Hypothesis Testing with R - Type I and Type II Errors

This community-built FAQ covers the “Type I and Type II Errors” exercise from the lesson “Hypothesis Testing with R”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Learn R

FAQs on the exercise Type I and Type II Errors

There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.

Join the Discussion. Help a fellow learner on their journey.

Ask or answer a question about this exercise by clicking reply below!
You can also find further discussion and get answers to your questions over in Language Help.

Agree with a comment or answer? Click like to up-vote the contribution!

Need broader help or resources? Head to Language Help and Tips and Resources. If you want feedback or inspiration for a project, check out Projects.

Looking for motivation to keep learning? Join our wider discussions in Community.

Learn more about how to use this guide.

Found a bug? Report it online, or post in Bug Reporting.

Have a question about your account or billing? Reach out to our customer support team!

None of the above? Find out where to ask other questions here!

I don’t understand how this exercise demonstrates Type I and Type II errors.

Which set of data is the hypothesis and which is the actual data? To me, the actual data would be the data you collect in the experiment. Which of these is the “null hypothesis”?

Can someone clear this up for me?

Thank you!

2 Likes

I would also like some clarity. Thanks for asking this question.

In hypothesis testing, the goal is to either reject or fail to reject the null hypothesis. Sometimes data analysts or data scientists draw the wrong conclusion. A Type I error (false positive) is rejecting a null hypothesis that is actually true; a Type II error (false negative) is failing to reject a null hypothesis that is actually false.
So, if your significance level is 0.05 (a 95% confidence level), you accept a 5% probability of a false positive result when the null hypothesis is true.
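A minimal sketch of this idea in R (my own illustration, not from the lesson): if we run many t-tests on samples where the null hypothesis really is true, roughly 5% of them will reject the null at the 0.05 significance level purely by chance.

```r
# Simulate the Type I error rate: the null is true (true mean is 0),
# so any rejection at alpha = 0.05 is a false positive.
set.seed(42)
p_values <- replicate(10000, {
  sample_data <- rnorm(30, mean = 0)   # data generated under a true null
  t.test(sample_data, mu = 0)$p.value
})

# Proportion of tests that (wrongly) reject the null; close to 0.05
mean(p_values < 0.05)
```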

This might help:
https://medium.com/analytics-vidhya/hypothesis-testing-type-1-and-type-2-errors-bf42b91f2972

Or, this:
https://math.stackexchange.com/questions/3691376/minimizing-type-i-and-type-ii-errors-simultaneously

And, my favorite explanations of hypothesis testing and stats from Mr. Nystrom:
Mr. Nystrom, stats teacher

This exercise is pretty poorly made, in my opinion, but hopefully I can help clear it up a bit.

In the hypothetical experiment the exercise has you work with (NOT the volleyball experiment) there are 49 individuals that may be positive or negative for some trait. Each individual is assigned a number.
Thus, the vector actual_positives
c(2, 5, 6, 7, 8, 10, 18, 21, 24, 25, 29, 30, 32, 33, 38, 39, 42, 44, 45, 47) contains all the individuals that are positive for the trait, and the vector actual_negatives
c(1, 3, 4, 9, 11, 12, 13, 14, 15, 16, 17, 19, 20, 22, 23, 26, 27, 28, 31, 34, 35, 36, 37, 40, 41, 43, 46, 48, 49) contains all individuals that are negative for the trait. As an example, individual #27 is not positive for the trait and is therefore listed in the second vector but not the first.

The hypothetical experiment tests all 49 individuals and gives us the experimental_positives/experimental_negatives vectors as a result. The exercise asks us to find errors using the intersect() function. Recall that a Type I error is finding a difference when in actuality there is none, so our “Type I errors” here are instances where the experiment finds a positive when there should be a negative. We use intersect(actual_negatives, experimental_positives) to identify these false positives, and vice versa for the false negatives, or Type II errors.
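To make that concrete, here is a small R sketch using the exercise’s actual_positives and actual_negatives vectors. The experimental vectors below are invented for illustration (they are not the lesson’s data): I deliberately flag two true negatives as positive and miss two true positives.

```r
# The exercise's vectors: individuals 1-49, split by whether they
# actually have the trait.
actual_positives <- c(2, 5, 6, 7, 8, 10, 18, 21, 24, 25, 29, 30, 32, 33,
                      38, 39, 42, 44, 45, 47)
actual_negatives <- c(1, 3, 4, 9, 11, 12, 13, 14, 15, 16, 17, 19, 20, 22,
                      23, 26, 27, 28, 31, 34, 35, 36, 37, 40, 41, 43, 46,
                      48, 49)

# Hypothetical experimental results (made up for this example): the test
# wrongly flags 3 and 9 as positive, and misses 44 and 47.
experimental_positives <- c(2, 5, 6, 7, 8, 10, 18, 21, 24, 25, 29, 30, 32,
                            33, 38, 39, 42, 45, 3, 9)
experimental_negatives <- setdiff(1:49, experimental_positives)

# Type I errors (false positives): actually negative, flagged positive
type_i_errors <- intersect(actual_negatives, experimental_positives)

# Type II errors (false negatives): actually positive, flagged negative
type_ii_errors <- intersect(actual_positives, experimental_negatives)
```

With this made-up data, type_i_errors contains individuals 3 and 9, and type_ii_errors contains individuals 44 and 47.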

Now, you may be wondering what this has to do with null hypotheses, and the answer is: nothing! This exercise has nothing to do with the comparisons of means or other statistical aggregates that we use when testing hypotheses. Furthermore, while it may be technically true (and I’m not even sure about this) that misidentifying a single negative in your sample as a positive counts as a Type I error, that kind of language is usually reserved for talking about hypotheses and data analysis.

I really hope the creators of this lesson tweak this exercise, as I think it does more to confuse than educate users on the material.