I’m going to borrow a bit from a medical example and hopefully it will help.

Let’s assume two people come in for cancer screening. Ted does not have cancer and Bill does.

A Type I error would be telling Ted that he DOES have cancer when he does NOT. This is also called a false positive: you told him he has a disease that he doesn’t have. A Type II error would be missing Bill’s cancer. This is called a false negative, because we tell him he does NOT have cancer when he DOES.
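If it helps to see it as code, here is a minimal sketch of the same idea. The function name `classify` and its two parameters are made up for illustration; they are not from the lesson.

```python
def classify(test_says_cancer, has_cancer):
    """Label one screening result by comparing the test with reality."""
    if test_says_cancer and not has_cancer:
        return "false positive (Type I error)"
    if not test_says_cancer and has_cancer:
        return "false negative (Type II error)"
    return "correct result"

# Ted is healthy, but the screening says he has cancer.
print("Ted:", classify(test_says_cancer=True, has_cancer=False))

# Bill has cancer, but the screening misses it.
print("Bill:", classify(test_says_cancer=False, has_cancer=True))
```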

The null hypothesis is more difficult. Generally, we assume that there are a certain number of Bills and Teds in the world and that some percentage of each group will develop cancer. We don’t have time to sample all of them, so let’s say we take a thousand Bills and a thousand Teds to represent both groups. The null hypothesis would (as is the usual convention) be that Bills and Teds have the exact same risk of developing cancer, i.e. that there is no difference between a Bill and a Ted. The tests we are learning about help us decide whether the data give us enough evidence to reject that assumption (“rejecting” vs. “failing to reject” the null; strictly speaking, a test never proves that the null is true).
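To make that concrete, here is a rough sketch of one common way to test this kind of null hypothesis, a two-proportion z-test. This is my own choice for illustration and not necessarily the test the lesson uses. The function name is hypothetical and the case counts (50 and 30) are invented numbers, not real data.

```python
import math

def two_proportion_p_value(cases_a, n_a, cases_b, n_b):
    """Two-sided p-value for the null 'both groups have the same risk'."""
    rate_a = cases_a / n_a
    rate_b = cases_b / n_b
    # Under the null, both groups share one risk, so pool the data.
    pooled = (cases_a + cases_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_error
    # Two-sided tail probability of a standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Invented counts: 50 of 1000 Bills and 30 of 1000 Teds develop cancer.
p_value = two_proportion_p_value(50, 1000, 30, 1000)
print("p-value:", round(p_value, 4))

# A common (but arbitrary) cutoff is 0.05.
if p_value < 0.05:
    print("Reject the null: the risks look different.")
else:
    print("Fail to reject the null.")
```

A small p-value means that a difference this large would be unlikely if Bills and Teds really had the same risk.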

The intersect function compares two sets of data and returns the values that appear in both (if A contains 1, 2 and B contains 2, 3, intersect finds that 2 is in both sets). It is a function that the people at Codecademy created for you, and you can generate the output via (SPOILER):

type_i_error = intersect([measured value], [actual value])

will allow you to see which of the individuals who were positive in the experiment (we told them they have cancer) were actually negative (they do not have cancer) in “real life”. Those individuals are the false positives (Type I errors). The reverse is true for false negatives: intersect the people who tested negative with the people who actually do have cancer, and you get the Type II errors.
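Here is a small sketch of how that could look. I don’t know how Codecademy actually implemented intersect, so the version below is my own stand-in, and the patient IDs and list names are invented for the example.

```python
def intersect(list_a, list_b):
    """Stand-in for the lesson's helper: values found in both lists."""
    return [value for value in list_a if value in list_b]

# Invented patient IDs, grouped by what the test said...
experimental_positives = [1, 3, 5]  # we told them they have cancer
experimental_negatives = [2, 4, 6]  # we told them they do not

# ...and by what is true in "real life".
actual_positives = [3, 4, 5]
actual_negatives = [1, 2, 6]

# Tested positive but actually negative: false positives.
type_i_errors = intersect(experimental_positives, actual_negatives)

# Tested negative but actually positive: false negatives.
type_ii_errors = intersect(experimental_negatives, actual_positives)

print("Type I errors:", type_i_errors)
print("Type II errors:", type_ii_errors)
```

With these made-up IDs, patient 1 is the only false positive and patient 4 is the only false negative.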

Hope this helps.