Hey all,
I’m doing a Data Science project (in Python 3), and in it I am using the Binomial Test. I just want to confirm that it’s the appropriate test, and that I’m using it correctly.
The project looks at death rates in the US population over the last twenty years. I’m treating each year’s population as a sample, each death as a ‘success’ (grim, I know, but I’m using the terminology! Heh), and the average death rate as the expected probability of ‘success’ for the population. I calculate that average by dividing each year’s number of deaths by that year’s population size and then averaging those yearly rates. Since each ‘trial’ in the sample (a person in the population) either dies or doesn’t, I’ve treated ‘death’ as a binary categorical variable, and so I’ve chosen the Binomial Test to see whether the number of deaths in another year differs significantly from what that average rate would predict.
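To make the averaging step concrete, here is a minimal sketch of how I compute the rate. The figures are made up for illustration (they are not real US data), and the variable names are just placeholders. It also computes a pooled rate (total deaths divided by total population), which is a different estimate from the mean of yearly rates that I described, shown only for comparison.

```python
# Hypothetical yearly figures, invented for illustration (not real US data).
deaths = [2_400_000, 2_450_000, 2_500_000]
populations = [280_000_000, 285_000_000, 290_000_000]

# Death rate for each year: deaths divided by that year's population.
yearly_rates = [d / n for d, n in zip(deaths, populations)]

# What I described above: the unweighted mean of the yearly rates.
mean_of_rates = sum(yearly_rates) / len(yearly_rates)

# Alternative for comparison: pooled rate, which weights each year by its population.
pooled_rate = sum(deaths) / sum(populations)

print(f"mean of yearly rates: {mean_of_rates:.6f}")
print(f"pooled rate:          {pooled_rate:.6f}")
```

With populations of similar size the two numbers are nearly identical, but they can drift apart if the population changes a lot over the period.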
- Null Hypothesis: The number of deaths in the new year is a sample from a population with death rate M.
- Alternative Hypothesis: The number of deaths in the new year is not a sample from a population with death rate M,
where M is the average death rate for the past twenty years.
So, using scipy.stats.binom_test, this is (basically) how I’ve filled out the function:
from scipy.stats import binom_test
p_value = binom_test(deaths_in_the_new_year, n=population_in_the_new_year, p=mean_death_rate_of_previous_years, alternative='greater')
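For reference, this is my understanding of what the call computes with alternative='greater': the one-sided probability of seeing at least the observed number of deaths if the true rate were p. Below is a small pure-Python sketch of that tail probability on toy numbers (n, p and k are invented, and binom_upper_tail is my own helper, not a SciPy function). It is only practical for small n. For real population sizes I would rely on SciPy, where newer releases provide scipy.stats.binomtest in place of binom_test.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), i.e. the one-sided 'greater' p-value."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Toy numbers, invented for illustration only.
n = 1000   # 'population' size in the new year
p = 0.008  # assumed average death rate from previous years
k = 15     # observed deaths in the new year (expected count would be n * p = 8)

p_value = binom_upper_tail(k, n, p)
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value here would only say that the new year has more deaths than the average rate predicts. It says nothing about a year with fewer deaths, which a two-sided alternative would also cover.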
Is this right? Have I chosen this test and its parameters appropriately? Am I making any mistakes in my method?