FAQ: Statistical Distributions with NumPy - Binomial Distributions, Part II

This community-built FAQ covers the "Binomial Distributions, Part II" exercise from the lesson "Statistical Distributions with NumPy".

Paths and Courses
This exercise can be found in the following Codecademy content:

Introduction to Statistics with NumPy

FAQs on the exercise Binomial Distributions, Part II

I was confused at this point of the path, when plotting the histogram of the binomial distribution. In the histogram, the y-axis shows the probability of each number of successes on the x-axis, right?
But here, as shown in the screenshot, the y-axis runs from 0 to 4000, while I was expecting values from 0 to 1 (probabilities). What could have gone wrong, or am I missing something?
(FYI: the exercise is about an experiment of sending 500 advertising emails, each with a 5% probability that the recipient opens it and responds.)

I am not a subscriber, so I can’t view the exercise. But, basically

np.random.binomial(500, 0.05, size=10000)

says that you are going to be conducting 10000 experiments.

In a single experiment, you send 500 emails, and each individual email has a 5% chance (probability 0.05) of getting a response.

You will end up with an array of 10000 results, something like [18, 39, 25, ...]. For example, in the first experiment 500 emails were sent and 18 got a response; in the second, 500 emails were sent and 39 got a response; and so on.
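To make the description above concrete, here is a minimal sketch of the simulation. The seed and the variable name emails are assumptions for illustration (the name matches the plt.hist call later in this thread); your exact numbers will differ with another seed.

```python
import numpy as np

# Fix the seed only so the run is repeatable (an assumption, not part of the exercise).
np.random.seed(0)

# 10000 experiments; each one sends 500 emails with a 0.05 chance of a response per email.
emails = np.random.binomial(500, 0.05, size=10000)

print(emails.shape)   # one result per experiment
print(emails[:5])     # number of responses in the first five experiments
print(emails.mean())  # should be close to 500 * 0.05 = 25
```

Each element is a count of responses between 0 and 500, not a probability.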

In the histogram, the number of successes is plotted on the x-axis and the frequency (the number of experiments) on the y-axis. So, for example, there were more than 3000 experiments in which the number of successes was in the 24-27 range, and about 1000 experiments in which it was in the 16-20 range.
If you sum the frequencies on the y-axis (i.e. add up the heights of the bars), the total will be 10000.
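The point about the bar heights can be checked without plotting. This is a rough sketch using np.histogram, which computes the same kind of counts a histogram plot displays; the seed and the choice of 10 bins are assumptions for illustration.

```python
import numpy as np

np.random.seed(0)  # assumed seed, only for repeatability
emails = np.random.binomial(500, 0.05, size=10000)

# Without normalization, each bar height is a count of experiments.
counts, edges = np.histogram(emails, bins=10)

total = counts.sum()
print(counts)  # frequencies per bin, in the hundreds or thousands
print(total)   # every experiment lands in exactly one bin
```

Because the default range spans the smallest to the largest value, no experiment is left out, so the counts add up to the number of experiments.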


You’re also missing a bit of code in the plt.hist() call: the range, the number of bins, and normed=True to normalize it.

plt.hist(emails, range=(0, 100), bins=100, normed=True)

Thank you! That makes it clearer indeed!
I am trying to figure out why it is different in the example given in the course. For example, it has an experiment in which a basketball player takes 10 shots and has a 30% probability of scoring on each. Then, given np.random.binomial(10, 0.3, 10000), the y-axis of the histogram shows probabilities (values 0-1) and not the frequencies of the different outcomes of the experiments, as in the example above. I hope I am explaining it clearly :sweat_smile:
@lisalisaj maybe setting the normed parameter to True has to do with that?

normed=True will normalize it. It does appear, though, that normed is deprecated and the density parameter is suggested instead. I am not familiar with the versions; @lisalisaj is more familiar with numpy and matplotlib.
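As a hedged sketch of what the density parameter does, the snippet below uses np.histogram with the same range and bins as the plt.hist call above; plt.hist accepts the same range, bins and density arguments. The seed is an assumption for repeatability.

```python
import numpy as np

np.random.seed(0)  # assumed seed, only for repeatability
emails = np.random.binomial(500, 0.05, size=10000)

# density=True rescales the bars so that their total area is 1.
density, edges = np.histogram(emails, range=(0, 100), bins=100, density=True)

widths = np.diff(edges)            # every bin is 1 wide here
area = (density * widths).sum()

print(area)           # 1.0 up to floating-point error
print(density.max())  # tallest bar, now well below 1
```

Since each bin here is exactly 1 wide, a bar's height equals the fraction of experiments with that number of responses, which is why the y-axis reads like a probability. With wider or narrower bins, only the total area is guaranteed to be 1, not the individual heights.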

You’re right; it’s been deprecated. :woman_facepalming: (My old code for that lesson was still in place. Whoops!)
