FAQ: Statistical Distributions with NumPy - Binomial Distributions and Probability

This community-built FAQ covers the “Binomial Distributions and Probability” exercise from the lesson “Statistical Distributions with NumPy”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Data Science

Introduction to Statistics with NumPy

FAQs on the exercise Binomial Distributions and Probability

In this exercise, I’m getting 0.0 for both no_emails and b_test_emails, yet the answers are showing as correct. Is this a bug, or is this in fact the correct answer? Code as follows:

import numpy as np

emails = np.random.binomial(500, 0.05, size=10000)
no_emails = np.mean(emails == 0)

b_test_emails = np.mean(emails == 0.08)

print(no_emails)
print(b_test_emails)

Seems like a bug to me. There are quite a few in this series.

Regarding no_emails, 0.0 is the expected result, because the probability that emails contains even one 0 is extremely small. If the probability that a recipient opens an email is 0.05, then the probability that none of 500 emails is opened is (1 - 0.05) ** 500. Therefore, the probability that 0 appears at least once in 10000 trials is 1 - (1 - (1 - 0.05) ** 500) ** 10000, which is about 0.000000073.

zero_out_of_500 = (1 - 0.05) ** 500
p = 1 - (1 - zero_out_of_500) ** 10000
print(p)  # 7.2745140578e-08

That is, only about 7.3 out of every 100,000,000 clicks on ‘Run’ would produce even a single 0.
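As a rough cross-check of that figure, the same probability can be computed in two other ways. This is only a sketch: the names n_sent, p_open, n_trials, p_direct, p_stable and p_bound are made up for illustration and are not part of the exercise.

```python
import math

n_sent, p_open, n_trials = 500, 0.05, 10000  # illustrative names, values from the exercise

# Probability that a single trial of 500 emails yields zero opens.
p_zero = (1 - p_open) ** n_sent

# Direct formula from the post: at least one zero somewhere in n_trials trials.
p_direct = 1 - (1 - p_zero) ** n_trials

# Same quantity via log1p/expm1, which avoids subtracting nearly equal floats.
p_stable = -math.expm1(n_trials * math.log1p(-p_zero))

# Because p_zero is tiny, n_trials * p_zero is a close upper bound (union bound).
p_bound = n_trials * p_zero

print(p_direct, p_stable, p_bound)
```

All three values should agree to several significant digits, which suggests the tiny result is not a floating-point artifact.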

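Regarding b_test_emails, emails == 0.08 is always False because emails holds integer counts of opened emails, never a rate. The 8% rate has to be converted to a count (500 * 0.08 = 40) and compared with >=, so I think the correct code is: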

b_test_emails = np.mean(emails >= 500 * 0.08)

Yeah, that is correct.
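For anyone who wants to compare the two versions side by side, here is a minimal sketch. It assumes Python 3.8+ (for math.comb) and uses a seeded np.random.RandomState only so the run is repeatable. The names rng, wrong, simulated, exact and k_min are illustrative and not part of the exercise.

```python
import math
import numpy as np

n_sent, p_open, n_trials = 500, 0.05, 10000  # values from the exercise

rng = np.random.RandomState(0)  # seed is an assumption, only for repeatability
emails = rng.binomial(n_sent, p_open, size=n_trials)

# Each entry of emails is an integer count of opens, so it never equals 0.08.
wrong = np.mean(emails == 0.08)

# Convert the 8% rate to a count (500 * 0.08 = 40 opens) and compare with >=.
simulated = np.mean(emails >= n_sent * 0.08)

# Exact binomial tail P(X >= 40), to compare against the simulated estimate.
k_min = int(round(n_sent * 0.08))
exact = sum(
    math.comb(n_sent, k) * p_open ** k * (1 - p_open) ** (n_sent - k)
    for k in range(k_min, n_sent + 1)
)

print(wrong, simulated, exact)
```

The first value is 0.0 for every run, while the simulated estimate should land close to the exact tail probability, though it will vary a little with the seed.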