Help With Election Result Project in Data Science Path


#1

This is a question on Election Results, which is part of the “Statistics in Numpy” course.

The problem I’m having is with Step 6.

As we saw, 47% of people we surveyed said they would vote for Ceballos, but 54% of people voted for Ceballos in the actual election.

Calculate the percentage of surveys that could have an outcome of Ceballos receiving less than 50% of the vote and save it to the variable ceballos_loss_surveys.

Print the variable to the terminal.

The hint for this step is:

np.mean(array < 0.5)

Which I thought would mean this step was fairly simple. So I used this:

ceballos_loss_surveys = np.mean(possible_surveys < .5)

At first I was getting 0.0 as a result, so I checked the code in the tutorial video, and this is what the instructor provided:

possible_survey_length = float(len(possible_surveys))

incorrect_predictions = len(possible_surveys[possible_surveys < .5])

ceballos_loss_surveys = incorrect_predictions / possible_survey_length

Now when I run the two side by side, I get the same results (I can only assume I did something wrong and fixed it later but idk tbh).

Why would the code provided by the instructor be so much more complicated than the extremely simple line of code I used? If I get the same results, what’s the main difference?

Here is the entire code prior to this step:

total_ceballos = sum([1 for n in survey_responses if n == 'Ceballos'])

print(total_ceballos)

# 33

survey_length = float(len(survey_responses))

percentage_ceballos = 100 * total_ceballos/survey_length

print(percentage_ceballos)

#47.1428571429

possible_surveys = np.random.binomial(survey_length, .54, size=10000) / survey_length

plt.hist(possible_surveys, range=(0, 1), bins=20)
plt.show()

#2

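The difference here is that your instructor did not use np.mean() and computed the fraction by hand instead. The math for calculating the mean of an array of numbers is as follows: mean = sum of all numbers / count of all numbers

NumPy has a function called mean() which takes an array and returns the mean. The comparison possible_surveys < .5 produces an array of True/False values, and NumPy counts True as 1 and False as 0, so the mean of that array is the number of surveys below 50% divided by the total number of surveys. You might be using np.mean() instead of the code your instructor provided, but behind the scenes the same thing is happening.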


#3

Could you please explain what the code would look like when using np.mean()?
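
For reference, here is a minimal sketch of the two approaches side by side. It is not the course's official solution. It assumes a survey of 70 respondents (inferred from 33 Ceballos votes coming out to about 47.14%), and the random seed is added only so the run is repeatable.

```python
import numpy as np

# Assumption: 70 respondents, inferred from 33 votes being about 47.14%.
survey_length = 70

# The seed is only here to make this sketch reproducible.
np.random.seed(0)

# Simulate 10,000 surveys in which each respondent picks Ceballos
# with probability 0.54, then convert the counts to fractions.
possible_surveys = np.random.binomial(survey_length, 0.54, size=10000) / float(survey_length)

# One-line version: the comparison gives a boolean array, and np.mean
# treats True as 1 and False as 0, so the result is the fraction of True values.
ceballos_loss_surveys = np.mean(possible_surveys < 0.5)

# Step-by-step version: count the surveys below 50%, then divide by the total.
possible_survey_length = float(len(possible_surveys))
incorrect_predictions = len(possible_surveys[possible_surveys < 0.5])
manual_loss_surveys = incorrect_predictions / possible_survey_length

print(ceballos_loss_surveys)
print(manual_loss_surveys)
```

Both print statements should show the same number, since the step-by-step version is just the mean written out as count of matches divided by total count.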