FAQ: Introduction to Statistics with NumPy - NumPy and Standard Deviation, Part II

This community-built FAQ covers the “NumPy and Standard Deviation, Part II” exercise from the lesson “Introduction to Statistics with NumPy”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Data Science

Introduction to Statistics with NumPy

FAQs on the exercise NumPy and Standard Deviation, Part II

Hi,

With regards to NumPy and Standard Deviation, Part II…

Can someone explain why the dataset with a greater standard deviation will be preferred over a smaller standard deviation?

Q3 - Determine the squash dataset that has the greater standard deviation and save it to the variable winner.

Thank you.

That variable name makes no sense, in my opinion.

I tried to do the final part of this to determine the winner with a lambda, but what I got back was not useful (and also not a normal error?). The terminal would print "<function <lambda> at 0x7f2608888650>".
My thinking was if the difference between the pumpkin and acorn squash std was greater than 1, then it would print pumpkin, else squash.
If anyone has any insight as to why this didn’t work, please let me know.

import numpy as np
pumpkin = np.array([68, 1820, 1420, 2062, 704, 1156, 1857, 1755, 2092, 1384])
acorn_squash = np.array([20, 43, 99, 200, 12, 250, 58, 120, 230, 215])
pumpkin_avg = np.mean(pumpkin)
acorn_squash_avg = np.mean(acorn_squash)
pumpkin_std = np.std(pumpkin)
acorn_squash_std = np.std(acorn_squash)
std_difference = pumpkin_std - acorn_squash_std
print(std_difference)
winner = lambda std_difference: "Pumpkin" if std_difference >= 0 else "Acorn Squash"
print(winner)

The lambda works like a function, so you can rewrite it as a regular function. This:

winner = lambda std_difference: "Pumpkin" if std_difference >= 0 else "Acorn Squash"

is the same as…

def winner(std_difference):
    if std_difference >= 0:
        return 'Pumpkin'
    else:
        return 'Acorn Squash'

Now, pretend that you are calling the function above. How would you call it? Is any parameter required in the call?

Hope it helps!

Thanks for your response, you really have me thinking.
Okay, so I can see how a function would be more applicable to this. But again, when I put this into the terminal (I put the function near the top after importing numpy) I still get this weird output to the terminal. “<function winner at 0x7f3f1cd6d950>”

Also, if I use a function, I shouldn’t need to calculate the difference in another variable, I could just use the pumpkin_std and acorn_squash_std, right?

Thanks for your help, I really appreciate you tolerating an insufferable noob.
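On the question above about skipping the intermediate difference variable: here is a minimal sketch of a function that takes both standard deviations directly. The name pick_winner and its parameter names are hypothetical, chosen only for illustration, and are not part of the exercise.

```python
import numpy as np

pumpkin = np.array([68, 1820, 1420, 2062, 704, 1156, 1857, 1755, 2092, 1384])
acorn_squash = np.array([20, 43, 99, 200, 12, 250, 58, 120, 230, 215])

# pick_winner is a hypothetical helper, not something the exercise defines.
def pick_winner(first_std, second_std):
    """Return a label for whichever standard deviation is larger."""
    return "Pumpkin" if first_std >= second_std else "Acorn Squash"

# The function must be called with arguments; its return value is the label.
label = pick_winner(np.std(pumpkin), np.std(acorn_squash))
print(label)
```

Passing the two values as arguments means no separate std_difference variable is needed, but the function still has to be called for the comparison to run.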

The function and the lambda are exactly the same thing in this case, so both should work fine.
The only problem is the way you are using it in print( ).

print(winner) prints the function object itself rather than a result, because winner is the function, not the value it returns. To get a result, the function has to be called, and the call needs an argument.

So, it should be like:
print( winner(something here) )

If you are still in doubt, I can let you know what to put there. Just let me know!

Maybe what is confusing you is that you are using "std_difference" as the parameter name in the lambda expression, and thinking that it holds the actual std_difference that you calculated previously.

winner = lambda std_difference: "Pumpkin" if std_difference >= 0 else "Acorn Squash"
is the same as…
winner = lambda x: "Pumpkin" if x >= 0 else "Acorn Squash"
is the same as…
winner = lambda abc: "Pumpkin" if abc >= 0 else "Acorn Squash"

So, looking at the expression like this, you might understand why you need to pass an argument to "winner( )", and what that argument would be.
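To make the difference between referring to a function and calling it concrete, here is a small sketch using the arrays from the post above. The parameter name diff is arbitrary, which is the point of the renaming examples.

```python
import numpy as np

pumpkin = np.array([68, 1820, 1420, 2062, 704, 1156, 1857, 1755, 2092, 1384])
acorn_squash = np.array([20, 43, 99, 200, 12, 250, 58, 120, 230, 215])

std_difference = np.std(pumpkin) - np.std(acorn_squash)

# "diff" is only a placeholder; it gets a value when the lambda is called.
winner = lambda diff: "Pumpkin" if diff >= 0 else "Acorn Squash"

# Without parentheses, this prints the function object (a "<function ...>" line).
print(winner)

# With an argument, the lambda runs and returns one of the two strings.
result = winner(std_difference)
print(result)
```

The first print shows the function object, which is the odd terminal output described earlier in the thread; the second shows the string the lambda returns.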

The test here should be redone.

Determine the squash dataset that has the greater standard deviation and save it to the variable winner.

winner = pumpkin_std if pumpkin_std > acorn_squash_std else acorn_squash_std

Fails, which is the same as doing this:

winner = pumpkin
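For anyone comparing the two assignments above, here is a short sketch of what each one stores. The names winner_value and winner_array are hypothetical, used only to keep the two results side by side. What the exercise's checker actually expects is not stated in this thread.

```python
import numpy as np

pumpkin = np.array([68, 1820, 1420, 2062, 704, 1156, 1857, 1755, 2092, 1384])
acorn_squash = np.array([20, 43, 99, 200, 12, 250, 58, 120, 230, 215])

pumpkin_std = np.std(pumpkin)
acorn_squash_std = np.std(acorn_squash)

# The conditional expression stores the larger standard deviation (one number).
winner_value = pumpkin_std if pumpkin_std > acorn_squash_std else acorn_squash_std

# Assigning the array stores the whole dataset instead.
winner_array = pumpkin

print(type(winner_value))
print(type(winner_array))
```

Both point at the pumpkin data, but the first holds a single number and the second holds the full array, so a checker that compares against one form may reject the other.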