Codecademy Forums

Pandas percentage calculation

Hi there,

I am doing the final project on Pandas and got stuck on Q5. Even ignoring the float function I used here, I still got an error saying the dataframe can’t be converted.

The question is to calculate the percentage of people who visited the website but ended up not placing a T-shirt in their basket.


Thanks,
Jane

Your screenshot shows use of a method named isnull. From its name, I would guess that it returns a boolean. Should you really be converting that to a float?

float(False)  # usually not meaningful

But since it’s pandas and since the variable that the method is called on has a plural name, maybe it’s even something with a shape similar to:

[True, False, False, True, False]  # probably no point converting this to float

And if you read your error message, it seems to be saying something similar.
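To make that concrete, here is a minimal sketch of what probably happened. I'm guessing at the data (the Series below is made up, not taken from the screenshot), but calling float on a Series with more than one element raises a TypeError:

```python
import pandas as pd

# Made-up data standing in for the column in the screenshot
flags = pd.Series([5, None, 3]).isnull()  # boolean Series: False, True, False

caught = None
try:
    float(flags)  # a multi-element Series has no single float value
except TypeError as err:
    caught = err  # keep the exception so it can be inspected afterwards
    print(err)
```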

Maybe you mean to count how many are or are not null; that would be a number.


Passing values to str before passing them into str.format seems redundant. str.format already converts to string… that’s what it does. Or part of what it does, anyway… Oh, that’s from pandas’ own code. So weird.


Thanks so much for your prompt response!
Yes! I was trying to figure out a function to count how many are/are not null. Is the correct way actually dataframe.column1.isnull().count()?

Would appreciate any sort of inspiration.
Thanks,
Jane


Count does something else:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.count.html
It counts non-NA cells. So unless what you want coincides with what count looks at (it may consider your nulls to be NaNs, depending on what exactly they are), then no.
You’d need to count how many are true/false.
Since booleans are also integers in Python, you could use sum. Or you could filter and then take the length/size/whatever it happens to be called, or use a loop.
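A small sketch of the difference, using a made-up Series (not your data). The result of isnull() contains only True/False and no missing values, so count() on it just gives the number of entries, while sum() adds up the True values:

```python
import pandas as pd

s = pd.Series([5, None, 3])   # made-up example data
mask = s.isnull()             # False, True, False

n_entries = int(mask.count())  # counts non-NA cells in the mask, i.e. every entry
n_null = int(mask.sum())       # True counts as 1, so this is the number of nulls
n_not_null = int(s.count())    # count() on the original Series skips the null

print(n_entries, n_null, n_not_null)
```

So isnull().count() would always give you the total number of rows, whatever is in them.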


Thanks.

I somehow sorted it!

>>> from pandas import Series
>>> # how many are null?
>>> Series([5, None, 3]).isnull().sum()
1
>>> # or with filter and then get the size of what's left
>>> a = Series([5, None, 3])
>>> a[a.isnull()].size
1
>>> # how many entries?
>>> Series([5, None, 3]).size
3

Count makes me a bit nervous since it’s got to do with dealing with missing data. Do you need its ability to discard null values, or are you just after the size?
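For the percentage in the original question, one way to put the pieces together might look like the sketch below. The dataframe and the column names (user_id, cart_time) are invented for illustration; I'm assuming a null cart_time means the visitor never put a T-shirt in the basket.

```python
import pandas as pd

# Invented data: one row per visitor, cart_time is null if nothing was added
visits_cart = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "cart_time": ["10:05", None, None, "11:30"],
})

no_cart = visits_cart["cart_time"].isnull()   # True where nothing was added

# number of nulls divided by number of rows, as a percentage
percent_no_cart = float(100 * no_cart.sum() / len(visits_cart))

print(percent_no_cart)
```

Here two of the four made-up visitors have a null cart_time, so this prints 50.0. Dividing by len(...) rather than by count() keeps the rows with nulls in the denominator.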
