Isnull() in pandas and how to evaluate against Null type values

Hi guys,

I’ve come across some slight confusion in the “aggregates in pandas” project.

The task asks me to create a new column based on an evaluation of an existing column: the new column should display True if the existing column contains data and False if it’s NaN.

The project suggests the following method:

ad_clicks['is_click'] = ~ad_clicks\
   .ad_click_timestamp.isnull()

Just because I’m awkward, I wanted to code this as a lambda instead:

ad_clicks['is_click'] = ad_clicks.ad_click_timestamp.apply(lambda x: False if x.isnull() else True)

However, this just throws an error about a string not having the isnull attribute.

I’m mighty confused by this. Firstly, what is isnull() evaluating against? In JavaScript, NaN is separate from null, but both would be “falsy”. How do null values work in Python (and does it have a concept of “falsiness”)? And any idea how I would write a lambda function to evaluate this while creating a new column?

I will pose the question: why make it more complicated than it needs to be?
The ~ is a NOT operator…so, you don’t need a lambda function here.

https://www.geeksforgeeks.org/working-with-missing-data-in-pandas/

.isnull() and .notnull() check for missing values in pandas (NaN, and also None).

You could use it to check for nulls in your df (I do this when I first start working with data):

df.isnull()

Or, df.isnull().sum() will give you the number of nulls in each column.

Or, df.isnull().sum().sum() will give you the total number of nulls in the df.
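To make those three calls concrete, here is a minimal sketch on a made-up frame. The column names and values are assumptions for illustration only, not the project’s real data:

```python
import numpy as np
import pandas as pd

# Hypothetical data: two of the four timestamps are missing,
# one stored as NaN and one as None.
df = pd.DataFrame({
    "user_id": ["a", "b", "c", "d"],
    "ad_click_timestamp": ["7:18", np.nan, None, "12:25"],
})

per_cell = df.isnull()                  # DataFrame of True/False, same shape as df
per_column = df.isnull().sum()          # Series: count of missing values per column
total = int(df.isnull().sum().sum())    # single number for the whole frame

print(per_cell)
print(per_column)
print(total)
```

Both the NaN and the None should be counted as missing here, so the timestamp column reports two nulls and user_id reports none.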

I get it, my solution is most definitely overcomplicated. I just don’t understand why it does not work. In my function I would expect this:

False if x.isnull()

to evaluate to False if a value is NaN: isnull() would return True, meaning my expression would return False. But it just throws an error?

Does isnull() just check for NaN? Can you have a dataframe with literal null values (i.e. nothing, rather than NaN)?

The error message that you’d get is:
ad_clicks.ad_click_timestamp.apply(lambda x: False if x.isnull() else True)
AttributeError: 'str' object has no attribute 'isnull'
.apply() hands each individual cell value (here a plain Python string) to your lambda, so you’re calling .isnull() on a single value rather than on a series or the df, and only those have that method.
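If you do want the lambda version, one way that should work is to swap the method call for the top-level function pd.notnull(), which accepts a single scalar. This is only a sketch: the sample timestamps are invented, and the is_click_lambda column name is hypothetical, used here so both results can be compared side by side.

```python
import numpy as np
import pandas as pd

# Made-up sample standing in for the project's ad_clicks data.
ad_clicks = pd.DataFrame({
    "ad_click_timestamp": ["7:18", np.nan, "12:25", np.nan],
})

# Vectorised version suggested by the project: call isnull() on the
# whole Series, then flip it with ~.
ad_clicks["is_click"] = ~ad_clicks.ad_click_timestamp.isnull()

# Lambda version: each x is a single cell value (a str or NaN), so use
# the function pd.notnull(x) instead of the Series method x.isnull().
ad_clicks["is_click_lambda"] = ad_clicks.ad_click_timestamp.apply(
    lambda x: pd.notnull(x)
)

print(ad_clicks)
```

The two columns should come out identical. The vectorised form is still the simpler choice, but this shows why the original lambda failed: the value inside the lambda is no longer a pandas object.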

Check out this thread:


And, yes, you can have a df with nothing in a cell or row. In that case, if you want to fill it with something, you can use df.fillna().

See: https://www.geeksforgeeks.org/python-pandas-dataframe-fillna-to-replace-null-values-in-dataframe/
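As a quick sketch of that, here is a Series holding both a None and a NaN. The values and the "no click" filler string are assumptions chosen for the example:

```python
import numpy as np
import pandas as pd

# One real value, one None ("nothing") and one NaN.
s = pd.Series(["7:18", None, np.nan], dtype=object)

# isnull() treats None and NaN the same way: both count as missing.
missing = s.isnull().tolist()

# fillna() replaces every missing entry with the given value.
filled = s.fillna("no click")

print(missing)
print(filled.tolist())
```

So from pandas’ point of view there is no practical difference between an empty (None) cell and a NaN cell when checking for or filling missing data.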


@ashleyhowe7359274843,

There is also a nice explanation of Python’s NaN and None, and their interaction with pandas, here if you are interested in understanding more about how that differs from JavaScript:
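On the “falsiness” part of the original question, a small plain-Python sketch (no pandas involved; the variable names are just for illustration) shows where Python differs from JavaScript. None is falsy, but NaN is an ordinary non-zero float and so is truthy:

```python
import math

nan = float("nan")

none_is_truthy = bool(None)      # None is falsy
nan_is_truthy = bool(nan)        # NaN is a non-zero float, so it is truthy
nan_equals_itself = nan == nan   # NaN never compares equal, even to itself
nan_detected = math.isnan(nan)   # the reliable way to test a float for NaN

print(none_is_truthy, nan_is_truthy, nan_equals_itself, nan_detected)
```

That is why a plain truthiness check (or an == comparison) is not a safe way to detect NaN in Python, and why pandas provides isnull() and notnull() for the job.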
