How to create a new column in a DataFrame using a lambda function

Hi,

I am learning how to use aggregates in pandas. In a project, I am asked to create a new column in a dataframe based on values from another column. The instruction reads like this:

If the column ad_click_timestamp is not null, then someone actually clicked on the ad that was displayed. Create a new column called is_click, which is True if ad_click_timestamp is not null and False otherwise.

I was under the impression that I could solve this with a lambda function:
ad_clicks['is_click'] = ad_clicks['ad_click_timestamp'].apply(lambda x: x.notnull())

However, this evaluates to an error: AttributeError: 'str' object has no attribute 'notnull'

I then tried:
ad_clicks['is_click'] = ad_clicks['ad_click_timestamp'].apply(lambda x: True if x != 'nan' else False)

However, in this case all values of my newly created column evaluate to True.

How can I use a lambda function to solve this exercise?

Thanks, Michael.

This is for the Shoefly project, correct?

You actually don’t need a lambda function to add a new column. You need that column to evaluate to either True or False. That can be accomplished by using ~ (a tilde).

ad_clicks['is_click'] = ~ad_clicks.ad_click_timestamp.isnull()

print(ad_clicks)

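isnull() returns True for every value that is null, and the tilde inverts that result, so is_click is True wherever ad_click_timestamp is not null.

The tilde is Python's bitwise NOT operator. Applied to a boolean pandas Series, it performs an element-wise logical negation: every True becomes False and every False becomes True.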
https://izziswift.com/the-tilde-operator-in-python/
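
To see the negation at work, here is a minimal sketch. The small DataFrame is made up for illustration and only stands in for the project's ad_clicks data. It also shows that notnull() should give the same result without the tilde.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the project's ad_clicks data (values are illustrative only).
ad_clicks = pd.DataFrame({
    "user_id": ["a", "b", "c"],
    "ad_click_timestamp": ["7:18", np.nan, "11:04"],
})

# isnull() marks missing values as True; ~ flips every element of the boolean Series.
ad_clicks["is_click"] = ~ad_clicks["ad_click_timestamp"].isnull()

# notnull() is the built-in complement of isnull(), so no tilde is needed here.
is_click_alt = ad_clicks["ad_click_timestamp"].notnull()

print(ad_clicks)
print(is_click_alt.tolist())
```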


Hi, @lisalisaj,

thank you for your reply. In fact, this is the proposed solution; I just wanted to find out how to get there using a lambda function. As some practice, let's say :)

I have spent some more time in the pandas documentation and found this option:

ad_clicks['is_click'] = ad_clicks['ad_click_timestamp'].apply(lambda x: True if pd.notnull(x) else False)

Thank you for your help!

Thanks, Michael.


Ah, okay. Gotcha! :)
You are correct that there’s usually more than one way to accomplish the same task.
(I was just trying to be a little bit more efficient, that’s all. :) )


Very interesting! In your first lambda function, did you find out what the problem was? My guess is that apply passes each value of the column to the lambda one at a time, so x was just a plain string, and a string doesn’t have a notnull() method. In your solution you called the pandas.notnull function instead, which can take a scalar.
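
A small sketch of that guess, using a made-up column in place of ad_clicks['ad_click_timestamp'] (the name timestamps and its values are mine, not from the project):

```python
import numpy as np
import pandas as pd

# Made-up column standing in for ad_clicks['ad_click_timestamp'].
timestamps = pd.Series(["7:18", np.nan, "11:04"], dtype=object)

# apply() hands the lambda one element at a time, so x is a plain str or float here.
element_types = timestamps.apply(lambda x: type(x).__name__).tolist()

# A plain str has no notnull() method, which matches the AttributeError above.
str_has_notnull = hasattr("7:18", "notnull")

# pd.notnull() is a top-level function that accepts scalars as well as Series.
is_click = timestamps.apply(lambda x: pd.notnull(x))

print(element_types)
print(str_has_notnull)
print(is_click.tolist())
```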

In your second lambda function, I’m guessing the problem was that NaN is a float, which is a different data type from the string 'nan', so x would never be equal to 'nan' and the condition x != 'nan' was True for every row.
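
Here is a quick way to check that, assuming the missing timestamps are stored as float NaN (which is what pandas usually does for empty cells); the variable name missing is just for illustration:

```python
import numpy as np
import pandas as pd

# Assumed representation of a missing timestamp: a float NaN.
missing = np.nan

# NaN is a float, not a string.
missing_type = type(missing).__name__

# A float never equals the string 'nan', so this comparison is True for every value.
differs_from_string = missing != "nan"

# NaN does not even equal itself, so comparing against np.nan would not help either.
differs_from_itself = missing != missing

# pd.isnull() / pd.notnull() test a scalar for missingness reliably.
detected_by_pandas = pd.isnull(missing)

print(missing_type, differs_from_string, differs_from_itself, detected_by_pandas)
```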

I’ve got another question which might be similar. In Step 6 we have to create a new column showing the percentage of users who clicked. I understand how the solution in the Hint is better, but can anyone help with why this doesn’t work?

clicks_pivot['percent_clicked'] = clicks_pivot.apply(lambda row: row[True] / (row[True] + row[False]), axis = 1)

I just get zero values for the whole column.