A/B Testing Shoe Fly



import codecademylib
import pandas as pd

ad_clicks = pd.read_csv('ad_clicks.csv')

utm_views = ad_clicks.groupby('utm_source').user_id.count().reset_index()

ad_clicks['is_click'] = ~ad_clicks.ad_click_timestamp.isnull()
# isnull() returns True where there is no click timestamp; ~ inverts that,
# so the new column 'is_click' is True when the ad was clicked and False otherwise

clicks_by_source = ad_clicks.groupby(['utm_source', 'is_click']).user_id.count().reset_index()


clicks_pivot = clicks_by_source.pivot(
  columns = 'is_click',
  index = 'utm_source',
  values = 'user_id'
)

clicks_pivot['percent_clicked'] = clicks_pivot[True] / (clicks_pivot[True] + clicks_pivot[False])  # <--- QUESTION HERE


For Question 6

How come we use clicks_pivot[True] instead of clicks_pivot.True?

clicks_pivot['percent_clicked'] = clicks_pivot[True] / (clicks_pivot[True] + clicks_pivot[False])

Presumably, clicks_pivot['percent_clicked'] refers to a column by its name, the string 'percent_clicked'.
True and False are Boolean values, not strings, so they can't be written after a dot the way a name like percent_clicked can. Can you follow from here to see the difference?

Oh I see, so for Boolean values we need [ ], and for column names we can use either .column_name or ['column_name'].

Is that correct?

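Yes, that is correct. One caveat: .column_name only works when the name is a valid Python identifier and doesn't clash with an existing DataFrame attribute or method, while ['column_name'] always works.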
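To make the difference concrete, here is a minimal sketch. The toy data below is made up to stand in for ad_clicks.csv (the real file isn't shown in this thread), and the helper names `same` and `dot_true_parses` are only for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for ad_clicks.csv (an assumption, not the real file)
ad_clicks = pd.DataFrame({
    'user_id': ['a', 'b', 'c', 'd', 'e', 'f'],
    'utm_source': ['google', 'google', 'google', 'email', 'email', 'email'],
    'ad_click_timestamp': ['7:18', None, None, '3:05', '9:40', None],
})
ad_clicks['is_click'] = ~ad_clicks.ad_click_timestamp.isnull()

clicks_pivot = (
    ad_clicks.groupby(['utm_source', 'is_click']).user_id.count()
    .reset_index()
    .pivot(columns='is_click', index='utm_source', values='user_id')
)

# The column labels are the Booleans True and False, not the strings 'True'/'False'
print(list(clicks_pivot.columns))

# Bracket access accepts any label, including Booleans
clicks_pivot['percent_clicked'] = clicks_pivot[True] / (clicks_pivot[True] + clicks_pivot[False])

# Dot access works for a string label that is a valid identifier
same = clicks_pivot.percent_clicked.equals(clicks_pivot['percent_clicked'])

# clicks_pivot.True never reaches pandas: True is a reserved keyword,
# so Python itself refuses to parse the expression
try:
    compile('clicks_pivot.True', '<example>', 'eval')
    dot_true_parses = True
except SyntaxError:
    dot_true_parses = False
```

So the error with clicks_pivot.True comes from Python's syntax rules rather than from pandas, which is why the brackets are needed for these two columns.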
