A/B Testing Shoe Fly

https://www.codecademy.com/paths/data-science/tracks/data-processing-pandas/modules/dspath-agg-pandas/projects/pandas-shoefly-ab-test

Hi,

import codecademylib
import pandas as pd

ad_clicks = pd.read_csv('ad_clicks.csv')
print(ad_clicks.head(10))

utm_views = ad_clicks.groupby('utm_source').user_id.count().reset_index()
print(utm_views)

ad_clicks['is_click'] = ~ad_clicks.ad_click_timestamp.isnull()
# isnull() is True where no click timestamp was recorded; ~ inverts that,
# so the new 'is_click' column is True only for rows that have a click.

clicks_by_source = ad_clicks.groupby(['utm_source', 'is_click']).user_id.count().reset_index()

print(clicks_by_source)

clicks_pivot = clicks_by_source.pivot(
  columns = 'is_click',
  index = 'utm_source',
  values = 'user_id'
).reset_index()

print(clicks_pivot)

clicks_pivot['percent_clicked'] = clicks_pivot[True] / (clicks_pivot[True] + clicks_pivot[False])  # <--- QUESTION HERE

print(clicks_pivot)
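
As a side note on the is_click line, here is a minimal sketch of what the ~ inversion does. The timestamps below are made up for illustration and are not taken from ad_clicks.csv.

```python
import pandas as pd

# Hypothetical timestamps; None marks a user who never clicked.
timestamps = pd.Series(['7:18', None, '11:04', None])

is_null = timestamps.isnull()    # True where the timestamp is missing
is_click = ~timestamps.isnull()  # ~ flips every Boolean: True where a timestamp exists

# notnull() gives the same result without the explicit inversion.
matches_notnull = is_click.equals(timestamps.notnull())

print(is_null.tolist())
print(is_click.tolist())
print(matches_notnull)
```

So ~ad_clicks.ad_click_timestamp.isnull() could presumably also be written as ad_clicks.ad_click_timestamp.notnull().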

For Question 6

How come we use clicks_pivot[True] instead of clicks_pivot.True?

clicks_pivot['percent_clicked'] = clicks_pivot[True] / (clicks_pivot[True] + clicks_pivot[False])

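Presumably, clicks_pivot['percent_clicked'] refers to a column name, 'percent_clicked'.
True and False are Boolean values, not strings, so the pivoted columns they label can only be selected with brackets. clicks_pivot.True is also a syntax error, because True is a reserved word in Python. Can you follow from here to see the difference?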

Oh I see, so for Boolean values we need brackets, and for string column names we can use either .column_name or ['column_name'].

Is that correct?

Yes, that is correct.
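
To make the difference concrete, here is a small sketch that can be run outside the exercise. The toy clicks_by_source table and its counts are made up for illustration; only the pivot and percent_clicked lines mirror the code from the question.

```python
import pandas as pd

# Hypothetical stand-in for clicks_by_source; the counts are invented.
clicks_by_source = pd.DataFrame({
    'utm_source': ['email', 'email', 'google', 'google'],
    'is_click': [False, True, False, True],
    'user_id': [6, 4, 5, 5],
})

clicks_pivot = clicks_by_source.pivot(
    columns='is_click',
    index='utm_source',
    values='user_id'
).reset_index()

# The column labels are the string 'utm_source' plus the Booleans False and True.
print(list(clicks_pivot.columns))

# Bracket access works for any label, including Booleans.
clicks_pivot['percent_clicked'] = clicks_pivot[True] / (clicks_pivot[True] + clicks_pivot[False])

# Attribute access works only for string labels that are valid Python identifiers.
same_column = clicks_pivot.utm_source.equals(clicks_pivot['utm_source'])

# clicks_pivot.True does not even parse, because True is a reserved keyword.
try:
    compile('clicks_pivot.True', '<example>', 'eval')
    attribute_form_parses = True
except SyntaxError:
    attribute_form_parses = False

print(clicks_pivot)
print(same_column, attribute_form_parses)
```

Bracket access is the safer habit in general, since it also covers labels with spaces or labels that clash with DataFrame method names.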
