A/B Testing for ShoeFly.com

I’m working on the A/B Testing for ShoeFly.com project in the Data Engineer path and I have a question. First, here’s the page: A/B Testing for ShoeFly.com

It’s about the 6th question:

Create a new column in clicks_pivot called percent_clicked which is equal to the percent of users who clicked on the ad from each utm_source.

Was there a difference in click rates for each source?

And here’s my answer:

clicks_pivot["percent_clicked"] = clicks_pivot.apply(lambda row: row["True"] / (row["True"] + row["False"]), axis = 1)

Can someone show me where I went wrong?


What was the result of your code for that question? Did it create a column with the % of people who clicked on an ad?

Thanks for your reply! I got this intimidating error message:

KeyError Traceback (most recent call last)
File /usr/local/lib/python3.8/dist-packages/pandas/core/indexes/base.py:3803, in Index.get_loc(self, key, method, tolerance)
3802 try:
-> 3803 return self._engine.get_loc(casted_key)
3804 except KeyError as err:

File /usr/local/lib/python3.8/dist-packages/pandas/_libs/index.pyx:138, in pandas._libs.index.IndexEngine.get_loc()

File /usr/local/lib/python3.8/dist-packages/pandas/_libs/index.pyx:165, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:5745, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:5753, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'True'

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
Input In [12], in <cell line: 1>()
----> 1 clicks_pivot["percent_clicked"] = clicks_pivot.apply(lambda row: row["True"] / (row["True"] + row["False"]), axis = 1)

File /usr/local/lib/python3.8/dist-packages/pandas/core/frame.py:9555, in DataFrame.apply(self, func, axis, raw, result_type, args, **kwargs)
9544 from pandas.core.apply import frame_apply
9546 op = frame_apply(
9547 self,
9548 func=func,
9553 kwargs=kwargs,
9554 )
-> 9555 return op.apply().finalize(self, method="apply")

File /usr/local/lib/python3.8/dist-packages/pandas/core/apply.py:746, in FrameApply.apply(self)
743 elif self.raw:
744 return self.apply_raw()
--> 746 return self.apply_standard()

File /usr/local/lib/python3.8/dist-packages/pandas/core/apply.py:873, in FrameApply.apply_standard(self)
872 def apply_standard(self):
--> 873 results, res_index = self.apply_series_generator()
875 # wrap results
876 return self.wrap_results(results, res_index)

File /usr/local/lib/python3.8/dist-packages/pandas/core/apply.py:889, in FrameApply.apply_series_generator(self)
886 with option_context("mode.chained_assignment", None):
887 for i, v in enumerate(series_gen):
888 # ignore SettingWithCopy here in case the user mutates
--> 889 results[i] = self.f(v)
890 if isinstance(results[i], ABCSeries):
891 # If we have a view on v, we need to make a copy because
892 # series_generator will swap out the underlying data
893 results[i] = results[i].copy(deep=False)

Input In [12], in <lambda>(row)
----> 1 clicks_pivot["percent_clicked"] = clicks_pivot.apply(lambda row: row["True"] / (row["True"] + row["False"]), axis = 1)

File /usr/local/lib/python3.8/dist-packages/pandas/core/series.py:981, in Series.__getitem__(self, key)
978 return self._values[key]
980 elif key_is_scalar:
--> 981 return self._get_value(key)
983 if is_hashable(key):
984 # Otherwise index.get_value will raise InvalidIndexError
985 try:
986 # For labels that don’t resolve as scalars like tuples and frozensets

File /usr/local/lib/python3.8/dist-packages/pandas/core/series.py:1089, in Series._get_value(self, label, takeable)
1086 return self._values[label]
1088 # Similar to Index.get_value, but we do not fall back to positional
-> 1089 loc = self.index.get_loc(label)
1090 return self.index._get_values_for_loc(self, loc, label)

File /usr/local/lib/python3.8/dist-packages/pandas/core/indexes/base.py:3805, in Index.get_loc(self, key, method, tolerance)
3803 return self._engine.get_loc(casted_key)
3804 except KeyError as err:
-> 3805 raise KeyError(key) from err
3806 except TypeError:
3807 # If we have a listlike key, _check_indexing_error will raise
3808 # InvalidIndexError. Otherwise we fall through and re-raise
3809 # the TypeError.
3810 self._check_indexing_error(key)

KeyError: 'True'

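Is True a string or a boolean? You’re getting that error because the string "True" isn’t a key in clicks_pivot. Judging by the traceback, the column labels are probably the booleans True and False (no quotes), not strings.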
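One quick way to check is to print the column labels and their types. This is only a sketch: the small `pivot` frame below is a made-up stand-in for clicks_pivot, since the real data isn't shown in this thread.

```python
import pandas as pd

# Hypothetical stand-in for clicks_pivot: the click columns are labelled
# with the booleans False and True, not the strings "False" and "True".
pivot = pd.DataFrame({"utm_source": ["email"], False: [2], True: [1]})

# Print each label with its type so a bool is easy to tell from a str.
for label in pivot.columns:
    print(repr(label), type(label).__name__)

# Membership checks show which key a lookup would actually find.
has_bool_key = True in pivot.columns
has_str_key = "True" in pivot.columns
print("boolean True is a column:", has_bool_key)
print("string 'True' is a column:", has_str_key)
```

If the labels print without quotes and with type bool, then row["True"] is looking up a key that doesn't exist, which matches the KeyError above.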

So, I see what you’re trying to do here with the lambda function, but you don’t really need it.

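You can derive the percentage just by doing some math on the True and False columns (note: no quotes around True and False) and assigning the result to the new column:

clicks_pivot["percent_clicked"] = clicks_pivot[True] / (clicks_pivot[True] + clicks_pivot[False]) * 100.0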
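Here is a rough end-to-end sketch of that calculation on made-up data. The rows and the column names user_id, ad_click_timestamp, and is_click are assumptions for illustration (only utm_source, clicks_pivot, and percent_clicked appear in this thread), so adjust them to match your notebook.

```python
import pandas as pd

# Made-up sample data; the real ad clicks file is not shown in the thread.
ad_clicks = pd.DataFrame({
    "user_id": ["a", "b", "c", "d", "e", "f"],
    "utm_source": ["email", "email", "email", "google", "google", "google"],
    "ad_click_timestamp": ["7:18", None, None, "9:02", "10:15", None],
})

# A user clicked if a timestamp is present; this yields real booleans.
ad_clicks["is_click"] = ad_clicks["ad_click_timestamp"].notna()

# Count users per (source, clicked) pair, then pivot so the click flag
# becomes the column labels False and True (booleans, not strings).
clicks_by_source = (
    ad_clicks.groupby(["utm_source", "is_click"])["user_id"]
    .count()
    .reset_index()
)
clicks_pivot = clicks_by_source.pivot(
    index="utm_source", columns="is_click", values="user_id"
).reset_index()

# Vectorized column math with boolean keys; no lambda or apply needed.
clicks_pivot["percent_clicked"] = (
    clicks_pivot[True] / (clicks_pivot[True] + clicks_pivot[False]) * 100.0
)
print(clicks_pivot)
```

The whole-column version also tends to be faster than apply with axis=1, because pandas does the arithmetic on entire columns at once instead of calling a Python function per row.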