 # A/B Testing for ShoeFly.com (Task #06)

Hi, I have a question about this exercise, which is in the pandas section of the Data Science track.

My question is about Task 6: Create a new column in `clicks_pivot` called `percent_clicked` which is equal to the percent of users who clicked on the ad from each `utm_source`.

This is my original attempt; the `percent_clicked` column showed an incorrect result:
(1st method)

``````
percent = lambda row: row[True] / (row[False] + row[True])
clicks_pivot["percent_clicked"] = clicks_pivot.apply(percent, axis=1)
``````

An earlier answer states that the lambda function returns an integer, so one should multiply the expression by 100, but that doesn’t work.

The hint in the question is as below:
(2nd method)

``````
clicks_pivot['percent_clicked'] = \
    clicks_pivot[True] / \
    (clicks_pivot[True] +
     clicks_pivot[False])
``````

Although the second method is intuitive, I am still curious why the first method doesn’t show the correct result.

Thanks,
Frank

Seems like some kookiness involving Python 2: dividing two `pandas.Series` behaves differently from dividing two integers.

In Python 2, dividing two integers with the `/` operator performs floor division and produces another integer. You can check this quickly using the numbers from the first row if you like:

``````
print(80 / (175 + 80))
# or get the type:
print(type(80 / (175 + 80)))
``````

This would explain why you get `0` as the output. If you want it to work, make sure a float is involved somewhere. Since you mention percentages (even if you don’t use them here), multiplying by the float `100.0` should work, but make sure you do that before the two integers are divided (either of the two operands can be a float to get a float output).
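
A minimal sketch of that fix applied to the first method. The `clicks_pivot` frame here is a hypothetical stand-in: only the 80 and 175 counts come from the first-row calculation above, while the second row and the `utm_source` labels are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for clicks_pivot. Only the 80/175 row is taken from
# the thread; the other values and the utm_source labels are invented.
clicks_pivot = pd.DataFrame({
    "utm_source": ["email", "facebook"],
    False: [175, 180],
    True: [80, 100],
})

# Multiply by the float 100.0 *before* dividing, so the division always
# involves a float, even where `/` floors integers (Python 2).
percent = lambda row: 100.0 * row[True] / (row[False] + row[True])
clicks_pivot["percent_clicked"] = clicks_pivot.apply(percent, axis=1)

print(clicks_pivot)
```

With these numbers the first row comes out at roughly 31.4 percent rather than `0`. On Python 3 the original lambda already returns a fraction, because `/` is true division there.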

The quirkiness pops up in the division of two pandas Series, which seems to produce a float64 result instead. Try, for example:

``````
print(clicks_pivot[True] / clicks_pivot[True])
``````

which would show output like:

``````
1.0
1.0
1.0
1.0
dtype: float64
``````

I don’t think that behaviour exists when dividing two NumPy arrays (I didn’t use pandas back when I had to use Python 2), so it’s quite unexpected. There may be some versioning history that covers it in better detail.

Pandas compared to NumPy (versions that support Python 2):

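``````
import numpy as np
import pandas as pd

# pandas: dividing two integer Series gives floats
x = pd.Series([1, 1])
y = pd.Series([1, 1])
print(x / y)
# 0    1.0
# 1    1.0
# dtype: float64

# numpy under Python 2: dividing two integer arrays keeps integers
x = np.array([1, 1])
y = np.array([1, 1])
print(x / y)
# [1 1]
print((x / y).dtype)
# int64
``````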
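
If you want the result not to depend on the Python version at all, one option is to name the kind of division explicitly. The sketch below is not from the exercise; it uses NumPy's `np.floor_divide` and `np.true_divide` to reproduce both behaviours side by side, with the same toy arrays as above.

```python
import numpy as np
import pandas as pd

x = np.array([1, 1])
y = np.array([1, 1])

# Floor division keeps an integer dtype, which is what `/` did for
# integer arrays under Python 2.
floor_result = np.floor_divide(x, y)

# True division always produces floats, whichever Python version runs it.
true_result = np.true_divide(x, y)

# Dividing two integer Series gives float64, matching the pandas output above.
series_result = pd.Series(x) / pd.Series(y)

print(floor_result.dtype)
print(true_result.dtype)
print(series_result.dtype)
```

On Python 2, putting `from __future__ import division` at the top of a script has a similar effect for the plain `/` operator.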