I have a question regarding question number 8.
What percentage of users proceeded to checkout, but did not purchase a t-shirt?
I took a different approach to this question than the walkthrough video did, and I thought it made sense until I realized that my answer differs from the one in the video. I got 0.147058823529; the video's answer is 0.16889632107. Can anyone look through my code and let me know why I got a different answer? I looked through the code a couple of times and couldn't figure out which part might have caused the difference.
I based the calculation on the merged table of all four DataFrames, but the video's solution merges only the Checkout and Purchase DataFrames and proceeds with the calculations from there, just like in questions 5 and 6.
Here is my code:
all_data = visits.merge(cart, how='left').merge(checkout, how='left').merge(purchase, how='left')
# Here I am trying to find how many users reached the checkout phase.
checkout_count = len(all_data[all_data.checkout_time.notnull()])
# Then I find how many people reached the checkout phase but did not make a purchase, by combining notnull() and isnull() with the logical operator '&'.
purchase_null = len(all_data[all_data.checkout_time.notnull() & all_data.purchase_time.isnull()])
Code from the video:
checkout_purchase = pd.merge(checkout, purchase, how='left')
checkout_time_rows = len(checkout_purchase)
purchase_null = len(checkout_purchase[checkout_purchase.purchase_time.isnull()])
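To show how the two approaches can diverge, here is a minimal sketch on made-up toy tables (the rows and times are hypothetical, not the project data). It assumes the tables share only the user_id column and that one user appears twice in visits. Whether the real data contains such repeats is an assumption that would need checking.

```python
import pandas as pd

# Toy data (hypothetical): user "A" has two rows in visits.
visits = pd.DataFrame({"user_id": ["A", "A", "B", "C"],
                       "visit_time": ["09:00", "09:30", "10:00", "11:00"]})
cart = pd.DataFrame({"user_id": ["A", "B", "C"],
                     "cart_time": ["09:40", "10:05", "11:10"]})
checkout = pd.DataFrame({"user_id": ["A", "B", "C"],
                         "checkout_time": ["09:45", "10:10", "11:15"]})
purchase = pd.DataFrame({"user_id": ["A", "C"],
                         "purchase_time": ["09:50", "11:20"]})

# Approach 1: chain all four tables; the repeated visit row for "A"
# is carried through every later merge.
all_data = (visits.merge(cart, how="left")
                  .merge(checkout, how="left")
                  .merge(purchase, how="left"))
reached = all_data[all_data.checkout_time.notnull()]
ratio_all = reached.purchase_time.isnull().sum() / len(reached)

# Approach 2: merge only checkout and purchase, as in the video.
checkout_purchase = pd.merge(checkout, purchase, how="left")
ratio_pair = (checkout_purchase.purchase_time.isnull().sum()
              / len(checkout_purchase))

print(len(reached), len(checkout_purchase))  # row counts differ
print(ratio_all, ratio_pair)                 # so the ratios differ too
```

If the row counts of the two merged tables differ on the real data as well, repeated rows in visits or cart would be one possible explanation for the different percentages.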
Thank you in advance!