Hello everybody,
I merged all four steps of the funnel, in order, using a series of left merges, and saved the results to the variable all_data, as the project asked.
If you inspect the tables visits and all_data using .info(), you get the following information:
Could you please format your code and add a link to the relevant lesson/project? If you cannot edit your first response, then include the details as a reply, following the guidance given at the link below.
That is a good question. This is the count that I came up with …
2000 number of visits to site
348 number of users who added to cart
360 number of users who went to check-out
252 number of users who completed purchase
all_data has more rows than visits because a user can make more than one purchase, in which case that user_id and its corresponding data appear more than once in all_data. Notice how some users in all_data appear more than once: for instance, the user identified by the id 21dec5fa-999a-45c5-b59b-18a1ee161379 made 20 purchases. You can see this with the groupby method.
all_data.groupby("user_id").count()
This returns another DataFrame with one row per user (so it will have 2000 rows). Each of the other columns counts the non-null entries for that user, which reflects how many times they entered the cart, went to checkout, or made a purchase.
Also note that the number of unique user_id values in all_data is still 2000, which is consistent with visits.
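To make the effect concrete, here is a small sketch with made-up data (the two-user tables and the time values are hypothetical, not taken from the project). It shows how repeated user_id values in the later tables increase the row count of a left merge while the number of unique users stays the same.

```python
import pandas as pd

# Hypothetical toy data: user "a" appears twice in checkout and twice in purchase.
visits = pd.DataFrame({"user_id": ["a", "b"]})
checkout = pd.DataFrame({"user_id": ["a", "a"],
                         "checkout_time": ["10:05", "10:20"]})
purchase = pd.DataFrame({"user_id": ["a", "a"],
                         "purchase_time": ["10:06", "10:21"]})

merged = (
    visits
    .merge(checkout, how="left", on="user_id")
    .merge(purchase, how="left", on="user_id")
)

# Every checkout row for "a" pairs with every purchase row for "a",
# so "a" contributes 2 * 2 = 4 rows; "b" contributes 1 row of NaNs.
row_count = len(merged)
unique_users = merged["user_id"].nunique()

# size() counts rows per user; count() counts non-null values per column.
rows_per_user = merged.groupby("user_id").size()
non_null_per_user = merged.groupby("user_id").count()

print(row_count, unique_users)
print(rows_per_user)
print(non_null_per_user)
```

Here merged has 5 rows but only 2 unique users, which mirrors the situation described above on a much smaller scale.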