None able to help here??
It seems that this exercise makes the assumption that each user only completes one visit during the entire timeframe of the data recorded in the data base. Is that correct?
I’m interested in knowing how we should adapt the query in case a given user can have browsed the website multiple times.
Can we use WHERE instead of
c.user_id IS NOT NULL AS 'is_checkout'? and why do use this
c.user_id IS NOT NULL AS 'is_checkout' anyway? is there any other way to achieve this?
SELECT DISTINCT b.browse_date, b.user_id, c.user_id IS NOT NULL AS 'is_checkout', p.user_id IS NOT NULL AS 'is_purchase' FROM browse AS 'b' LEFT JOIN checkout AS 'c' ON b.user_id = c.user_id LEFT JOIN purchase AS 'p' ON p.user_id = c.user_id LIMIT 50;
Would also be interested in this - I’ve used CASE but didn’t get any result. Curious to hear why
That’s what I wrote using CASE and got results:
SELECT DISTINCT b.browse_date AS 'Browse Date', b.user_id AS 'User ID', CASE WHEN c.user_id IS NOT NULL THEN 'True' ELSE 'False' END AS 'is_checkout', CASE WHEN p.user_id IS NOT NULL THEN 'True' ELSE 'False' END AS 'is_purchase' FROM browse AS 'b' LEFT JOIN checkout AS 'c' ON b.user_id = c.user_id LEFT JOIN purchase AS 'p' ON c.user_id = p.user_id LIMIT 50;
I’ve used WHERE as well and I get no result. As far as I might understand, the member who used CASE and got result means he/she is able to tell the program to distinguish between customers who did check-out and purchase the product and who did not. Meanwhile, using WHERE only CHOOSE people who did check-out and purchase (value is not null) while ignore who did not (value is null). For example, my code using WHERE did not return result I wanted.
SELECT b.browse_date, b.user_id, c.user_id AS 'is_checkout', p.user_id AS 'is_purchase' FROM browse as 'b' LEFT JOIN checkout as 'c' ON b.user_id = c.user_id LEFT JOIN purchase as 'p' ON c.user_id = p.user_id WHERE c.user_id IS NOT NULL AND p.user_id IS NOT NULL LIMIT 50;
Within the context of this exercise, why are we joining the three tables using user_id and not for example using item_id?
I have included item_id to provide a clearer understanding of the browse-checkout-purchase process of user.
I also added few AND after ON to make sure browse’s item, user and date correspond to checkout ones and also checkout data correspond to purchase data.
SELECT b.item_id, b.browse_date, b.user_id, c.user_id IS NOT NULL AS ‘is_checkout’, p.user_id IS NOT NULL AS ‘is_purchase’
FROM browse AS “b”
LEFT JOIN checkout AS “c”
ON c.user_id = b.user_id AND c.checkout_date=b.browse_date AND c.item_id=b.item_id
LEFT JOIN purchase AS ‘p’
ON p.user_id = c.user_id AND c.checkout_date=p.purchase_date AND c.item_id=p.item_id
Hi there! Theoretical question, on the result the columns is_checkout/ is_purchase I understand that 0-1 means True-False but wherein the code is that order? Is it on IS NOT NULL? Can someone explain to me what’s the process in the program where that variable is transformed from date to 0-1.
Just so I am clear… in the previous step code it shows that the tables can be aliased using AS then in this code it shows that we can either use AS or just the alias name in ’ …’ - IS this why the code has an AS ‘b’ in the FROM browse AS ‘b’ and then not when aliasing checkout ‘c’ in the LEFT JOIN ???
SELECT DISTINCT b.browse_date,
c.user_id IS NOT NULL AS ‘is_checkout’,
p.user_id IS NOT NULL AS ‘is_purchase’
FROM browse AS ‘b’
LEFT JOIN checkout ‘c’
ON c.user_id = b.user_id
LEFT JOIN purchase ‘p’
ON p.user_id = b.user_id
Hi, I actually had the same question and am confused regarding the input:
“c.user_id IS NOT NULL AS ‘is_checkout’, p.user_id IS NOT NULL AS ‘is_purchase’”
The examples I’ve seen so far have always used “IS (NOT) NULL” in a where clause, but never in a SELECT clause; if used in the context of a WHERE clause, I’d imagine the query would pull the actual user_id’s that are simply not null and not a binary. Why is it a binary when used in a SELECT statement?
Is it possible that because there is a specific row for each item_id purchased (same user, same purchase date) a GROUP BY would return multiple checkouts and purchases for one transaction where the the use of DISTINCT would only count each cart and transaction once, regardless of the number of items?