FAQ: Usage Funnels - Build a Funnel from Multiple Tables 2

hi, do you know if using group by instead of distinct would have give the same results?

my query is this:

SELECT

b.browse_date,

b.user_id,

CASE WHEN c.user_id IS NOT NULL THEN ‘True’ ELSE ‘False’ END AS ‘is_checkout’,

CASE WHEN p.user_id IS NOT NULL THEN ‘True’ ELSE ‘False’ END AS ‘is_purchase’

FROM browse b

LEFT JOIN checkout c

ON b.user_id = c.user_id

LEFT JOIN purchase p

ON c.user_id = p.user_id

GROUP BY 1,2

LIMIT 50;

what about a group by?
would have been correct as well?
my query is:

SELECT

b.browse_date,

b.user_id,

CASE WHEN c.user_id IS NOT NULL THEN ‘True’ ELSE ‘False’ END AS ‘is_checkout’,

CASE WHEN p.user_id IS NOT NULL THEN ‘True’ ELSE ‘False’ END AS ‘is_purchase’

FROM browse b

LEFT JOIN checkout c

ON b.user_id = c.user_id

LEFT JOIN purchase p

ON c.user_id = p.user_id

GROUP BY 1,2

LIMIT 50;

How does b.browse_date, b.user_id, c.user_id, p.user_id in this query work if these aren’t defined until the Alias?

SELECT DISTINCT b.browse_date, b.user_id, c.user_id IS NOT NULL AS ‘is_checkout’, p.user_id IS NOT NULL AS ‘is_purchase’
FROM browse AS ‘b’
LEFT JOIN checkout AS ‘c’
ON c.user_id = b.user_id
LEFT JOIN purchase AS ‘p’
ON p.user_id = c.user_id
LIMIT 50;

To my understanding SQL doesn’t run line by line like some coding languages do, but instead runs all at once after it validates. Thus, defining the aliases after they’re referenced doesn’t matter (i.e. “p.user_id” as “p” can be used in the select statement before we’ve created an alias to define purchase as p).

I’m confused as to why the ‘is_checkout’ and ‘is_purchase’ output binary (1 or 0) and don’t show a user_id string instead?

whoops. Step 8 answered my own question… IS NOT NULL is a true/false function.

None able to help here??

Hello
It seems that this exercise makes the assumption that each user only completes one visit during the entire timeframe of the data recorded in the data base. Is that correct?
I’m interested in knowing how we should adapt the query in case a given user can have browsed the website multiple times.

Can we use WHERE instead of c.user_id IS NOT NULL AS 'is_checkout'? and why do use this c.user_id IS NOT NULL AS 'is_checkout' anyway? is there any other way to achieve this?

SELECT DISTINCT b.browse_date, b.user_id, c.user_id IS NOT NULL AS 'is_checkout', p.user_id IS NOT NULL AS 'is_purchase'
FROM browse AS 'b'
LEFT JOIN checkout AS 'c'
ON b.user_id = c.user_id
LEFT JOIN purchase AS 'p'
ON p.user_id = c.user_id
LIMIT 50;

1 Like

Would also be interested in this - I’ve used CASE but didn’t get any result. Curious to hear why

That’s what I wrote using CASE and got results:

SELECT DISTINCT b.browse_date AS 'Browse Date',
   b.user_id AS 'User ID',
   CASE
    WHEN c.user_id IS NOT NULL THEN 'True'
    ELSE 'False'
    END AS 'is_checkout',
   CASE
    WHEN p.user_id IS NOT NULL THEN 'True'
    ELSE 'False'
    END AS 'is_purchase'
FROM browse AS 'b'
LEFT JOIN checkout AS 'c'
  ON b.user_id = c.user_id
LEFT JOIN purchase AS 'p'
  ON c.user_id = p.user_id
LIMIT 50;

I’ve used WHERE as well and I get no result. As far as I might understand, the member who used CASE and got result means he/she is able to tell the program to distinguish between customers who did check-out and purchase the product and who did not. Meanwhile, using WHERE only CHOOSE people who did check-out and purchase (value is not null) while ignore who did not (value is null). For example, my code using WHERE did not return result I wanted.

SELECT b.browse_date,
b.user_id,
c.user_id AS 'is_checkout',
p.user_id AS 'is_purchase'
FROM browse as 'b'
LEFT JOIN checkout as 'c'
ON b.user_id = c.user_id
LEFT JOIN purchase as 'p'
ON c.user_id = p.user_id
WHERE c.user_id IS NOT NULL
AND p.user_id IS NOT NULL
LIMIT 50;

Within the context of this exercise, why are we joining the three tables using user_id and not for example using item_id?

1 Like

I have included item_id to provide a clearer understanding of the browse-checkout-purchase process of user.

I also added few AND after ON to make sure browse’s item, user and date correspond to checkout ones and also checkout data correspond to purchase data.

SELECT b.item_id, b.browse_date, b.user_id, c.user_id IS NOT NULL AS ‘is_checkout’, p.user_id IS NOT NULL AS ‘is_purchase’

FROM browse AS “b”

LEFT JOIN checkout AS “c”

ON c.user_id = b.user_id AND c.checkout_date=b.browse_date AND c.item_id=b.item_id

LEFT JOIN purchase AS ‘p’

ON p.user_id = c.user_id AND c.checkout_date=p.purchase_date AND c.item_id=p.item_id

LIMIT 50;

Hi there! Theoretical question, on the result the columns is_checkout/ is_purchase I understand that 0-1 means True-False but wherein the code is that order? Is it on IS NOT NULL? Can someone explain to me what’s the process in the program where that variable is transformed from date to 0-1.

Thanks guys!

Just so I am clear… in the previous step code it shows that the tables can be aliased using AS then in this code it shows that we can either use AS or just the alias name in ’ …’ - IS this why the code has an AS ‘b’ in the FROM browse AS ‘b’ and then not when aliasing checkout ‘c’ in the LEFT JOIN ???
SELECT DISTINCT b.browse_date,
b.user_id,
c.user_id IS NOT NULL AS ‘is_checkout’,
p.user_id IS NOT NULL AS ‘is_purchase’
FROM browse AS ‘b’
LEFT JOIN checkout ‘c’
ON c.user_id = b.user_id
LEFT JOIN purchase ‘p’
ON p.user_id = b.user_id
LIMIT 50;

Hi, I actually had the same question and am confused regarding the input:

“c.user_id IS NOT NULL AS ‘is_checkout’, p.user_id IS NOT NULL AS ‘is_purchase’”

The examples I’ve seen so far have always used “IS (NOT) NULL” in a where clause, but never in a SELECT clause; if used in the context of a WHERE clause, I’d imagine the query would pull the actual user_id’s that are simply not null and not a binary. Why is it a binary when used in a SELECT statement?

Is it possible that because there is a specific row for each item_id purchased (same user, same purchase date) a GROUP BY would return multiple checkouts and purchases for one transaction where the the use of DISTINCT would only count each cart and transaction once, regardless of the number of items?

So I have the code for this exercise but I was wondering how I would add in a ‘count’ here to tell me how many people went through each phase? would love any help!

SELECT DISTINCT b.browse_date,
b.user_id,
c.user_id IS NOT NULL AS ‘is_checkout’,
p.user_id IS NOT NULL AS ‘is_purchase’
FROM browse AS ‘b’
LEFT JOIN checkout ‘c’
ON c.user_id = b.user_id
LEFT JOIN purchase ‘p’
ON p.user_id = c.user_id
LIMIT 50;

:pray: :pray: :pray:HELP PLEASE, I HAVE INTERVIEW NEXT WEEK :pray:t4: :pray:t4: :pray:t4:
I am not able to understand these statements. I have no idea how these statements are working as no prior lessons were given on NOT NULL for 0 and 1 or TRUE or FALSE in return when used in SELECT statement :rage:

SELECT DISTINCT b.browse_date,
b.user_id,
c.user_id IS NOT NULL AS ‘is_checkout’,
p.user_id IS NOT NULL AS ‘is_purchase’

I tried to use the CASE-WHEN-THEN clause but not getting any result :expressionless:. Please help me where I am wrong here, if anybody has tried CASE-WHEN-THEN clause* :pray:

SELECT b.browse_date, b.user_id
(CASE
WHEN c.user_id IS NOT NULL THEN ‘TRUE’
ELSE ‘FALSE’
END) AS ‘is_checkout’,
(CASE
WHEN p.user_id IS NOT NULL THEN ‘TRUE’
ELSE ‘FALSE’
END) AS ‘is_purchase’
FROM browse b
LEFT JOIN checkout c
ON b.user_id=c.user_id
LEFT JOIN purchase p
ON p.user_id=c.user_id
LIMIT 50;