This community-built FAQ covers the “Build a Funnel from Multiple Tables 2” exercise from the lesson “Usage Funnels”.
Paths and Courses
This exercise can be found in the following Codecademy content:
FAQs on the exercise Build a Funnel from Multiple Tables 2
There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply () below.
If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.
Join the Discussion. Help a fellow learner on their journey.
Ask or answer a question about this exercise by clicking reply () below!
Agree with a comment or answer? Like () to up-vote the contribution!
Can you really refer to an alias before the AS statement has created the alias?
select
distinct b.browse_date,
b.user_id,
c.user_id is not null as 'is_checkout',
p.user_id is not null as 'is_purchased'
from browse as 'b'
left join checkout as 'c'
on c.user_id = b.user_id
left join purchase as 'p'
on p.user_id = c.user_id
limit 50;
I’m not a SQL professional, but it seems that yes, you can.
I’t looks like that SQL, the first, creates the table (using aliases) and the second, selects solumns from the table.
Would you tell me, please, the meaning of using DISTINCT in the code below (when we create sales funnel).
And did it apply to all the columns after the expression or to the only?
Thank you.
SELECT DISTINCT b.browse_date,
b.user_id, c.user_id IS NOT NULL AS 'is_checkout',
p.user_id IS NOT NULL AS 'is_purchase'
FROM browse AS 'b'
LEFT JOIN checkout AS 'c'
ON b.user_id = c.user_id
LEFT JOIN purchase AS 'p'
ON c.user_id = p.user_id
LIMIT 50;
[T]he DISTINCT clause must appear immediately after the SELECT keyword.
… [Y]ou place a column or a list of columns after the DISTINCT keyword. If you use one column, SQLite uses values in that column to evaluate the duplicate. [If] you use multiple columns, SQLite uses the combination of values in these columns to evaluate the duplicate.
So in our case, DISTINCT is looking for unique combinations of the 4 columns, not just unique user_id’s.
If the same user, on the same day, got to the same point in the funnel that they did earlier that day there would be a duplicate row in the table which we wouldn’t count using DISTINCT.
The exercise suggests joining c to b, then p to c, as below:
SELECT *
FROM browse AS ‘b’
LEFT JOIN checkout AS ‘c’
ON c.user_id = b.user_id
LEFT JOIN purchase AS ‘p’
ON p.user_id = c.user_id
LIMIT 50;
Does the choice of tables to join and the order you join them in matter here? I imagine it does, as it’s a LEFT JOIN so you might lose rows if you’re not careful, but in that case why not just join both c and p to b directly?
SELECT *
FROM browse AS ‘b’
LEFT JOIN checkout AS ‘c’
ON c.user_id = b.user_id
LEFT JOIN purchase AS ‘p’
ON b.user_id = p.user_id
LIMIT 50;
The purpose of using DISTINCT is that you don’t get duplicates of the data for a given set of columns you are selecting. So if two rows have the same values for all four columns you have selected, DISTINCT would make sure that the combination of values is only displayed once instead of multiple times.
I have a question on the query since a customer may have multiple items in the same day, wouldn’t the query in the join lead to many to many join. Shouldn’t we create the final table at a cm11 level, item_id and browse_date. This will give us the information for that specific item, what was the customer journey of the customer like and at which stage did he pause.
How does b.browse_date, b.user_id, c.user_id, p.user_id in this query work if these aren’t defined until the Alias?
SELECT DISTINCT b.browse_date, b.user_id, c.user_id IS NOT NULL AS ‘is_checkout’, p.user_id IS NOT NULL AS ‘is_purchase’
FROM browse AS ‘b’
LEFT JOIN checkout AS ‘c’
ON c.user_id = b.user_id
LEFT JOIN purchase AS ‘p’
ON p.user_id = c.user_id
LIMIT 50;
To my understanding SQL doesn’t run line by line like some coding languages do, but instead runs all at once after it validates. Thus, defining the aliases after they’re referenced doesn’t matter (i.e. “p.user_id” as “p” can be used in the select statement before we’ve created an alias to define purchase as p).