FAQ: Usage Funnels - Build a Funnel from Multiple Tables 3

This community-built FAQ covers the “Build a Funnel from Multiple Tables 3” exercise from the lesson “Usage Funnels”.

Paths and Courses
This exercise can be found in the following Codecademy content:

FAQs on the exercise Build a Funnel from Multiple Tables 3

There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply (reply) below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.

Join the Discussion. Help a fellow learner on their journey.

Ask or answer a question about this exercise by clicking reply (reply) below!

Agree with a comment or answer? Like (like) to up-vote the contribution!

Need broader help or resources? Head here.

Looking for motivation to keep learning? Join our wider discussions.

Learn more about how to use this guide.

Found a bug? Report it!

Have a question about your account or billing? Reach out to our customer support team!

None of the above? Find out where to ask other questions here!

Why is the 1.0 needed in the percentage expressions (as it looks like it’s just multiplying by 1?
Thanks

5 Likes

I had the same question too =/
Apparently it’s an implicit numeric conversion to save on a lot of coding, that produces a floating point result that is accurate (no integer division).
Source: https://codereview.stackexchange.com/questions/58369/sql-percentage-calculation

2 Likes

Ah yes, I remember the point about divisions being odd depending on the number of decimal places, makes more sense. Thanks!

why SUM and COUNT both are used in calulating percentage?

1.0 * SUM(is_checkout) / COUNT(user_id) AS 'browse_to_checkout',
1.0 * SUM(is_purchase) / SUM(is_checkout)
AS 'checkout_to_purchase'
FROM funnels;
2 Likes

User_id is not a numeric value type, so you have to use count. Since we converted the is_checkout to 1’s and 0’s, you would use SUM in this instance, since count would count the 0’s as well.

Exercise 7 within usage funnels implies we are getting all relevant data together by performing left joins on the 3 tables and building a querying on table to collect a) distinct browser dates b) user_ids for those dates c) count of checkouts associated with user ids and d) count of purchases associated with user ids - Query below:

SELECT DISTINCT b.browse_date, b.user_id, c.user_id IS NOT NULL AS 'is_checkout', p.user_id IS NOT NULL AS 'is_purchase' FROM
browse AS b
LEFT JOIN checkout AS c
ON b.user_id = c.user_id
LEFT JOIN p
ON p.user_id = c.user_id

However, some users may browse more than once on certain dates, and the same users may go to the checkout page or purchase. Wouldn’t the selecting DISTINCT browser dates filter out all of these records which are relevant?

Why can’t we use num_browse / num_checkout as the percentage?

for example : SELECT COUNT(user_id) AS ‘num_browse’,
SUM(is_checkout) AS ‘num_checkout’,
SUM(is_purchase) AS ‘num_purchase’,
1.0 * num_browse / num_checkout AS ‘percent_checkout’,
1.0 * num_checkout / num_purchase AS ‘percent_purchse’
FROM funnels;

3 Likes

I wondered about the 1.0 for different reasons, super useful to have found this out on the way! :clap: :clap: :clap:

My much more petty concern was that the columns I’d labelled as percentages were expressed as decimal percentages, where I felt it would be more useful to show them as the more familiar whole numbers, and round them to 2 decimal places, like so:

ROUND(100.0 * SUM(is_checkout) / COUNT(user_id),2)
AS ‘%_checkout’,
ROUND(100.0 * SUM(is_purchase) / SUM(is_checkout),2)
AS ‘%_purchase’

I noticed that when multiplying by 100 without the .0 I lost the decimal places I was trying to round, but looking here fixed that, so thanks again!

My reason for posting is to ask - am I right, or are the actual percentages as calculated in the sample and just very small?

Thanks in advance! :pray:

1 Like

Also had this question. Thought it might be something like this. Thanks for asking.

In 7/10, we use AS to alias several tables.

FROM browse as ‘b’
LEFT JOIN checkout as ‘c’
ON c.user_id = b.user_id
LEFT JOIN purchase as ‘p’
ON p.user_id = c.user_id

This makes sense that from when we alias checkout as c that we can say c.user_id , drawing the user_id column from a table we have nicknamed c.

But in the Warby Parker project, the Hint shows just using the first letter of the column and it seems to work without being aliased.

SELECT DISTINCT q.user_id,
h.user_id IS NOT NULL AS ‘is_home_try_on’,
h.number_of_pairs,
p.user_id IS NOT NULL AS ‘is_purchase’

Am I missing something here? How does SQL know which table I’m referring to when we simply say q.user_id?

I was thinking the same, haven’t found an answer yet

I was wondering the exact same thing. Does anyone know the answer to this?

PS! I just checked the funnels table with and without “DISTINCT”. I noticed that without the distinct the identical user ids for a given date all showed the same results for “is_checkout” and “is_purchase”. I am not sure what the reason for this is, but it seems like the system registers a checkout/buy for all the times the browser_id went to the website, even though it only checked out/bought something the last time. Not sure if this is correct or why this would be the case, but it would explain the need to use “DISTINCT”.

Does anyone else have any thoughts on this?

I have the same question. It did not work.
Also, if the 1.0 is after the mathematical equation it returns “0.0”. In other words, it does not work.

I believe I’ve seen the format work as follows:
SELECT c.user id
FROM browse b
LEFT JOIN checkout c
ON…

Does anyone know if this is unaccepted shortcut?

Does anyone know why we use count(user_id) in these calculations and the more specific count(b.user_id) doesn’t seem to work? There are multiple user_id columns in this joined table. I guess the count is the same because of left join, but isn’t being specific better?

If the third and fourth columns are calculating pecentages, shouldn’t we be multiplying by 100.0 rather than 1.0

1 Like

No, multiplying by 1.0 we do for another reason. In our calculations we received value less than 1, this value has been rounded down to the nearest integer 0. Multiplying by 1.0 we’re showing that we need decimals. And you get decimal value^ multiply it by 100 and you’ll get percentage. If you multiplying by 100 from the very beginning you’ll get 0 because of rounding.

We have only one user_id column in the resulting table. Another two “user_id”'s columns have aliases. There is no reference to “b” table in the resulting table, you can see it if start query in WITH clause, alone.
Снимок экрана 2021-12-27 в 17.16.55