Https://www.codecademy.com/paths/analyze-data-with-sql/tracks/analyze-data-sql-analyze-real-data/modules/analyze-data-sql-usage-funnels/projects/analyze-data-sql-usage-funnels-with-warby-parker

I am hoping someone help me solve the confusion of the 5 question in project ‘Usage funnel Warby Parker’. The codes from solutions are as followed:

WITH q AS(
SELECT ‘1-quiz’ AS stage, COUNT(DISTINCT user_id)
FROM quiz
),
h AS(
SELECT ‘2-home-try-on’ AS stage,
COUNT(DISTINCT user_id)
FROM home_try_on
),
p AS(
SELECT ‘3-purchase’ AS stage, COUNT(DISTINCT user_id)
FROM purchase)
SELECT *
FROM q
UNION ALL SELECT *
FROM h
UNION ALL SELECT *
FROM p;

My question is, how can we choose ‘1-quiz’, ‘2-home_try_on’ in the WITH…AS… clause? Since the ‘1-quiz’ AS stage doesn’t exist in the databased quiz…Please explain.

Hi @shawnxcai126 - welcome to the forum.

In SQL, when we do something like this:
SELECT 'value' AS column
what we’re doing is creating a new column in the query called column and giving every row in the returned data set the value value.

So, SELECT '1-quiz' AS stage creates a new column - stage - and gives every row the same value of ‘1-quiz’.

Does that make sense? :slight_smile:

1 Like

I had the same question on another post: SELECT an aliased column not yet defined

I think what those of us are finding frustrating is that this is a major gap in the curriculum as that functionality of SQL hasn’t been taught - I’m not even sure if it even gets to that point.

We’ve only learned to SELECT value AS ‘some_name’

Granted the code used in the question is exceptionally neater, we can return the same relevant data by using:

SELECT COUNT(DISTINCT quiz.user_id) AS ‘1- Quiz’,
COUNT(DISTINCT home_try_on.user_id) AS ‘2-home_try_on’,
COUNT(DISTINCT purchase.user_id) AS ‘3-Purchase’
FROM quiz
LEFT JOIN purchase
ON quiz.user_id = purchase.user_id
LEFT JOIN home_try_on
ON home_try_on.user_id = quiz.user_id;

At this stage of the curriculum, this is essentially how we are taught to approach it.

I understand (ish) how this works but understanding WHEN to utilize this is where there’s a massive gap.

Now if I work the code backwards entering:

SELECT ‘1-quiz’ AS stage
FROM quiz;

This changes all the values to ‘1-quiz’ and creates the column “Stage” and goes on for what seems like… ever.

But what is ‘1-quiz’ pulling it’s original info from? As in, what is it selecting if it wasn’t ‘1-quiz’? What is the “value” from which we are pulling the data? It looks like it’s just made up.

Is it effectively like saying:
SELECT * AS stage
FROM quiz; ?

Does that make sense?

Apologies for jumping on this forum post - feel free to respond on my original post.