SELECT an aliased column not yet defined

Hey folks,

There seems to be “gaps” in the curriculum. I’m working through the “warby parker” funnel project in the Data Science career path and I’m at the end going through some of the extra questions in step 6.

I’m watching Rob’s video (the help video provided: https://www.youtube.com/watch?time_continue=14&v=lM2Tr7A9W1E&feature=emb_title - time is at 15:08) and he enters the following code:

WITH q AS (
SELECT ‘1-quiz’ COUNT(DISTINCT user_id)
FROM quiz)

My question is specifically related to the SELECT clause where it appears that he grabs an already (not previously aliased as such) aliased column. At this point in the curriculum, we’re taught that we can alias columns in the below manner:

WITH q AS(
SELECT COUNT(DISTINCT user_id) AS ‘1-quiz’,
FROM quiz)

Are the two clauses essentially the same? Can we eliminate the need for the AS clause by simply putting the desired alias immediately following the SELECT clause? This was not covered anywhere previously in the curriculum.

It’s also in the video that I learned we can clean up our code by aliasing the table names.

Example, instead of:

SELECT home_try_on.user id
FROM home_try_on;

use,

SELECT h.user_id
FROM home_try_on h;

Again, no where in the curriculum does it mention these tricks.

Regardless, help regarding the aliasing as requested in the first section of this question would be much appreciated.

Thanks!

1 Like

I also can’t recall where they mention aliasing tables. But, it is different than aliasing columns in a table. With tables, you don’t always need to use AS.

Think of it as shorthand. It’s easier to read your query, esp. if it’s lengthy and involved. Plus, it’s only a temporary table change (or column name change). It doesn’t actually change the name of the table in the DB.

More on that here:
https://www.tutorialspoint.com/sql/sql-alias-syntax.htm

Thanks for that link.

I’m still a bit lost particularly around the instructors code, below:

“SELECT ‘1-quiz’ AS stage”

What EXACTLY is happening there? The results return “stage” as a column and “1-quiz” as row but I am lost as to how that information is returned.

I also found another forum regarding this same part of the project (link below) that poses a good question that never got answered - it would also bring more clarity.

Thanks!

I did this Capstone project as part of a CC SQL Intensive Jan-Feb 2019. (It’s no longer offered and I think it’s part of the DA path now(?)) Anyway, I’m thinking that the help video was produced well after that (b/c I don’t recall that video on the CC platform at all.) So, I think that’s why there’s some discrepancies in the video content and what’s presented in the lesson.
In the video, ‘1-quiz’ is the first part of the funnel, the 1000 visitors to the site. In the previous query, you’re pulling out the DISTINCT users for each stage of the funnel.

Hmmmm…

I guess this is where online learning is the most difficult.

First, thank you for your help.

Second, I don’t think I’m making myself totally comprehensible. Let me try this…

If we enter:

SELECT COUNT(DISTINCT quiz.user_id) AS ‘1- Quiz’,
COUNT(DISTINCT home_try_on.user_id) AS ‘2-home_try_on’,
COUNT(DISTINCT purchase.user_id) AS ‘3-Purchase’
FROM quiz
LEFT JOIN purchase
ON quiz.user_id = purchase.user_id
LEFT JOIN home_try_on
ON home_try_on.user_id = quiz.user_id;

What returns is a horizontal format of the same values from the instructors code. Going from horizontal to vertical is cleaner but how the instructor did that (with a technique / code not taught) is befuddling.

It actually looks like someone answered the same question on another forum post: Https://www.codecademy.com/paths/analyze-data-with-sql/tracks/analyze-data-sql-analyze-real-data/modules/analyze-data-sql-usage-funnels/projects/analyze-data-sql-usage-funnels-with-warby-parker

It’d be nice if this was covered more in depth…

B

I get your frustration with the content of the project specifics not matching up with what’s presented in the Get Help video (which presumes that certain things were taught in the module). I guess it’s an issue of QA control(?) I don’t know. I can’t speak for CC but, I am guessing that since the module is older, I doubt that CC will go back and change the contents of the Marketing Funnels lesson. It’s just two different ways of returning the same information.

If you go to a site like CodeWars (where you can practice any coding language. Q’s are written by other users) and practice SQL, you’ll see that the way you arrive at an answer might be different than how others write their queries. And yet, the answer/results is/are the same. I am always amazed at the different ways people write some sql queries.
So, what you wrote above and how the instructor in the video wrote his query are just two different ways at arriving at the same results.

1 Like

Oh cool! I’ll check out that site.

I’ve narrowed down where I think I’m confused the most and it pertains to where the code is pulling the initial values.

I’ll quote my question posted to the admins, starting after the first WITH statement:

"Now if I work the code backwards entering:

SELECT ‘1-quiz’ AS stage
FROM quiz;

Without using the COUNT command, this changes all the values to ‘1-quiz’ and creates the column “Stage” and goes on for what seems like… ever.

But what is ‘1-quiz’ pulling it’s original info from? As in, what is it selecting if it wasn’t ‘1-quiz’? What is the “value” from which we are pulling the data? It looks like it’s just made up.

Or is it using the first value from the table as a default?

Is it effectively like saying:
SELECT * AS stage
FROM quiz; ?

Does that make sense?"

SELECT value1

I’m sorry, you lost me.

Did you mean to include the WITH statement prior to this?
I don’t know what it pulls b/c I’m not looking at the results. What populates the column “stage” ? I would presume it’s user_ids from the quiz funnel…

My apologies. The instructors code in the video is:

“WITH q AS(
SELECT ‘1-quiz’ AS stage, COUNT(DISTINCT user_id)
FROM quiz
),
h AS(
SELECT ‘2-home-try-on’ AS stage,
COUNT(DISTINCT user_id)
FROM home_try_on
),
p AS(
SELECT ‘3-purchase’ AS stage, COUNT(DISTINCT user_id)
FROM purchase)
SELECT *
FROM q
UNION ALL SELECT *
FROM h
UNION ALL SELECT *
FROM p;”

I understand what’s happening in the code but say I want to back up a bit.

If I isolate the code:

SELECT ‘1-quiz’ AS stage
FROM quiz;

the attached image “capture” is the result. Capture

Or if we make it interesting:

SELECT ‘poop’ AS ‘What is in my pants’
FROM quiz;

You get this (capture 2): Capture 2

What is the base value from which it is changing?

unknown -> ‘1-quiz’ OR ‘poop’ (insert whatever)

The other screwed up part of this is the schema of the tables - they’re all messed up. The supposed schema of the quiz table is here (capture 3):

But doing a SELECT * gives us the following (Capture 4):

So that is another level of confusion…

B

What was mentioned in this other thread about the literal string ‘1-quiz’, holds true: it is a value that is assigned to the column stage. And, as you’ve shown, the string can be anything and it will populate every row.

https://discuss.codecademy.com/t/https-www-codecademy-com-paths-analyze-data-with-sql-tracks-analyze-data-sql-analyze-real-data-modules-analyze-data-sql-usage-funnels-projects-analyze-data-sql-usage-funnels-with-warby-parker/445346

" In SQL, when we do something like this:
SELECT 'value' AS column
what we’re doing is creating a new column in the query called column and giving every row in the returned data set the value value .

So, SELECT '1-quiz' AS stage creates a new column - stage - and gives every row the same value of ‘1-quiz’.

Does that make sense? :slight_smile:"

Also, maybe this would be of some help:
http://www.geeksengine.com/database/basic-select/literal-character-strings.php

1 Like

That link is great!

I THINK this may be the final way to ask my questions.

SELECT ‘value’ AS column

What part of the schema is the value pulling from? Is that clearer? If not I give up lol

Also really confused as to why my schema is drastically different than the schema the instructor has.

It’s not pulling from the schema.

Whatever is in the quotations (single of double) is a string. That is the value that populates the rows in the column in your results. (As you wrote in your example several replies above).

As for the schema, I’m not sure why it looks like that. It is strange. My only suggestion is to copy all your code and reset the exercise and then paste your code back in. I’m not sure if that will reset the view of the schema or not.

2 Likes

Yeah I’ve tried resetting it multiple times to no avail. At this point, i’m moving on though I’m still perplexed on certain aspects.

I understand what a string is but I don’t understand why it populates 1000 rows - it seems as if it’s linking to every entry in the table.

Also, it seems like there’s really no need to make a complicated funnel after analyzing the contents of the table.

Entering,

SELECT COUNT(*)
FROM “table 1/2/3”;

Gives you the same numbered results - 1000, 750, 495 respectively.

There could’ve been more effort to create duplicate ID’s or something to make counting DISTINCT id’s relevant. I dunno - this project seems fun, but rushed. It’d be nice to have more details to play around with to increase my understanding of funnels and how to structure the queries.

Thank you for helping though, it’s appreciated very much.