Codecademy Forums

FAQ: Subqueries - Correlated Subqueries I

This community-built FAQ covers the “Correlated Subqueries I” exercise from the lesson “Subqueries”.

Paths and Courses
This exercise can be found in the following Codecademy content:

SQL: Table Transformation

FAQs on the exercise Correlated Subqueries I

There are currently no frequently asked questions associated with this exercise. That’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.

I don’t understand the importance of
WHERE carrier = f.carrier

Could someone explain please?

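The subquery calculates the average distance per “carrier”: the WHERE clause limits it to flights with the same carrier as the current row, and that average is then compared with the row’s “distance”.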

This was confusing for me too.

I think the point is that the query reads the same table, “flights”, twice, so SQL needs a way to tell the two reads apart. The outer query is aliased as f, so f.carrier is the carrier of the row the outer query is currently looking at. Inside the subquery, plain carrier refers to the subquery’s own copy of flights. WHERE carrier = f.carrier therefore restricts the average (AVG) to flights of the same carrier as the current outer row. The < or > comparison in the outer query then checks that row’s distance against its own carrier’s average, which gives us the ids of the flights that are above or below average for their carrier.

This is basically what @smilexdrus has stated and what I think I understood from it. I’m just trying to be more explanatory about it.
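
To make this concrete, here is a small sketch that runs the correlated subquery against a made-up flights table. Everything about the data is an assumption for illustration: the three columns and the five rows are invented, the real exercise table has more columns, and Python's built-in sqlite3 module is used only so the SQL can be run anywhere. The first query shows the per-carrier average that the subquery produces for each outer row; the second is the exercise-style filter.

```python
import sqlite3

# In-memory database with a tiny, invented flights table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (id INTEGER PRIMARY KEY, carrier TEXT, distance REAL)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [(1, "AA", 100), (2, "AA", 300), (3, "UA", 1000), (4, "UA", 3000), (5, "DL", 500)],
)

# For every outer row f, the subquery averages only the flights whose
# carrier matches f.carrier, so each row sees its own carrier's average.
per_row = conn.execute("""
    SELECT f.id, f.carrier, f.distance,
           (SELECT AVG(distance) FROM flights WHERE carrier = f.carrier) AS carrier_avg
    FROM flights AS f
    ORDER BY f.id
""").fetchall()
for row in per_row:
    print(row)

# The same subquery used as a filter: flights longer than their carrier's average.
above_avg = [r[0] for r in conn.execute("""
    SELECT id
    FROM flights AS f
    WHERE distance > (SELECT AVG(distance) FROM flights WHERE carrier = f.carrier)
    ORDER BY id
""")]
print(above_avg)  # [2, 4]
```

With this sample data the carrier averages are 200 for AA, 2000 for UA and 500 for DL, so only flights 2 and 4 are above their own carrier's average.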

Why is there the following?
f.origin = flights.origin

Aren’t they both referring to the same table and data?

My understanding is that this will calculate the average distance for each carrier every time it comes up in the flight list.
i.e. the average for each carrier is calculated multiple times.

Is that true?
If so, is this a wasteful/slow way of doing it?
If so, how would you go about doing it otherwise? Would you create a table of carrier names and averages then look up the value in that? Would that actually speed things up?

Thanks in advance,
Alex
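
One possible alternative, along the lines Alex suggests, is to compute each carrier's average once in a derived table and join to it. The sketch below is only an illustration under assumptions: the flights table and its rows are invented, the alias names averages and avg_distance are hypothetical, and Python's built-in sqlite3 module is used just to make it runnable. It checks that both forms return the same ids. Whether the join is actually faster depends on the database engine and its optimizer, so it is worth measuring rather than assuming.

```python
import sqlite3

# Invented sample data (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (id INTEGER PRIMARY KEY, carrier TEXT, distance REAL)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [(1, "AA", 100), (2, "AA", 300), (3, "UA", 1000), (4, "UA", 3000), (5, "DL", 500)],
)

# Correlated form: the subquery is logically evaluated for each outer row.
correlated = [r[0] for r in conn.execute("""
    SELECT id
    FROM flights AS f
    WHERE distance > (SELECT AVG(distance) FROM flights WHERE carrier = f.carrier)
    ORDER BY id
""")]

# Derived-table form: one average per carrier, computed once, then joined back.
joined = [r[0] for r in conn.execute("""
    SELECT f.id
    FROM flights AS f
    JOIN (SELECT carrier, AVG(distance) AS avg_distance
          FROM flights
          GROUP BY carrier) AS averages
      ON averages.carrier = f.carrier
    WHERE f.distance > averages.avg_distance
    ORDER BY f.id
""")]

print(correlated, joined)  # both lists contain the same ids
```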

I don’t understand this:
SELECT id
FROM flights AS f
Why “as f”?
I tried omitting the “as f” part and used “WHERE carrier = flights.carrier” instead of “WHERE carrier = f.carrier”. Does it mean the same thing?

The alias (f) is used to distinguish between the two times the flights table is read. Simply put, the query refers to the same table twice:

Once to select IDs where the distance is greater than…
Once to select the average distance.

The two are combined to create the result.

The query is confusing because it mixes two styles: one reference to flights is aliased and the other is not. Personally I would write the query as follows:

SELECT a.id
FROM flights AS a
WHERE a.distance > (
  SELECT AVG(b.distance)
  FROM flights AS b
  WHERE b.carrier = a.carrier);

This makes it much clearer which copy of the table each column comes from.

Coming back to my earlier explanation:

Once to select IDs where the distance is greater than… (this comes from aliased table a).
Once to select the average distance (this comes from aliased table b).
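
On the earlier question of dropping the alias and writing WHERE carrier = flights.carrier instead: as far as I can tell it does not mean the same thing. Inside the subquery, flights.carrier resolves to the subquery's own flights table, so the condition compares each inner row with itself and the average is taken over the whole table. The sketch below shows this in SQLite via Python's built-in sqlite3 module. The table contents are invented for illustration, and other database engines may report an error or behave differently.

```python
import sqlite3

# Invented sample data (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (id INTEGER PRIMARY KEY, carrier TEXT, distance REAL)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [(1, "AA", 100), (2, "AA", 300), (3, "UA", 1000), (4, "UA", 3000), (5, "DL", 500)],
)

# With the alias: f.carrier points at the outer row, so the average is per carrier.
with_alias = [r[0] for r in conn.execute("""
    SELECT id
    FROM flights AS f
    WHERE distance > (SELECT AVG(distance) FROM flights WHERE carrier = f.carrier)
    ORDER BY id
""")]

# Without the alias: flights.carrier binds to the inner table, the condition is
# true for every inner row, and the subquery returns the overall average (980).
without_alias = [r[0] for r in conn.execute("""
    SELECT id
    FROM flights
    WHERE distance > (SELECT AVG(distance) FROM flights WHERE carrier = flights.carrier)
    ORDER BY id
""")]

print(with_alias)     # [2, 4]: above their own carrier's average
print(without_alias)  # [3, 4]: above the overall average
```

The two result sets differ, which is why the outer table needs its own alias whenever the subquery reads the same table.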