FAQ: Subqueries - Non-Correlated Subqueries III

This community-built FAQ covers the “Non-Correlated Subqueries III” exercise from the lesson “Subqueries”.

Paths and Courses
This exercise can be found in the following Codecademy content:

SQL: Table Transformation

FAQs on the exercise Non-Correlated Subqueries III

Join the Discussion. Help a fellow learner on their journey.

Ask or answer a question about this exercise by clicking reply (reply) below!

Agree with a comment or answer? Like (like) to up-vote the contribution!

Need broader help or resources? Head here.

Looking for motivation to keep learning? Join our wider discussions.

Learn more about how to use this guide.

Found a bug? Report it!

Have a question about your account or billing? Reach out to our customer support team!

None of the above? Find out where to ask other questions here!

11 posts were split to a new topic: Why do we need to use the alias “a”?

4 posts were split to a new topic: Why do I need a nested query at all?

3 posts were split to a new topic: Getting off track in SQL lesson

codecademy example:

     SELECT a.dep_month,
           a.dep_day_of_week,
           AVG(a.flight_count) AS average_flights
      FROM (
            SELECT dep_month,
                  dep_day_of_week,
                   dep_date,
                   COUNT(*) AS flight_count
              FROM flights
             GROUP BY 1,2,3
           ) a
     GROUP BY 1,2
     ORDER BY 1,2;

     The inner query provides the count of flights by day, and the outer query uses the inner query’s result set to compute the average by day of week of a given month.

instructions:
Using a subquery, find the average total distance flown by day of week and month.
Be sure to alias the outer query as average_distance and the inner query as flight_distance.

solution:

    SELECT a.dep_month,
           a.dep_day_of_week,
           AVG(a.flight_distance) AS average_distance
      FROM (
            SELECT dep_month,
                  dep_day_of_week,
                   dep_date,
                   sum(distance) AS flight_distance
              FROM flights
             GROUP BY 1,2,3
           ) a
     GROUP BY 1,2
     ORDER BY 1,2;

my question: In the inner query, why do I need to use
SUM(distance) AS flight_distance
instead of
COUNT() AS flight_distance
like what was used in the example?
I thought I had it done right, but turns out I needed to change the COUNT(
) to SUM() and not sure why?
Why would count work for the number of flights but not the number of miles/distance?
Not sure if I’m having trouble understanding the code or the math …

1 Like

The COUNT function is counting the instances of a record and the SUM function is actually adding values from a column.

In the example code the COUNT function is like a tally board counting up all the instances of flights per day.

In the solution code the SUM function is reading the values of the distance field in the flights table and adding them together.

1 Like

I don’t get why dep_date is needed. I see it won’t let me run the query without it, but I don’t understand why. Anyone?

1 Like

From the code the lesson provides, we are looking for average flights on day of week, like Monday, Tuesday, Wednesday,…etc. However, for example, it may not has same amount of flights on every Monday. Therefore, we count the flights on every day basis, like 2010-01-01, 2010-01-09, 2010-01-10,…etc, in the inner query. You can actually run only the inner query to see that. Then group them up by day of week and calculate the average at outer query. The final result is Monday, Tuesday,…, Sunday in each month in average, and you can see some of them are not integer because of the amount of flights is not the same.

Now, in the practice, we are looking for average distance. We need to sum up the distance of flights on each day instead of counting how many flights we have. Therefore, we simply use Sum(distance) at inner query. Then group them up by day of week and calculate the average at outer query. The structure and concept are basically the same.

2 Likes

How does SQL recognize “a” as a table when the only tables in the schema are “flights” and “airports”. A WITH_AS statement was never created.

2/23/2000 shows two flights on Wednesday, 1 flight on Tuesday. Ha, wat?

This exercise was confusing at first because the flight_count was so low. Why is there an average of only 1 flight per day?? I wasn’t sure I understood what the data meant at first.

I had the same question when I ran the inner query alone. maybe its only tiny wilderness air strips…

Why do we group by 1,2,3 in the subquery? Aren’t we just finding the total distance flown each day in the inner query? Why doesn’t simply grouping by date work?

Can i ask about the ‘Group by’?
I have just learn about grouping by a column and don’t know about multicolumns yet.
Can anyone explain more?
p/s: i can find it by myself but i want to put a question here for the learners after can understand the content more. Thanks