Subqueries 7. Generalizations - Can get correct answer, but don't understand logic


#1


I'm having difficulty understanding the answer to the last exercise in learning Subqueries. I don't really understand the logic behind how they formulated the sequence number. Don't get me wrong, I understand the syntax behind flight_sequence_number's assignment, but I don't understand the logic behind actually calculating the number.


Link to exercise: https://www.codecademy.com/courses/sql-table-transformation/lessons/subqueries/exercises/subqueries-generalizations?action=lesson_resume

My questions are:

  1. Why do we want to know the number of flights [who's outer ID is less than the inner's ID] and [who's outer and inner origins are the same]?

  2. Why do we add 1 to this number that we've counted?

  3. Why must the outer ID be less than the inner ID?

  4. The description says that the sequence number is based on time. In what way does this computation relate to 'time'?


Code in question:

SELECT COUNT(*)
FROM flights f
WHERE f.id < flights.id
AND f.carrier=flights.carrier) + 1 AS flight_sequence_number


#2

It helps if we view it ordered by id.

SELECT origin, id,
    (SELECT COUNT(*)
FROM flights f
WHERE f.id < flights.id
AND f.origin = flights.origin) + 1
 AS flight_sequence_number
FROM flights
order by 2;

Now we can see the actual sequence. This is one addend in the sum:

(SELECT COUNT(*)
FROM flights f
WHERE f.id < flights.id
AND f.origin = flights.origin)

and the other is 1. The first new sequence number to appear in the table is Dallas/Fort Worth, then Miami, and Cleveland, &c. It is the individual origins count that is incremented. Scroll though the query table to see this play out.

The author wishes for us to see how correlated subqueries occur in the iteration of the flights table and are inserted in that row. The subquery is akin to a subroutine called on each new row. Unlike uncorrelated subqueries that build a new table first, then apply the query to that table. Get the difference?


#3

Thanks for the reply mtf.

I think I understand now. Your advice of looking at it while ordered by ID definitely helped, but I found ordering by origin first, then ID is what started to make it click. I can see what you mean but referring to the inner sequence as a 'subroutine'. Basically, every time we move to a new row, what we're looking at a new outer ID, but that outer ID that we're using for our outer query is not the literal exact number in f.id in the inner query. Rather, what's happening in the inner query is that for every origin, it looks at an inner flight ID (flights.id), and then counts all of the planes before it by looking at all possible outer flight IDs using f.id . The ID itself is determines its order in the sequence, where higher numbers go later, so that's why it's doing f.id < flights.id. It add's 1 to this count because the first plane leaving each origin would be 1 in the sequence, not 0.

Thanks, with your help and working through it tiny piece by tiny piece, I understand how flight_sequence_number was calculated :slight_smile:


#4

I knew you would reckon this.

And, I knew you would intentionally order by 1, 2 out of curiosity. The important thing is as you have observed, later flights have higher ids, regardless their origin. The ids have no direct relation beyond the point along the arrow of time they occur.

The outer query provides orgins and ids. As ids are numerically increasing, they are not the principal object. The origin is since that is what we need to keep a count of.

With this stuff still being so new and unpracticed in my own case, wrapping one's head around such straight forward declarative logic is mind boggling some times. I'm still puzzling over a few other problems, in my own time.

You know that you can comment out your old queries and keep submitting new ones, eh? That's what I do when these questions come up. Fortunately this track doesn't eat our code when we're done.


#5

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.