6. Correlated Subqueries 2 Example - question about syntax

<PLEASE USE THIS TEMPLATE TO HELP YOU CREATE A GREAT POST!>

<Below this line, add a link to the EXACT exercise that you are stuck at.>
https://www.codecademy.com/en/courses/sql-table-transformation/lessons/subqueries/exercises/correlated-ii

<Below this line, in what way does your code behave incorrectly? Include ALL error messages.>
The example states that we want to order flights by giving them a sequence number based on time, by carrier.

The example syntax is:

<In this course, it often helps to include a screenshot of your whole web browser – that lets everyone see what you see. If you wish to include a screenshot, add it below this line.>

<If you wish to copy/paste in your code, you can use this next section. This will allow others to copy/paste your code for testing – something that they won’t be able to do with just a screenshot.>

```

SELECT carrier, id,
(SELECT COUNT (*)
FROM flights f
WHERE f.id < flights.id
AND f.carrier=flights.carrier) + 1
AS flight_sequence_number
FROM flights;

<do not remove the three backticks above>
My question is - why do we want f.id to be less than flights.id? Why not equal to? If I change it to

...
WHERE f.id = flights.id
...
the flight_sequence_number is identical between all observations. Why is this? I can't wrap my head around why f.id needs to be less than flights.id. Thanks in advance for any help
1 Like

Subscribing to this topic…
Maybe someone will be so kind to help us.
I actually cannot understand even how this sub-query +1 gets flight_sequence_number
(SELECT COUNT (*) FROM flights f WHERE f.id < flights.id AND f.carrier=flights.carrier) + 1

Subscribing to this topic…
Maybe someone will be so kind to help us.
I actually cannot understand even how this sub-query +1 gets flight_sequence_number
(SELECT COUNT (*)
FROM flights f
WHERE f.id < flights.id
AND f.carrier=flights.carrier) + 1

To answer this you have just to remove the +1.

You will see that the sequence number of some flights is equal to 0. This is probably an increment to avoid having 0 for the flights which doesn’t fill-up the condition:

WHERE f.id < flights.id
AND f.carrier=flights.carrier

While the condition

WHERE f.id < flights.id

is just used to filter the flights that have less distance than average… the operator (<) is used like (<>)
but since ‘id’ is unique you can just filter using (<)…
Why they filter ? I don’t know :stuck_out_tongue:

I am trying to understand too, I hope this would help and correct me if I am wrong.

1 Like

As I understand it (and it took me a while to get my head around it):

The f.carrier=flights.carrier filters for all the flights by the same carrier as the row of interest, so far so good. For the row containing the first (or only) flight by this carrier, there will be zero f.id which are less than this flights.id, so count + 1 = 1. For the second flight by this carrier, there will be one f.id which is less than this flights.id, making count + 1 = 2. For the third, 3 etc.

I hope that helps!

George

6 Likes

Yeah that makes sense,
Thanks pogeo :slight_smile:

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.