Question
In the context of this exercise, does the result change if we JOIN
tables in a different order?
Answer
No, when applying JOIN
, also known as INNER JOIN
, the data is joined on wherever there are matching values on a certain column specified by the ON
clause, and regardless of the table order in the JOIN
, the result should be the same. So whether you do
plays JOIN songs
or
songs JOIN plays
the results would be the same.
The only difference would be the order that the columns and rows appear in the result set, which will follow the order of the first, or left, table, but which can be easily modified so that there is no difference, such as by using an ORDER BY
clause.
8 Likes
When will this be different? Left join? Then what should we do when we use left join?
And how about the information mentioned in primary key and foreign key part?
In that lesson, it metioned that we should join as below:
SELECT *
FROM (table with primary key)
JOIN (table with foreign key)
ON …
2 Likes
I believe the response by jephos249 is incorrect. There is a difference in the presentation of the data if you change the order of the JOIN.
If you write the code like this:
plays JOIN songs
They are arranged by date
But
if you write the code:
songs JOIN plays
then the data is arranged by song.
then the songs are listed one after the other showing which user played each song before moving on to the next song.
3 Likes
Good observation, thanks for sharing! Just tested as well and noticed the difference, which I think is actually a very important distinction to note. But at least there’s always the ORDER BY command, which I think is what the OP was getting at in the bit about using ORDER BY to modify how the results are presented.
Yup, good point.
which can be easily modified so that there is no difference, such as by using an ORDER BY
clause.
We could get the same result as the initial solution by writing:
SELECT
plays.user_id,
plays.play_date,
songs.title
FROM songs
JOIN plays
ON plays.song_id =
songs.id
ORDER BY plays.play_date;
It’s a good observation though and I agree it’s an important distinction.
1 Like
As far as I can figure there are the same number of user id as there are song ids, which I’d infer means each user only played one song. GROUP BY seems to confirm this.
Is there a reason that play_date is a necessary column for this query, if the same user played multiple songs on different days, I’d understand how this would be a useful field, but with one user only playing one song each… why is play_date in the query?