Hi and welcome to the forums!
The general idea with a properly normalised database is that every table, usually except one should have a foreign key that allows it to link up correctly the the base tables primary key. This means that we can just join on these ID’s and guarantee that we’re getting the right info into the right place. So if we do a quick inspection of the two tables you are trying to join here,
posts, we see the column names as follows:
We can see that in posts, we have a column named
subreddit_id. This is the foreign key we require for linking to the subreddits table, and the idea is that if there’s a post in the posts table, and it has a subreddit_id of 7, then this tells us that the subreddit with id of 7 in the subreddits table is where that post came from. So when performing joins, we want to find these common columns and join on those. Therefore our join query would be:
INNER JOIN popular_posts
ON subreddits.id = popular_posts.subreddit_id
This tells the SQL query builder if it finds a number 7 in subreddit_id, this corresponds to the number 7 in
subreddits.id. And thus the join should perform correctly. This is the first 5 results of that query:
||Picture of a kitten
||Running DOOM on a toaster
||Promising advances made toward cure for cancer
||Codecademy releases their new database courses
||I am Gill Bates, founder of Macrohard. Ask me Anything.
We can see that “Picture of a kitten” has matched to “aww” and “Running DOOM on a toaster” has matched to “programming”, so this gives us a good idea our join has performed correctly. It’s not always as easy as that in the real world (I work with medical data and it’s really not that simple without a microbiology background) but in this dataset it would be pretty obvious if there was a mismatch.
Hopefully that explains the join and helps you understand it a bit better, any other questions let me know!