WITH popular_posts AS (
SELECT *
FROM posts
WHERE score >= 5000
)
SELECT subreddits.name, popular_posts.title, popular_posts.score
FROM subreddits
INNER JOIN popular_posts
ON ??? = ???
ORDER BY popular_posts.score DESC;
The general idea with a properly normalised database is that every table, usually except one should have a foreign key that allows it to link up correctly the the base tables primary key. This means that we can just join on these ID’s and guarantee that we’re getting the right info into the right place. So if we do a quick inspection of the two tables you are trying to join here, subreddits and posts, we see the column names as follows:
subreddits
id
name
created_date
subscriber_count
posts
id
title
user_id
subreddit_id
score
created_date
We can see that in posts, we have a column named subreddit_id. This is the foreign key we require for linking to the subreddits table, and the idea is that if there’s a post in the posts table, and it has a subreddit_id of 7, then this tells us that the subreddit with id of 7 in the subreddits table is where that post came from. So when performing joins, we want to find these common columns and join on those. Therefore our join query would be:
FROM subreddits
INNER JOIN popular_posts
ON subreddits.id = popular_posts.subreddit_id
This tells the SQL query builder if it finds a number 7 in subreddit_id, this corresponds to the number 7 in subreddits.id. And thus the join should perform correctly. This is the first 5 results of that query:
name
title
score
aww
Picture of a kitten
149176
programming
Running DOOM on a toaster
143728
news
Promising advances made toward cure for cancer
136532
programming
Codecademy releases their new database courses
133728
IAmA
I am Gill Bates, founder of Macrohard. Ask me Anything.
96731
We can see that “Picture of a kitten” has matched to “aww” and “Running DOOM on a toaster” has matched to “programming”, so this gives us a good idea our join has performed correctly. It’s not always as easy as that in the real world (I work with medical data and it’s really not that simple without a microbiology background) but in this dataset it would be pretty obvious if there was a mismatch.
Hopefully that explains the join and helps you understand it a bit better, any other questions let me know!
I’d be careful with table names here, is it popular_post or popular_posts (check each instance) and same for ordering (is the posts table actually included in this part of the query or just a temporary table with a different name?).
If you’re posting code to the forums it’s well worth viewing- How do I format code in my posts?. It’ll stop the forums marking up your code (which can mess up alignment and remove certain characters).