- Let’s start by examining the three tables.
Write queries to select the rows from each table. Because some of the tables have many rows, select just the first 10 rows from each.
What are the column names of each table?
What is the primary key for each table? Can you identify any foreign keys?
Write a query to count how many different subreddits there are.
Write a few more queries to figure out the following information:
- What user has the highest
- What post has the highest
- What are the top 5 subreddits with the highest `subscriber_count.
Now let’s join the data from the different tables to find out some more information.
Use a LEFT JOIN with the users and posts tables to find out how many posts each user has made. Have the users table as the left table and order the data by the number of posts in descending order.
Over time, posts may be removed and users might delete their accounts.
We only want to see existing posts where the users are still active, so use an INNER JOIN to write a query to get these posts. Have the posts table as the left table.
- Some new posts have been added to Reddit!
Stack the new posts2 table under the existing posts table to see them.
- Now you need to find out which subreddits have the most popular posts. We’ll say that a post is popular if it has a score of at least 5000. We’ll do this using a WITH and a JOIN.
First, you’ll need to create the temporary table that we’ll nest in the WITH clause by writing a query to select all the posts that have a score of at least 5000.
Next, place the previous query within a WITH clause, and alias this table as popular_posts.
Finally, utilize an INNER JOIN to join this table with the subreddits table, with subreddits as the left table. Select the subreddit name, the title and score of each post, and order the results by each popular post’s score in descending order.
- Next, you need to find out the highest scoring post for each subreddit.
Do this by using an INNER JOIN, with posts as the left table.
- Finally, you need to write a query to calculate the average score of all the posts for each subreddit.
Consider utilizing a JOIN, AVG, and GROUP BY to accomplish this, with the subreddits table as the left table.