Multiple Tables with Reddit Q 6

https://www.codecademy.com/paths/analyze-data-with-sql/tracks/analyze-data-sql-join-data/modules/analyze-data-sql-practice-joins/projects/sql-multiple-tables-with-reddit

Hello - I am hoping someone can assist with with Q6. The question being asked if to see only those existing posts where users are still active and to write a query with Inner join and have posts be the left table.

After looking at the question, and looking at the tables, I utilized posts.created_date to determine who is still active - I noticed that there were Null values in these dates and maybe made the incorrect assumption that meant that the user was no longer active.

I also wanted to see what the latest date was, so I used SELECT MAX(created_date) FROM posts; and found 2020-06-16 was the latest date. Based on those two assumptions, I wrote the following query:

SELECT posts.title, strftime(’%Y’, posts.created_date) AS ‘active’ FROM posts

JOIN users

ON posts.user_id = users.id

WHERE posts.created_date IS NOT NULL

ORDER BY posts.created_date DESC;

I came back with titles and dates for any user still posting which I thought was correct, but when I checked the hint, I was really confused. The hint states:

SELECT * FROM ____

INNER JOIN users

ON posts.____ = users._____;

When I filled it in like this, I just got a list of user IDs that are numbered from 1 - 200 in ASC order:

SELECT * FROM posts

INNER JOIN users

ON posts.user_id = users.id;

That didn’t seem right to me, so I am very confused.

Can someone tell me why my query is wrong and why the hint query is right? Thanks!

It’s worth having a look at How do I format code in my posts? which helps keep the nice formatting when posting code on the forums. Makes it much nicer for anyone who reads it :slightly_smiling_face:.

I’m not sure about the right answer tbh, there are a number of queries throughout the course where if you dig around in the data enough you realise that you queries are perhaps catching/removing things they shouldn’t be. I wouldn’t spend too much time worrying about this one.

Due to the nature of the ‘INNER JOIN’ in the hinted solution you shouldn’t actually have all 200 ids (six of them would be removed).

I think your answer should actually be very similar to the solution (the only real difference is the SELECT * vs. your specific values). You might want to compare the two, but I believe they filter out the same things.

Like you I noticed NULLs in a few columns but the ones that stood out to me were those in the users table. I actually wrote my solution much like the example but with an additional clause removing the NULLs in the users too. For reference the INNER JOIN alone reduces the full 200 posts to 194 posts; with the addition of filtering NULLs from users this becomes 188 posts.

This matched up nicely to my interpretation of the instructions which was to remove both deleted posts and existing posts where the user was no longer active. That is, the instructions about-

EXISTING posts where the USERS are still active

But I don’t know if that was the correct interpretation or not, I’d worry more about whether you understood and can implement a useful inner join (a.k.a. have you learnt something from the lessons) than if your query matches exactly what cc expected.

Long story short, I think your filter would work pretty much the same as the hinted solution, but SELECT * is different to your specific selections and ordering steps.

Thank you so much, I appreciate your feedback. I personally don’t like Joins, especially LEFT joins, I always seem to have a problem trying to figure out which should be considered the left table.

But that, I think, is just me. :slight_smile:

1 Like

This topic was automatically closed 41 days after the last reply. New replies are no longer allowed.