So, I was talking to someone with experience with SQL, and they told me that when using
GROUP BY statement, I should refer to aggregated columns by their name, rather than by number.
For example, I should use the first lot of code, not the second:
SELECT imdb_rating, COUNT(name) FROM movies GROUP BY imdb_rating;
SELECT imdb_rating, COUNT(name) FROM movies GROUP BY 1;
They said that it makes the code more readable and maintainable, and avoids issues when the order of columns is changed later.
However, using this convention breaks down when I attempt to solve this puzzle: Https://www.codecademy.com/paths/data-science/tracks/sql-intermediate/modules/dspath-sql-aggregates-projects-and-applications/projects/sql-hackernews
If I wanted to avoid using numbers in the following code using a
CASE block, what should I put the the
GROUP BY clause? If I use
source, it returns all rows as if they matched the
SELECT CASE WHEN url LIKE '%github.com%' THEN 'GitHub' WHEN url LIKE '%medium%' THEN 'Medium' WHEN url LIKE '%nytimes%' THEN 'New York Times' ELSE 'irrelevant' END AS 'source', COUNT (*) FROM hacker_news GROUP BY 1;