If, for legibility, it’s nice to capitalize statements (SELECT, ROUND, etc.), why would one want to replace a column reference with a number? Doesn’t this make it harder to read?
e.g.
SELECT category,
price,
AVG(downloads)
FROM fake_apps GROUP BY category, price; becomes
SELECT category,
price,
AVG(downloads)
FROM fake_apps GROUP BY 1, 2;
gafanhoto1990, I believe that the round in “GROUP BY ROUND(imdb_rating)” is not to round but to refer to the exact same name of that column.
As an alternative, you could add “AS ‘Avg.’” in the select, in which case you would use only “GROUP BY ‘Avg.’”.
But I do have a question:
How to ROUND to the tens or hundreds. I understand that rounding to the 1st decimal would be ROUND(column,1), but why I cannot use ROUND(column,-1) to round to the tens?
I want to make sure I understand this statement correctly:
SELECT category,
price,
AVG(downloads)
FROM fake_apps
GROUP BY category, price;
The part I’m unsure about is the AVG(downloads). If I delete the GROUP BY from this statement, the result is just one line:
category
price
AVG(downloads)
News
0.0
16613.8
That makes sense about the AVG(downloads) value, it’s the average of all the apps in the table - but why, of all the prices, is 0.0 displayed, and why “News” of all the categories?
With the GROUP BY, the value for AVG(downloads) changes: when grouped by 1, 2, all the price groups of all categories are shown. The AVG refers to each price group out of each category. When grouped just by 1, the value changes to reflect the average of each category.
What I’m concerned about is how the syntax works: GROUP BY actually seems to change what the argument of AVG() is. Does anyone else feel this is a weird kind of a jump?
Wouldn’t it make more sense to have price groups as arguments of AVG()?
Came over from the Code Challenge 5, where it seems quite a few of us tried to combine HAVING function with column reference and a sum function. Tought this lesson section may have an answer.
Question: Can column references only be used in GROUP BYs?
… because this does not work:
SELECT user_id, SUM(watch_duration_in_minutes) AS ‘duration’
FROM watch_history
GROUP BY 1
HAVING 2 > 400;