Non-Correlated Subqueries III - Why do we need to GROUP BY 1,2,3 instead of just GROUP BY 3?


Hi, I have a question for the “GROUP BY” clause inside subquery a. My understanding is that the subquery is obtaining the total number of flights on each date, which will be used to calculate the average in the outer query. If so, shouldn’t “GROUP BY 3” be sufficient? Why do we have to GROUP BY 1,2,3?

SELECT a.dep_month,
       AVG(a.flight_count) AS average_flights
  FROM (
        SELECT dep_month,
               COUNT(*) AS flight_count
          FROM flights
         **GROUP BY 1,2,3**
       ) a


This is actually a really bad thing to do IMHO, and it’s not supported in most other database platforms.

The reasons people do it:

they’re lazy** - I don’t know why people think their productivity is improved by writing terse code rather than typing for an extra 40 milliseconds to get much more literal code.

The reasons it’s bad:

it’s not self-documenting** - someone is going to have to go parse the SELECT list to figure out the grouping. It would actually be a little more clear in SQL Server, which doesn’t support cowboy who-knows-what-will-happen grouping like MySQL does.

t’s brittle** - someone comes in and changes the SELECT list because the business users wanted a different report output, and now your output is a mess. If you had used column names in the GROUP BY, order in the SELECT list would be irrelevant.


This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.