# Why do we need to use the alias "a"?

Hi, I don’t understand the meaning of ‘a.’ in the query

why I have to add this letter to complete the exercize?
“a.dep_month”
“a.dep_day_of_week”
“a.flight_distance”
“a
GROUP BY 1,2
ORDER BY 1,2”

# Non-Correlated Subqueries III

SELECT a.dep_month,
a.dep_day_of_week,
AVG(a.flight_distance) AS average_distance
FROM (
SELECT dep_month,
dep_day_of_week,
dep_date,
sum(distance) AS flight_distance
FROM flights
GROUP BY 1,2,3
) a
GROUP BY 1,2
ORDER BY 1,2;

20 Likes

Please excuse all the edits (I’m a SQL newb) and I could be wrong but I believe the inner query…

SELECT dep_month,
dep_day_of_week,
dep_date,
SUM(distance) AS flight_distance
FROM flights
GROUP BY 1,2,3

…is being set to a variable of “a”. In turn, the outer queries runs a query on the “a” variable or table.

The inner query is used to create an organized table sorted by month, day of the week, and then I believe for cosmetic reasons(?), month date. The ‘flight_distance’ is a sum of all miles flown across all flights for that day in the month.

The outer query queries this cleaned up and organized temporary table ‘a’ and groups the final output by departure month, then all of the Mondays, Tuesdays, Wednesday’s, etc. for that month and then the average flight distances for each of those sets of days of the week.

In the end, we have the average total distance flown by day of week in each month.

To understand what was happening, it was helpful to run the inner query in the example first and see the table results and then add the outer query around it and again, see the results.

15 Likes

Thanks for this. Sheesh, huge jump from Non-Correlated Subqueries 2.

3 Likes

Can someone explain why the ‘a’ is there? I don’t understand why we need it? And can someone explain what the ‘a’ does?

2 Likes

Neutechnica, thanks for the info. I’m also a SQL newb to this, and struggled to understand this setup until I started looking through forums.

There doesn’t seem to be enough lessons covering subqueries or nested queries (though, please let me know if there’s a section I’m overlooking).

danyal1414 - The “a” is being used as the name of the copied table version of “flights”. In the outer query you have:
SELECT a.dep_month,
a.dep_day_of_week,
AVG(a.flight_count) AS average_flights

The “a” is the name of the table and after the period (.) is the name of the column

What I’m still wondering is what is the difference between adding that “a” and using “as xxxx” to, I suppose, alias the entire query. So instead of

SELECT a.dep_month,
a.dep_day_of_week,
AVG(a.flight_distance) AS average_distance
FROM ( … ) a

it would be

SELECT inner_query.dep_month,
inner_query.dep_day_of_week,
AVG(inner_query.flight_distance) AS average_distance
FROM ( … ) as inner_query

Is it just shorthand? Is the “as” superfluous in this case? It seems to work either way. Maybe I missed it, but I feel like this “a” notation was kind of sprung on us in the lesson and not well explained.

1 Like

I agree this was a pretty huge jump. I think this part of the lesson needs to be revised. Unless I am mistaken I don’t think setting variables has ever been discussed in any of the previous SQL lessons (excuse me if I am wrong or have forgotten). I also did not understand the “a.” I only understood this once it was explained here in the forum. I could only recognize this as setting a variable or “temporary table” from prior knowledge of this from programming languages that would set variables.

I was also stumped in how the outter query was able to AVG(a.flight_distance) being that the flight table did not have a “flight_distance” only “distance”. I later noticed it was set AS flight_distance in the subquery. It seems as if everything is set first in the subquery then calculated above in the outter query. Just like when we set the variable “a”.

I think it would be awesome if there could be links to previous lessons that pertain to what is being taught in the advanced lessons. It would help solidify what we have already learned if it has been forgotten and lead us to faster eureka! moments.

7 Likes

I agree. The FROM( … ) a felt totally sprung on us. I had the same thought that we could have temporarily named this AS inner_query. I only understood this notion as setting it to a temporary table because I have some minor experience programming and in programming we sometimes set variables or “temporary” numbers to a letter. In this case we set a temporary table to a letter.

For the solution, the inner query has “Sum(distance)”. However, to find the average distance travelled for a single day, I think using “Sum(distance)” is a mistake because then the outer query will only have 1 distance to average, which is the total distance travelled that day. Shouldn’t we take out the “sum” and simply puts “distance”–in this way we can average the distances in the outer query?

I thought the subquery could be run as it’s own independent query, but when i entered it and tried to run it as its own, just to mentally put the peices together, it kept giving me a syntax error? What’s happening?

Hi guys, as a fellow newb, I spent some time to figure this out.
Here’s my 2 cents.

1- The reason we can’t AVG in one big query is SQL doesn’t accept AVG(COUNT) or AVG(SUM).

Try it, you’ll get a syntax error (wrong use of aggregate function AVG).

Hence to find the average of Total we must run 2 queries:
1st- SUB query for the COUNT() or SUM() – which is basically treated as a table within a table (if that makes it easier to understand).
2nd- Outer query for the AVG (table of SUM()) or AVG (table of COUNT()) and you find your correct answer.

2- Why do we need to use an alias (a. or x. etc) I tried running the same command without an alias, it worked fine.
Tried running a word instead of a letter (alias. instead of a.) for some reason it didn’t work, I suppose there’s a function for alias.

Try it out and let me know if this works for you.

Ali

1 Like

kwanhui1990: the exercise isn’t very well explained, but:

You’re right, for any given day, the inner query only produces a single value: total distance travelled that day. So the first saturday in February will have 1 value. The second Saturday in February will have 1 value. The third Saturday will have 1, and the fourth Saturday will have 1. But all together, that makes 4 values. 4 different values for total distance, across 4 different Saturdays in February.

The outer query would then take the average of those 4 values. The idea is to be able to say that “On an average Saturday in February 2016, the total distance flown by all flights was…”

I don’t know why you’d want to quote a statistic like that, but that’s what they’ve asked for. I think that’s the real problem with this exercise: it’s hard to get your head around what they’re asking for because what they’re asking for is pretty weird.