I used many methods from the tools codecademy provided in order to run some tests on this exercise. What I came to conclude is:
The example that's provided is just a template, you do not copy it word by word (compare my code below to the example).
The function of
ORDER BYand how they're being used in this exercise.
The debate for the use of dep_date and the different outputs we could get. (identifying this as a bug)
There was also a debate on if the code needs to only post
dep_day_of_week to Friday, but I considered it as a "What if..." condition before the example they provided, and that it's not part of the instructions. I tried to have it only post for Friday regardless, just have a clear mind.
I've seen a few people make the mistake of using the example's
COUNT(*) function. If you read the instructions, it states it wants to know the AVERAGE distance traveled, and have real number stored in flight_distance, yet people do
COUNT(distance) AS flight_distance. The
COUNT(distance) only counts how many rows have a value if it's not
NULL. We don't want to know how many rows it's using, we need the value in the row. So, simply put
distance AS flight_distance so we can average out flight_distance in the OUTER SELECT.
AVG(a.flight_distance) AS average_distance
GROUP BY it'd be best to test yourself and see what the values do to your output. Between
FROM are the columns that have data. For this exercise, when you call
GROUP BY 2 it just arranges the data (in this column) from other identical data into groups. To understand what I mean, go to your exercise page, at the top type
SELECT dep_day_of_week FROM flights LIMIT 20; next, replace
LIMIT 20; to
GROUP BY dep_day_of_week; (or 1 if you don't want to type it out) and notice the difference. So, we now know
GROUP BY gets rid of duplicate values and organizes the column.
In the INNER SELECT, I grab the month (1), days of the week (2), date (3), and distance (4) from the flights table. Stored distance into flight_distance and removed any duplicated values from the months, days of the week, and date. To test the output, I'd copy the INNER SELECT and put it at the top (over your current code) and see what happens. If you remove 1, 2, and/or 3 from
GROUP BY you'll see how the data gets organized. A good example is
SELECT dep_month, dep_day_of_week, dep_date, distance FROM flights GROUP BY 2; you'll see it only shows the only 7 days worth of information.
ORDER BY is self explanatory, just orders the selected columns by ascending or descending order.
FROM ( condition ) a how I see it, the a is a data point, or a variable where it stores the condition and can be called in the OUTER SELECT, it's what bridges the INNER SELECT to the OUTER SELECT. You could use it like this:
SELECT fly.dep_month FROM ( condition ) fly GROUP BY 1; or whatever makes it easier for you.
This is how I'm understanding everything from what I've been taught up until now, so I might be incorrect on a lot of things. I stand by my line of coding, and I know it can be improved, but I believe what I have it what any other beginner would have, and it follows the instructions. If I'm way off and it's completely wrong, then please tell me and explain why it's completely off. Thank you!
Here is what I think the code should be as "passable". Please read my submitted bug report at the bottom of this page to understand why I think this course is bugged and that this code should have worked. Here are the two screen captures I took when trying to justify my points:
dep_date can't organize
(links are from puu.sh, a quick-link screen-capture page)
SELECT a.dep_month, a.dep_day_of_week, AVG(a.flight_distance) AS average_distance FROM ( SELECT dep_month, dep_day_of_week, dep_date, distance AS flight_distance FROM flights GROUP BY 1,2,3 ) a GROUP BY 1,2 ORDER BY 1,2;
Using dep_date as a text will throw off organizing the table, it's data should be varchar(255) like dep_month (http://puu.sh/qF4gL/2b256814c7.png).
Also, even without the use of dep_date to organize the table, I know I'm using the correct code to get the correct output, but I believe since the list isn't organized properly, it won't let me pass the course (http://puu.sh/qF5eE/b67de81cdf.png).
You can clearly see in this screen capture that the dep_month is organized, and every day with data for each month is posted (not duplicated), and the correct values on the average_distance is posted to the corresponding day/month. The only issue here is dep_day_of_week isn't organized, which I believe is the issue on why it won't let me pass this exercise.
I checked the forums for an answer, but it seems like people were stuck and gave up on this course. I find it a bit of surprising that not one person could provide a hint on the solution.