This community-built FAQ covers the “Code Challenge 2” exercise from the lesson “Code Challenge: Aggregate Functions”.
Paths and Courses
This exercise can be found in the following Codecademy content:
FAQs on the exercise Code Challenge 2
Join the Discussion. Help a fellow learner on their journey.
Ask or answer a question about this exercise by clicking reply () below!
Agree with a comment or answer? Like () to up-vote the contribution!
Need broader help or resources? Head here.
Looking for motivation to keep learning? Join our wider discussions.
Learn more about how to use this guide.
Found a bug? Report it!
Have a question about your account or billing? Reach out to our customer support team!
None of the above? Find out where to ask other questions here!
What is the count variable and why is it a 2 in this case?
COUNT(*) AS ‘count’ is the second column in the output table.
We want to sort base on popularity, so we sort based on the count, descending order.
To do this, we use ORDER BY. And now we need to reference the second column, which has all the count.
We can reference it with the number 2 - second column. Try to replace this with 1, and you will see it is sorted alphabetically because 1 refers to the first output column, or the names.
Also try to replace 2 with ‘count’ (without the quotes) - it still works. This is because the column is named ‘count’. Remember the first line, AS ‘count’. This is the name of the column.
Hope this helps.
Edit: It is called a column reference. Read more online.
As long as each ‘first_name’ belongs to a real user and at the same time each user in the table ‘users’ has a unique id, why in the solution of the exercise we don’t simply write :
COUNT (id) FROM users instead of COUNT(*) AS ‘count’ FROM users ?
I suppose it would be simplier and more precise. Moreover I think that actually, this is the usefullness of the ‘id’ parameter , to count the number of users, no matter what the statement is.
Hi, when we are told:
What are the most popular first names on Codeflix?
GROUP BY , and
ORDER BY to create a list of first names and occurrences within the
Order the data so that the most popular names are displayed first.
SELECT COUNT(first_name), first_name
GROUP BY first_name
ORDER BY first_name DESC;
But the answer is:
SELECT first_name, COUNT(*) AS 'count'
GROUP BY first_name
ORDER BY 2 DESC;
Is there something wrong with what I did?
As discussed in this Stackoverflow thread on “In SQL, what’s the difference between count(column) and count(*)?”,
COUNT(*) will include rows with
NULL values. Just in case
NULL values (it shouldn’t, but it could happen due to the mistake in data entries), using
COUNT(*) is safer. That’s my humble opinion.
How so? I think the business logic is going to determine which count operation to use.
What I mean is that
COUNT(*) always gives us the total number of rows in a table (correct me if I’m wrong) while
COUNT(id) may not (where
id is supposed to be the primary key).
Whenever we get a new dataset, we should first check if both
COUNT(id) give the same number of rows. If not,
id is not really the primary key. Check the data codebook and/or assign unique id numbers to the rows where
Let me know if I miss something here. I’m eager to learn from a different viewpoint.
lets say we want the number of people who own a car, so lets say we have column named
car, which is nullable, we can do:
select count(car) from persons;
this way, we get how many people own a car.
my point, we can do more with count then just
id. That is what you are missing.
That’s true. Thank you for your answer!
and then we haven’t covered relations yet.
this answered my question perfectly. Thanks. so it is Order BY the second column in the new table not the original table. Thanks again
What’s the difference between using DISTINCT and GROUP BY?
Why is it incorrect if we count the first name instead of all columns?
This has already been covered in this topic? Like here: