# FAQ: Code Challenge: Aggregate Functions - Code Challenge 2

This community-built FAQ covers the “Code Challenge 2” exercise from the lesson “Code Challenge: Aggregate Functions”.

Paths and Courses
This exercise can be found in the following Codecademy content:

## Join the Discussion. Help a fellow learner on their journey.

Agree with a comment or answer? Like () to up-vote the contribution!

Found a bug? Report it!

Have a question about your account or billing? Reach out to our customer support team!

None of the above? Find out where to ask other questions here!

What is the count variable and why is it a 2 in this case?

4 Likes

COUNT(*) AS ‘count’ is the second column in the output table.
We want to sort base on popularity, so we sort based on the count, descending order.
To do this, we use ORDER BY. And now we need to reference the second column, which has all the count.
We can reference it with the number 2 - second column. Try to replace this with 1, and you will see it is sorted alphabetically because 1 refers to the first output column, or the names.
Also try to replace 2 with ‘count’ (without the quotes) - it still works. This is because the column is named ‘count’. Remember the first line, AS ‘count’. This is the name of the column.
Hope this helps.

Edit: It is called a column reference. Read more online.

3 Likes

As long as each ‘first_name’ belongs to a real user and at the same time each user in the table ‘users’ has a unique id, why in the solution of the exercise we don’t simply write :

COUNT (id) FROM users instead of COUNT(*) AS ‘count’ FROM users ?

I suppose it would be simplier and more precise. Moreover I think that actually, this is the usefullness of the ‘id’ parameter , to count the number of users, no matter what the statement is.

1 Like

Hi, when we are told:
What are the most popular first names on Codeflix?

Use `COUNT()` , `GROUP BY` , and `ORDER BY` to create a list of first names and occurrences within the `users` table.

Order the data so that the most popular names are displayed first.

I did:

``````SELECT COUNT(first_name), first_name
FROM users
GROUP BY first_name
ORDER BY first_name DESC;
``````

``````SELECT first_name, COUNT(*) AS 'count'
FROM users
GROUP BY first_name
ORDER BY 2 DESC;
``````

Is there something wrong with what I did?

there seems to be a difference:

https://stackoverflow.com/questions/3003457/count-vs-countcolumn-name-which-is-more-correct

but i don’t think you have NULL values.

3 Likes

As discussed in this Stackoverflow thread on “In SQL, what’s the difference between count(column) and count(*)?”, `COUNT(*)` will include rows with `NULL` values. Just in case `id` contains `NULL` values (it shouldn’t, but it could happen due to the mistake in data entries), using `COUNT(*)` is safer. That’s my humble opinion.

How so? I think the business logic is going to determine which count operation to use.

What I mean is that `COUNT(*)` always gives us the total number of rows in a table (correct me if I’m wrong) while `COUNT(id)` may not (where `id` is supposed to be the primary key).

Whenever we get a new dataset, we should first check if both `COUNT(*)` and `COUNT(id)` give the same number of rows. If not, `id` is not really the primary key. Check the data codebook and/or assign unique id numbers to the rows where `id` is `NULL`.

Let me know if I miss something here. I’m eager to learn from a different viewpoint.

lets say we want the number of people who own a car, so lets say we have column named `car`, which is nullable, we can do:

``````select count(car) from persons;
``````

this way, we get how many people own a car.

my point, we can do more with count then just `*` or `id`. That is what you are missing.

1 Like

1 Like

and then we haven’t covered relations yet.

this answered my question perfectly. Thanks. so it is Order BY the second column in the new table not the original table. Thanks again

What’s the difference between using DISTINCT and GROUP BY?

There certainly are differences:

https://stackoverflow.com/questions/164319/is-there-any-difference-between-group-by-and-distinct

Why is it incorrect if we count the first name instead of all columns?

This has already been covered in this topic? Like here:

and here: