Can anyone explain to me what this means?

I got the exercise right and understand “how to do it”, but I do not understand its purpose, in particular the part of the question about the number of watch events falling into each “bucket”.

Can anyone kindly explain the reason for writing this code? Why do we use “count(*)” instead of count(id) itself?

https://www.codecademy.com/paths/data-science/tracks/sql-intermediate/modules/dspath-sql-aggregates/lessons/codeflix/exercises/count3

Let’s take a look at the prompt again:

[image: the exercise prompt]

It says the UX Research team wants to see the distribution of the watch durations. Specifically, they want you to break up the data into “buckets”, each bucket representing a watch duration rounded to the closest minute.

For each of these buckets they want to know the number of watch events that fall into that bucket.

So essentially they are asking for a histogram, which shows the distribution and shape of a dataset. Each “bucket” would be a bin in the histogram, and the count in each bucket would be the height of that bin. The only difference here is that it will be in table form instead of in graph form.
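
To make the bucket idea concrete, here is a rough sketch that runs the kind of query the exercise is after against a tiny in-memory SQLite table. The table name `watch_history`, the column `watch_duration_in_minutes`, the alias `num_events`, and the six sample rows are all assumptions made up for illustration; they are not taken from the exercise's real data.

```python
import sqlite3

# Hypothetical stand-in for the exercise's table; names and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE watch_history (id INTEGER, watch_duration_in_minutes REAL)"
)
conn.executemany(
    "INSERT INTO watch_history VALUES (?, ?)",
    [(1, 0.2), (2, 1.4), (3, 0.9), (4, 2.2), (5, 1.8), (6, 1.1)],
)

# Round each duration to the nearest minute (the "bucket"),
# then count how many rows land in each bucket.
rows = conn.execute(
    """
    SELECT ROUND(watch_duration_in_minutes, 0) AS duration,
           COUNT(*) AS num_events
    FROM watch_history
    GROUP BY duration
    ORDER BY duration
    """
).fetchall()

for duration, num_events in rows:
    print(duration, num_events)
```

With this sample data, one event lands in the 0-minute bucket, three in the 1-minute bucket, and two in the 2-minute bucket. Each printed row is one bar of the histogram, written as a table row.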

As for using COUNT(*), we want to count the number of rows that have that duration. COUNT(*) counts every row in the group, while COUNT(id) counts only the rows where id is not NULL. As long as id is never NULL, we get the same result either way, so if you chose to count id, that is perfectly valid.
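
If you want to see where the two counts could differ, here is a small sketch using an in-memory SQLite table. The table and column names and the three rows are made up for illustration, and the row with a missing id is a deliberate assumption to show the difference; nothing suggests the exercise's data contains such rows.

```python
import sqlite3

# Hypothetical table with one row whose id is missing (NULL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE watch_history (id INTEGER, watch_duration_in_minutes REAL)"
)
conn.executemany(
    "INSERT INTO watch_history VALUES (?, ?)",
    [(1, 1.2), (2, 0.8), (None, 1.1)],
)

# COUNT(*) counts rows; COUNT(id) skips rows where id is NULL.
count_all, count_id = conn.execute(
    "SELECT COUNT(*), COUNT(id) FROM watch_history"
).fetchone()

print(count_all, count_id)
```

Here COUNT(*) sees all three rows while COUNT(id) sees only the two with an id. When every row has an id, the two numbers match.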