How are the frequency values calculated for the histogram?

Question

In the context of this exercise, how are the frequency values calculated for the histogram?

Answer

The histogram you made in this exercise is normalized, meaning that the total area of its bars adds up to 1. With this in mind, we can work out how the frequencies, or y-axis values, were calculated as follows.

First, we need to know the width, or range of values, of each individual “bin” of the histogram. The bin width is the maximum value of the dataset minus the minimum value, divided by the number of bins we chose:

bin_width = (max_value - min_value) / number_of_bins
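As a rough illustration of the bin width formula, here is a minimal Python sketch. The dataset and the choice of 4 bins are made up for this example and do not come from the exercise.

```python
# Hypothetical sample data and bin count, chosen only for illustration.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
number_of_bins = 4

min_value = min(data)
max_value = max(data)

# Width of each bin: the full range of the data split into equal parts.
bin_width = (max_value - min_value) / number_of_bins

print(bin_width)  # 2.0
```

With these values the range is 9 - 1 = 8, so each of the 4 bins covers 2 units.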

Next, we need the total area of all the bins to sum to 1, since this is a normalized histogram. The area of each bin is its width times its height, so we introduce a “ratio” that scales each bin’s count into a height such that the areas add up to 1. We will solve for this ratio later on.

The equation is as follows, where count_1, count_2, ... are the numbers of values that fall in each bin.

1 = (bin_width * ratio * count_1) + (bin_width * ratio * count_2) + ...

We can simplify this to

1 = bin_width * ratio * (count_1 + count_2 + ...)

and then shorten it to this,

1 = bin_width * ratio * N

where N is the total number of elements in the dataset.

Finally, we solve this equation for the ratio.

1 = bin_width * ratio * N

1 / bin_width = ratio * N

1 / (bin_width * N) = ratio
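To make the ratio concrete, here is a small Python sketch. The values for bin_width and N are assumed for illustration (a dataset of 10 values with a bin width of 2.0), not taken from the exercise.

```python
import math

# Assumed values for illustration only.
bin_width = 2.0
N = 10  # total number of elements in the dataset

# Solve 1 = bin_width * ratio * N for ratio.
ratio = 1 / (bin_width * N)
print(ratio)  # 0.05

# Plugging the ratio back in should give a total area of 1.
total_area = bin_width * ratio * N
print(math.isclose(total_area, 1.0))  # True
```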

Now that we have the ratio, we can obtain the frequency for each bin. To get the frequency, or y-axis value, of a bin, multiply the number of values that fall in that bin (items_in_bin, the same quantity as count_1, count_2, ... above) by the ratio:

frequency = items_in_bin * ratio
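Here is a minimal worked example in plain Python that puts all the steps together. The dataset and bin count are hypothetical, and the binning rule (each bin includes its left edge, and the last bin also includes the maximum value) is an assumption of this sketch rather than something stated in the exercise.

```python
# Hypothetical sample data and bin count, chosen only for illustration.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
number_of_bins = 4

min_value, max_value = min(data), max(data)
bin_width = (max_value - min_value) / number_of_bins
N = len(data)
ratio = 1 / (bin_width * N)

# Count how many values fall in each bin (this is items_in_bin for each bin).
counts = [0] * number_of_bins
for value in data:
    index = int((value - min_value) / bin_width)
    # Assumption: the maximum value is placed in the last bin.
    index = min(index, number_of_bins - 1)
    counts[index] += 1

# frequency = items_in_bin * ratio, applied to every bin.
frequencies = [items_in_bin * ratio for items_in_bin in counts]

print(counts)                              # [3, 5, 1, 1]
print([round(f, 2) for f in frequencies])  # [0.15, 0.25, 0.05, 0.05]

# The bar areas (width * height) add up to 1, up to floating-point rounding.
total_area = sum(bin_width * f for f in frequencies)
print(round(total_area, 2))                # 1.0
```

For the first bin, for instance, items_in_bin is 3 (the values 1, 2, and 2), so its frequency is 3 * 0.05 = 0.15.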

Hi team,

I have a small question regarding the frequency equation stated above. What does items_in_bin refer to? When I read this post from top to bottom, I get lost when I reach this particular equation.

An explanation with a simple example would be very much appreciated.

Thanks,
Jimmy

We will need a link to the exercise for context. Please post it in a reply.

Hi there,

I don’t think this question is linked to any exercise. This is just a general question about finding the frequency value (y-axis) on histogram charts.

I guess one way to find out is to track down that exercise and see what it’s about. Could answer your question just in the doing.

Hi there,

The link is actually attached to the original post (first line) by jephos249.

I’m not the one that needs to follow up.

Well, I guess I will just have to wait for other Codecademy moderators or users to answer my question.

You’ve already said it was a general question but related it directly to a course. How is anyone who hasn’t taken the course going to be able to answer your question, especially if you haven’t taken the course, either? This is a non-question.

What does items_in_bin specify in the formula frequency = items_in_bin * ratio?
Can anybody explain the formula step by step with an example, if possible?

Thanks in advance!!