What does it mean to normalize data?



In the context of this exercise, what does it mean to normalize the data?


In the context of this exercise, normalizing data means rescaling it to a common scale so that we can compare two datasets whose relationship was not obvious at first. Once both datasets are normalized and graphed, their differences or similarities are easier to see.

One way to normalize values to a new scale is to calculate a ratio:
ratio = total_area / (sum of all values)

Multiplying each original value by this ratio gives the scaled values, which then sum to total_area:
scaled_value = ratio * value
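
As a rough illustration of the ratio calculation, here is a minimal sketch in plain Python. The function name normalize and the sample counts are made up for this example and are not part of the exercise.

```python
def normalize(values, total_area=1.0):
    """Scale values so that they sum to total_area (hypothetical helper)."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize values that sum to zero")
    ratio = total_area / total
    return [ratio * value for value in values]


# Made-up bin counts from two datasets of very different sizes.
counts_a = [1, 2, 1]
counts_b = [8, 16, 8]

# After scaling, both datasets are on the same scale and can be compared.
scaled_a = normalize(counts_a)
scaled_b = normalize(counts_b)
print(scaled_a)
print(scaled_b)
```

Both lists have the same shape once scaled, even though the raw counts differ by a factor of eight, which is the kind of comparison normalizing makes possible.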

When we use normed=True, total_area is 1 by default. (In newer versions of Matplotlib, normed has been replaced by density.)

If we want the histogram to have a different total shaded area, we can use a value other than 1 for total_area.
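
One detail worth sketching: for a histogram, the shaded area of a bar is its height times its bin width, so a total area of 1 means the heights are divided by the bin width as well as by the total count. When every bin has width 1, this matches the ratio formula above. The snippet below is a plain-Python approximation of that idea, not Matplotlib's own code; density_heights, the counts, and the bin width are assumptions chosen for illustration.

```python
def density_heights(counts, bin_width):
    """Return bar heights whose total shaded area is 1 (hypothetical helper)."""
    n = sum(counts)
    if n == 0 or bin_width <= 0:
        raise ValueError("need a nonzero total count and a positive bin width")
    return [count / (n * bin_width) for count in counts]


# Made-up bin counts, with every bin 2 units wide.
counts = [1, 2, 1]
bin_width = 2.0

heights = density_heights(counts, bin_width)

# Area of each bar is height * width; summed over all bars it comes to 1.
area = sum(height * bin_width for height in heights)
print(heights, area)
```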