What if the minimum and maximum values are the same?

Question

This exercise explained how we could use min-max normalization so that different features are weighed similarly. What happens if the minimum and maximum for some feature are the same?

Answer

It is unlikely, but possible, that the minimum and maximum values for some feature are the same. In that case, the normalization formula below would fail, because the denominator (maximum-minimum) is zero and we would be dividing by zero.

(value-minimum) / (maximum-minimum)

To account for this possibility, you can skip the calculation for that feature and instead set all of its values to the same constant, say 0 or 1, for each data point. Every data point then has an identical value for that feature, so it is weighed the same everywhere and does not affect comparisons between points.
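
As a minimal sketch of that guard, assuming each feature is a plain Python list of numbers: the function name min_max_normalize and the fallback parameter are hypothetical, chosen only for illustration. It applies the formula above and falls back to a constant when the minimum and maximum are equal.

```python
def min_max_normalize(values, fallback=0.0):
    """Scale values to the range [0, 1] with min-max normalization.

    If every value is identical (maximum == minimum), the formula would
    divide by zero, so each value is set to `fallback` instead.
    Assumes `values` is a non-empty list of numbers.
    """
    minimum = min(values)
    maximum = max(values)
    if maximum == minimum:
        # Constant feature: skip the calculation entirely.
        return [fallback for _ in values]
    return [(v - minimum) / (maximum - minimum) for v in values]


print(min_max_normalize([2, 4, 6]))  # [0.0, 0.5, 1.0]
print(min_max_normalize([2, 2, 2]))  # [0.0, 0.0, 0.0]
```

Whether the fallback is 0 or 1 should not matter for comparisons between points, since every point gets the same value.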

However, when all values of a feature are the same, that feature does not provide any useful information for telling data points apart. So, we might also consider excluding that feature entirely. For example, say that we had a dataset of animal physical features and that every animal in our dataset had two legs. Since the number of legs is identical for every animal, we might exclude that feature from our calculations.
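
Excluding constant features could be sketched as follows. This assumes the dataset is a list of rows, each a list of numbers of the same length. The function name drop_constant_features and the animal values are hypothetical and made up for illustration.

```python
def drop_constant_features(rows):
    """Remove every feature (column) whose minimum equals its maximum.

    Returns the filtered rows and the indices of the features that were kept.
    Assumes `rows` is a non-empty list of equal-length lists.
    """
    columns = list(zip(*rows))  # transpose rows into columns
    kept = [i for i, col in enumerate(columns) if min(col) != max(col)]
    filtered = [[row[i] for i in kept] for row in rows]
    return filtered, kept


# Hypothetical animals: [legs, weight_kg, height_cm]
animals = [
    [2, 3.5, 40],
    [2, 60.0, 170],
    [2, 1.2, 25],
]
filtered, kept = drop_constant_features(animals)
print(kept)      # [1, 2]
print(filtered)  # [[3.5, 40], [60.0, 170], [1.2, 25]]
```

The remaining features can then be normalized without any risk of dividing by zero.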


Let’s say we have a dataset with 4 different parameters as below.

parameter 1: ranges from 0 to 1,000,000
parameter 2: ranges from -5000 to 5000
parameter 3: either 0 or 1 (binary)
parameter 4: ranges from 0 to 1 (continuous, not binary)

I see that normalizing them all to one common scale (i.e. from 0 to 1) is the way to go. However, it doesn't preserve the magnitude of the original values. Parameters 3 and 4 are already within that scale, while the other two parameters are scaled down to match them. Doesn't that distort the meaning of the data, especially when comparing parameter 1 to parameter 4?
Does it matter at all to somehow retain the magnitude of the values within a parameter (e.g. parameter 1) while normalizing?

Or in other words, how should the scale (and its range) be selected to properly represent the parameters? Does it even matter?
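
To make the concern concrete, here is a small sketch that compares two data points before and after scaling. It assumes a distance-based model that uses Euclidean distance, which is an assumption and not stated in this thread. The two points and the names RANGES, scale, and euclidean are hypothetical. Each parameter is scaled using the ranges listed above.

```python
import math

# (minimum, maximum) for parameters 1 to 4, taken from the ranges listed above.
RANGES = [(0, 1_000_000), (-5000, 5000), (0, 1), (0, 1)]


def scale(point):
    """Min-max scale each parameter of a point using its known range."""
    return [(v - lo) / (hi - lo) for v, (lo, hi) in zip(point, RANGES)]


def euclidean(p, q):
    """Euclidean distance between two points of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


# Two made-up points: a small relative change in parameter 1,
# but a flipped parameter 3 and a large change in parameter 4.
a = [100_000, 0, 0, 0.2]
b = [100_500, 0, 1, 0.9]

raw = euclidean(a, b)                   # dominated by parameter 1
scaled = euclidean(scale(a), scale(b))  # dominated by parameters 3 and 4

print(round(raw, 2), round(scaled, 2))
```

In the raw distance, the difference of 500 in parameter 1 swamps everything else, even though it is only 0.05% of that parameter's range. After scaling, each parameter contributes in proportion to how far the points differ relative to that parameter's own range.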