Question
In this exercise, each dimension contributes equally to the classification of a point. Can we have certain dimensions have more significance, or contribute more, than others?
Answer
Yes, this is possible. One way is to use a variation of K-Nearest Neighbors known as Weighted K-Nearest Neighbors. The idea is that when calculating the distance to each point, we multiply each dimension's value by its own constant (a weight), so that dimensions with larger constants contribute more to the result.
This post will not go into depth on the methods of calculation, which you can implement on your own, but the following example shows how the idea works.
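As a rough sketch of what such an implementation could look like, the Python below scales each dimension's difference by a constant before computing a Euclidean distance, then takes a plain majority vote among the k nearest points. The function names, the sample points, and the weight values are all made up for illustration, and scaling the differences this way is only one possible reading of "constants that multiply with each dimension's value".

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance after scaling each dimension's difference by its weight."""
    return math.sqrt(sum((w * (x - y)) ** 2 for x, y, w in zip(a, b, weights)))


def classify(point, dataset, labels, weights, k=3):
    """Majority vote among the k points closest under the weighted distance."""
    ranked = sorted(
        zip(dataset, labels),
        key=lambda pair: weighted_distance(point, pair[0], weights),
    )
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-dimensional training data (illustrative values only).
points = [(1, 3), (2, 3), (4, 0), (4, 1), (5, 1)]
labels = ["green", "green", "yellow", "yellow", "yellow"]

# Equal weights: both dimensions contribute equally.
print(classify((0, 0), points, labels, weights=(1.0, 1.0)))  # green

# The second dimension now counts four times as much as the first.
print(classify((0, 0), points, labels, weights=(0.5, 2.0)))  # yellow
```

With equal weights the two green points are closest, but once the second dimension is weighted more heavily, the points that differ mostly in that dimension are pushed farther away and the result changes.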
In this example, we have k = 3, so that a new point will be classified as whatever class the majority of its 3 nearest neighbors belong to. This image shows the original K-Nearest Neighbors classification without weights. The point is classified as green because its 3 nearest neighbors are 2 diamonds and 1 star.
By introducing weights in the Weighted K-Nearest Neighbors variation, we could end up with a different result. In the following graph, each point has a weight calculated from the importance of each dimension, so the dimensions no longer contribute equally. Although the 3 nearest neighbors are 2 diamonds and 1 star, the greatest weight is toward the yellow classification, so the point is classified as yellow.
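The vote described above could be sketched in Python as follows. This assumes each of the 3 nearest neighbors already has a weight and that the class with the largest summed weight wins; the function name and the weight values are hypothetical, chosen only to reproduce the 2 diamonds versus 1 star situation.

```python
from collections import defaultdict


def weighted_vote(neighbors):
    """Return the label whose neighbors have the largest summed weight.

    `neighbors` is a list of (label, weight) pairs for the k nearest points.
    """
    totals = defaultdict(float)
    for label, weight in neighbors:
        totals[label] += weight
    return max(totals, key=totals.get)


# Two green diamonds with small weights, one yellow star with a large weight
# (illustrative numbers, not taken from the graph).
nearest = [("green", 0.2), ("green", 0.3), ("yellow", 0.7)]
print(weighted_vote(nearest))  # yellow

# With equal weights this reduces to the ordinary majority vote.
print(weighted_vote([("green", 1.0), ("green", 1.0), ("yellow", 1.0)]))  # green
```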