KNearestNeighbor Breast Cancer Classifier

Hi there

Can anyone clarify a few things for me?

  1. How do I arrive at the optimum ‘random state’? The attached images below show 3 different random states used when plotting which ‘k’ gives the highest prediction accuracy. The peaks of the plots don’t fall at a consistent ‘k’ from one random state to the next. This makes it difficult to decide which ‘k’ is optimal, as the algorithm performs differently for different values of ‘k’ under different random states. Is the point of this exercise that, for this particular dataset, ‘k’ is arbitrary?

[Images: Snip1, Snip2, Snip3 (accuracy plotted against ‘k’ for 3 different random states)]

  2. The plots show a number of distinct levels of accuracy, with each plot jumping from one level to another rather than changing smoothly. What is the significance of these distinct levels with regard to the way the KNeighborsClassifier algorithm works?
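
For reference, here is a rough sketch of the kind of experiment behind the plots, in case it helps anyone reproduce what I am seeing. It is not the exact project code: the use of scikit-learn's built-in `load_breast_cancer` data, the 20% test split, the range of ‘k’ from 1 to 30, the three seeds, and the helper name `accuracy_by_k` are all assumptions made for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def accuracy_by_k(random_state, k_values, test_size=0.2):
    """Return (accuracies, n_test) for a single train/test split.

    Hypothetical helper: test_size=0.2 is an assumed split, not
    necessarily the one used for the plots.
    """
    data = load_breast_cancer()
    X_train, X_test, y_train, y_test = train_test_split(
        data.data, data.target, test_size=test_size, random_state=random_state
    )
    accuracies = []
    for k in k_values:
        clf = KNeighborsClassifier(n_neighbors=k)
        clf.fit(X_train, y_train)
        # score() is the fraction of test samples classified correctly
        accuracies.append(clf.score(X_test, y_test))
    return accuracies, len(y_test)


k_values = list(range(1, 31))  # assumed range of k
results = {}
for seed in (0, 42, 100):  # arbitrary seeds, chosen only for illustration
    accuracies, n_test = accuracy_by_k(seed, k_values)
    results[seed] = accuracies
    best_k = k_values[accuracies.index(max(accuracies))]
    # Convert each accuracy back into a whole number of correct predictions
    correct_counts = sorted({round(a * n_test) for a in accuracies})
    print(f"seed={seed}: best k={best_k}, n_test={n_test}, "
          f"correct counts seen={correct_counts}")
```

Printing the correct counts is relevant to my second question: the score on a fixed test set is (number correct) / (test set size), so it can only move in steps of 1 / `n_test`. With the assumed 20% split of this 569-sample dataset that is 114 test samples, so each step is a little under one percentage point.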

Many thanks :slight_smile:
Andy