Does the graph of k always follow this shape?

Question

In the context of this lesson, does the graph of k for any dataset follow this shape?

Answer

In general, yes. Most datasets should produce a shape similar to the one shown, although details such as where the peak falls will differ.

The general trend in a graph of validation accuracy against k is as follows.

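For small values of k, the accuracy will be low because the model overfits the data: each prediction depends on only a few neighbors, so noise and outliers in the training set have a large influence.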

As k increases, accuracy also increases until the curve reaches a sort of “hump”, and the best value of k lies somewhere within it. In this particular graph, this happens around k=74, where the validation accuracy is highest.

After this “hump”, the accuracy begins to drop as k increases further, because very high k values cause underfitting: each prediction takes so many neighbors into account that it drifts toward the most common class overall.

Is there a way to improve the accuracy beyond the value obtained at k=74?

From my impression of K-nearest neighbors, you could increase it by adding more inputs (i.e. more information about the data set). The more you know about the data, the more likely you are to be spot on with fewer data points.

How could I code this graph for my own dataset?

You will have to use a for loop to compute the validation accuracy for each value of k. Save those accuracy values and then plot them against the values of k. You can use matplotlib to do this.
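Here is a minimal sketch of that loop, assuming the model is a K-nearest neighbors classifier and that your data is already split into training and validation sets. The function names (knn_predict, validation_accuracy, accuracy_per_k, plot_accuracy_vs_k) are made up for this example and are not from the lesson. The classifier is a plain-Python stand-in so the sketch runs on its own; in practice you would probably swap in whatever classifier your project already uses.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    # Indices of the training points, sorted by Euclidean distance to the query.
    nearest = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def validation_accuracy(train_points, train_labels, val_points, val_labels, k):
    """Fraction of validation points classified correctly for one value of k."""
    correct = sum(
        knn_predict(train_points, train_labels, point, k) == label
        for point, label in zip(val_points, val_labels)
    )
    return correct / len(val_points)


def accuracy_per_k(train_points, train_labels, val_points, val_labels, k_values):
    """Loop over the candidate k values and save one validation accuracy per value."""
    accuracies = []
    for k in k_values:
        accuracies.append(
            validation_accuracy(train_points, train_labels, val_points, val_labels, k)
        )
    return accuracies


def plot_accuracy_vs_k(k_values, accuracies):
    """Plot the saved accuracies against k (needs matplotlib installed)."""
    # Imported here so the functions above still work without matplotlib.
    import matplotlib.pyplot as plt

    plt.plot(k_values, accuracies)
    plt.xlabel("k")
    plt.ylabel("Validation Accuracy")
    plt.title("Validation Accuracy vs k")
    plt.show()
```

To use it, build a list such as k_values = list(range(1, 101)), call accuracy_per_k with your own training and validation data, and pass both lists to plot_accuracy_vs_k. Keep the largest k no bigger than the number of training points, since beyond that every prediction is just the overall majority class.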
