How is inertia calculated?



Question

In the context of this exercise, how is inertia calculated?

Answer

According to the scikit-learn documentation for the KMeans class, the inertia_ attribute is the sum of squared distances of the samples to their nearest centroids.

So, to obtain the inertia, we take each data point's distance to its nearest centroid, square that distance, and then sum the squared distances over all data points. We use the Euclidean (straight-line) distance formula to calculate each distance.
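Written as a formula, the description above might look like the following. The symbols are assumptions introduced here for illustration and do not come from the exercise: x_i is the i-th of n data points, and mu_j is the j-th of k centroids.

```latex
\text{inertia} = \sum_{i=1}^{n} \min_{1 \le j \le k} \lVert x_i - \mu_j \rVert^2,
\qquad
\lVert x_i - \mu_j \rVert^2 = (x_{i,1} - \mu_{j,1})^2 + (x_{i,2} - \mu_{j,2})^2
\quad \text{(two-dimensional case)}
```

The inner minimum picks out the nearest centroid for each point, and the outer sum adds up the squared distances.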

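The following is a general overview of how we might calculate the inertia for two-dimensional data. Here, get_centroid stands in for a function that returns a data point's nearest centroid.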

inertia = 0

for datapoint in dataset:
  # Obtain the nearest centroid of the point.
  centroid = get_centroid(datapoint)

  # Calculate the distance from this datapoint to its nearest
  # centroid using the Euclidean distance formula.
  delta_x = datapoint.x - centroid.x
  delta_y = datapoint.y - centroid.y
  distance = (delta_x ** 2 + delta_y ** 2) ** 0.5

  # Square the distance, and add to the inertia.
  squared_distance = distance ** 2
  inertia += squared_distance
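
For anyone who wants to run the idea end to end, here is a minimal, self-contained sketch in plain Python. The helper names (nearest_centroid, inertia) and the sample points and centroids are made up for illustration; this is not scikit-learn's implementation, only a restatement of the steps above under the assumption that points and centroids are tuples of coordinates.

```python
import math


def nearest_centroid(point, centroids):
    """Return the centroid closest to `point` by Euclidean distance."""
    return min(centroids, key=lambda c: math.dist(point, c))


def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    total = 0.0
    for point in points:
        centroid = nearest_centroid(point, centroids)
        # Squared Euclidean distance: sum of squared coordinate differences.
        total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total


# Hypothetical sample data: two tight clusters, one centroid each.
points = [(1.0, 1.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0)]
centroids = [(1.5, 1.0), (8.5, 8.0)]

print(inertia(points, centroids))  # 1.0
```

Each of the four points sits 0.5 away from its nearest centroid, so each contributes 0.5 ** 2 = 0.25, for a total of 1.0. Note that this version never takes the square root: taking the root and then squaring it cancel out, so summing the squared coordinate differences directly gives the same result.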