How is this different from the gradient descent for the intercept?



Question

In the context of this exercise, how is the equation for gradient descent of the slope different from the gradient descent of the intercept?

Answer

The main difference is that the gradient for the slope has an additional factor of x_i inside the sum, one for each data point.

This follows from differentiating the error function, which is where both gradients come from.

The error function, which we sought to minimize, was as follows:
ErrorFunction
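
For readers who cannot see the image, the error function is most likely the mean squared error of the line y = m*x + b over N points. This exact form, including the 1/N normalization, is an assumption based on the usual setup for this kind of exercise:

```latex
E(m, b) = \frac{1}{N} \sum_{i=1}^{N} \bigl( y_i - (m x_i + b) \bigr)^2
```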

To obtain the gradient for the intercept, we take the partial derivative of the error function with respect to b, resulting in:
GradDescForIntercept
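
As a sketch of that step, assuming the error function is the mean squared error E(m, b) = (1/N) Σ (y_i - (m x_i + b))², the chain rule gives the following. The inner derivative of -(m x_i + b) with respect to b is -1, so no extra factor appears inside the sum:

```latex
\frac{\partial E}{\partial b}
  = \frac{1}{N} \sum_{i=1}^{N} 2 \bigl( y_i - (m x_i + b) \bigr) \cdot (-1)
  = -\frac{2}{N} \sum_{i=1}^{N} \bigl( y_i - (m x_i + b) \bigr)
```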

To obtain the gradient for the slope, we take the partial derivative of the error function with respect to m instead, resulting in:
GradDescForSlope
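which has the additional factor of x_i for each point. The factor comes from the chain rule: differentiating the prediction m*x_i + b with respect to m gives x_i, while differentiating it with respect to b gives 1.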
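
To see the difference in code, here is a minimal sketch of both gradients. It assumes the error function is the mean squared error (1/N) * sum((y_i - (m*x_i + b))^2). The function names `loss`, `gradient_b`, and `gradient_m` and the sample data are made up for illustration and are not taken from the exercise.

```python
def loss(x, y, m, b):
    """Mean squared error of the line y = m*x + b (assumed form of the error function)."""
    n = len(x)
    return sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y)) / n


def gradient_b(x, y, m, b):
    """Partial derivative of the loss with respect to the intercept b."""
    n = len(x)
    total = sum(yi - (m * xi + b) for xi, yi in zip(x, y))
    return -2 * total / n


def gradient_m(x, y, m, b):
    """Partial derivative of the loss with respect to the slope m.

    Identical to gradient_b except for the extra factor xi inside the sum.
    """
    n = len(x)
    total = sum(xi * (yi - (m * xi + b)) for xi, yi in zip(x, y))
    return -2 * total / n


# Hypothetical sample data, for illustration only.
x = [1, 2, 3]
y = [2, 4, 6]

print(gradient_b(x, y, 0, 0))
print(gradient_m(x, y, 0, 0))
```

The two functions differ only in the `xi *` term, which is the extra factor discussed above. One way to check them is to compare each against a finite-difference estimate of `loss`.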