This community-built FAQ covers the “Gradient Descent for Intercept” exercise from the lesson “Linear Regression”.
FAQs on the exercise Gradient Descent for Intercept
December 23, 2018, 11:13am
"It is not crucial to understand how we arrive at the gradient equation."
OK, now I’m curious: how did we arrive at this gradient equation?
Same here. It would be nice to have a link or reference to dig a bit deeper.
I am curious as to why we calculate the gradient as diff * -2/N.
Why wouldn’t we calculate it simply as diff/N?
April 18, 2020, 9:11am
Me too, I’m curious why we have to multiply the difference by -2/N. Can anyone answer this? Thanks.
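One way to check the factor without any calculus is to compare both candidates against a numerical slope of the loss. This is only a sketch: it assumes the loss is the mean of the squared differences (y - (m*x + b))^2, which this thread does not state explicitly, and the function names and sample data are made up for illustration.

```python
def loss(x, y, m, b):
    # Assumed loss: mean of the squared differences between y and the line.
    n = len(x)
    return sum((y[i] - (m * x[i] + b)) ** 2 for i in range(n)) / n


def sum_of_differences(x, y, m, b):
    # The "diff" value discussed in this thread.
    return sum(y[i] - (m * x[i] + b) for i in range(len(x)))


# Invented sample data, with both guesses at zero.
x = [1, 2, 3]
y = [5, 1, 3]
m, b = 0, 0
n = len(x)

diff = sum_of_differences(x, y, m, b)
with_factor = diff * -2 / n      # the formula from the exercise
without_factor = diff / n        # the simpler alternative asked about

# Numerical slope of the loss with respect to b (central difference).
h = 1e-4
numerical_slope = (loss(x, y, m, b + h) - loss(x, y, m, b - h)) / (2 * h)

print(with_factor, without_factor, numerical_slope)
```

On this data only diff * -2/N agrees with the numerical slope; diff/N has the wrong size and the wrong sign.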
On step 2, why do we use len(x) to define the range of the loop? I don’t understand the use of the len() function with simple input variables.
On step 3, it’s mentioned that x is an object. Could that be the reason we are allowed to use len(x), since the x variable is an object and not an integer?
Here we are assuming that x is an object such as a list or an array. In this case len(x) returns the number of elements in the list or array x. In general, the len() function can be applied to a wide variety of objects, including strings, dictionaries, etc. It cannot be applied to a plain integer.
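To make the point about len() concrete, here is a minimal sketch. The function is reconstructed from what this thread describes (a loop over range(len(x)) and a final diff * -2/N); it is not copied from the exercise, and the sample data is invented.

```python
def get_gradient_at_b(x, y, m, b):
    # x and y are assumed to be lists of equal length.
    n = len(x)                 # number of data points
    diff = 0
    for i in range(n):         # i takes the values 0, 1, ..., n - 1
        diff += y[i] - (m * x[i] + b)
    return diff * -2 / n


x = [1, 2, 3]
y = [5, 1, 3]

print(len(x))          # number of elements in a list
print(len("abc"))      # number of characters in a string
print(len({"a": 1}))   # number of keys in a dictionary

# len() is not defined for a plain integer.
try:
    len(5)
    int_has_len = True
except TypeError:
    int_has_len = False

gradient = get_gradient_at_b(x, y, m=0, b=0)
print(int_has_len, gradient)
```

Using len(x) rather than a hard-coded number means the same function works for a data set of any size.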
Sorry, I am very new to this: what determines the current gradient guess and the current intercept guess?
For example, why are they both zero later in the code, and should they always be zero?
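A rough way to explore this question is to run the update from two different starting guesses and compare where they end up. The sketch below makes several assumptions that are not from the exercise: the slope m is held fixed at a known value, the update rule is b = b - learning_rate * gradient, and the helper name descend_b, the learning rate, the step count, and the data are all invented.

```python
def get_gradient_at_b(x, y, m, b):
    # Gradient of the (assumed) mean squared error with respect to b.
    n = len(x)
    diff = sum(y[i] - (m * x[i] + b) for i in range(n))
    return -2 / n * diff


def descend_b(x, y, m, b_start, learning_rate=0.1, steps=200):
    # Hypothetical helper: repeatedly step b against the gradient.
    b = b_start
    for _ in range(steps):
        b = b - learning_rate * get_gradient_at_b(x, y, m, b)
    return b


# Invented data that lies exactly on the line y = 2x + 1.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
m = 2  # slope held fixed for this illustration

b_from_zero = descend_b(x, y, m, b_start=0)
b_from_fifty = descend_b(x, y, m, b_start=50)
print(b_from_zero, b_from_fifty)
```

In this example both runs settle on an intercept close to 1, which suggests zero is a convenient starting point rather than a required one.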
November 4, 2020, 4:21pm
Yet another question about -2/N: why do we multiply by -2?
The part I can understand is that we have a sum of losses from y - (m*x+b), and if we divide it by N (the number of values we added up) we get the average loss.
So what does the multiplication by -2 and then division by N do?
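The 2 comes from the derivative: if f(x) = x^2 then f’(x) = 2x, and the loss is a sum of squared differences. The minus sign comes from the chain rule, because the derivative of y - (m*x+b) with respect to b is -1. The resulting gradient points in the direction in which the loss increases fastest. Gradient descent is looking for the minimum of the loss function, so each step moves b in the opposite direction, 180 degrees from the gradient.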
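For anyone who wants the steps written out, here is a sketch of the derivation. It assumes the loss is the mean squared error over the N points; the thread does not state this, and the symbols L, x_i and y_i are introduced here for illustration.

```latex
L(b) = \frac{1}{N} \sum_{i=1}^{N} \bigl( y_i - (m x_i + b) \bigr)^2

\frac{\partial L}{\partial b}
  = \frac{1}{N} \sum_{i=1}^{N} 2 \bigl( y_i - (m x_i + b) \bigr) \cdot (-1)
  = -\frac{2}{N} \sum_{i=1}^{N} \bigl( y_i - (m x_i + b) \bigr)
```

The sum on the right is the diff value from the code, so the gradient is diff * -2/N. The 2 is the power-rule factor, the -1 is the derivative of the inner term with respect to b, and the 1/N is the averaging already present in the loss.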