Reggie's Linear Regression Project


My question is about my code versus the solution code, mainly the line "x_point, y_point = point":

My solution:

def calculate_error(m, b, point):
    point = [x, y]
    x_point = point[x]
    y_point = point[y]
    yvalue = get_y(m, b, x_point)
    return abs(yvalue - y_point)

Given solution:

def calculate_error(m, b, point):
  x_point, y_point = point
  y = m*x_point + b
  distance = abs(y - y_point)
  return distance

1. I was reading other threads about how x_point, y_point = point is a tuple, but I still don't understand the line x_point, y_point = point.

2. Is my solution wrong because I made point a list?

Edit: … I got stuck on the second part of the question… it kept telling me x was not defined in my first solution above, at point = [x, y]. I’m a bit confused about the difference between tuples and lists.

def calculate_all_error(m, b, points):
    total = 0
    for point in points:
        indv_error = calculate_error(m, b, point)
        total += indv_error
    return total

datapoints = [(1, 1), (3, 3), (5, 5), (-1, -1)]
print(calculate_all_error(1, 0, datapoints))

Where are the x and y variables defined?

You do some extra steps

It's known as unpacking. Now that you know what it's called, you can always use Google to find more information :wink:

unpacking works for lists as well:
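
A minimal sketch of unpacking a list, using made-up values (the name coordinates is hypothetical and not from the project):

```python
# Hypothetical example values, not taken from the project.
coordinates = [3, 4]

# One name on the left per element on the right, in order.
x_point, y_point = coordinates

print(x_point)  # 3
print(y_point)  # 4
```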


so i created a list, which i unpacked into two variables.

Like your tuple we could do:

x_point = point[0]
y_point = point[1]

x_point, y_point = point can in that regard be considered a shortcut.
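
To check that claim, here is a small sketch (the sample points are made up, and by_index / by_unpacking are hypothetical helper names) comparing index access with unpacking on both a tuple and a list. It also shows one difference: unpacking raises a ValueError when the number of names doesn't match the number of elements.

```python
def by_index(point):
    # Read each element by its position.
    x_point = point[0]
    y_point = point[1]
    return x_point, y_point

def by_unpacking(point):
    # Same result, written as a single assignment.
    x_point, y_point = point
    return x_point, y_point

print(by_index((3, 4)) == by_unpacking((3, 4)))  # True
print(by_index([3, 4]) == by_unpacking([3, 4]))  # True

# Two names cannot receive three values.
mismatch_message = ""
try:
    by_unpacking((3, 4, 5))
except ValueError as exc:
    mismatch_message = str(exc)
    print("unpacking failed:", exc)
```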

Wrong? No. Maybe slightly more complex than needed.


tuples and lists work based on index, like i showed here:

x_point = point[0]
y_point = point[1]

so using x and y as indexes the way you want to is indeed wrong
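
For reference, a sketch of the first solution with only the indexing changed. get_y isn't shown in this thread, so the version below is an assumption about what it does (return m*x + b).

```python
def get_y(m, b, x):
    # Assumed helper: the y value of the line y = m*x + b at x.
    return m * x + b

def calculate_error(m, b, point):
    # Positions are numbers: 0 is the first element, 1 is the second.
    x_point = point[0]
    y_point = point[1]
    yvalue = get_y(m, b, x_point)
    return abs(yvalue - y_point)

print(calculate_error(1, 0, (3, 3)))  # 0
print(calculate_error(1, 0, [3, 4]))  # 1
```

The point can be passed as a tuple or as a list; index access works the same way on both.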

your assignment is in the wrong direction, what should be assigned to what?
you’ve also got too many variables, you have several names for the same thing.

I would not have been able to do this project without looking at the solutions :[

Now I’m running into another problem:

datapoints = [(1, 2), (2, 0), (3, 4), (4, 4), (5, 3)]
smallest_error = float("inf")
best_m = 0
best_b = 0

for m in possible_ms:
    for b in possible_bs:
        if calculate_all_error(m, b, datapoints) < smallest_error:
            best_m = m
            best_b = b
            smallest_error = calculate_all_error(m, b, datapoints)

print(m, b, smallest_error)

At first I didn’t know I had to include “datapoints” in this cell

Also my output is: 0.99 1.99 4.999999999999999

while the solution seems to be the same as mine:

datapoints = [(1, 2), (2, 0), (3, 4), (4, 4), (5, 3)]
smallest_error = float("inf")
best_m = 0
best_b = 0

for m in possible_ms:
    for b in possible_bs:
        error = calculate_all_error(m, b, datapoints)
        if error < smallest_error:
            best_m = m
            best_b = b
            smallest_error = error
print(best_m, best_b, smallest_error)

but their output is: 0.30000000000000004 1.7000000000000002 4.999999999999999

Is there some reason for mine to be different than theirs? It seems like my code is just about the same as the solutions.
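
One visible difference between the two snippets is the final print: the first prints m and b, the second prints best_m and best_b. A small sketch of what that changes, assuming tiny made-up lists for possible_ms and possible_bs (the real lists aren't shown in this thread) and the first set of datapoints from earlier:

```python
def calculate_error(m, b, point):
    x_point, y_point = point
    return abs(m * x_point + b - y_point)

def calculate_all_error(m, b, points):
    return sum(calculate_error(m, b, point) for point in points)

datapoints = [(1, 1), (3, 3), (5, 5), (-1, -1)]
possible_ms = [0.0, 0.5, 1.0]  # hypothetical values for illustration
possible_bs = [0.0, 1.0, 2.0]  # hypothetical values for illustration

smallest_error = float("inf")
best_m = 0
best_b = 0

for m in possible_ms:
    for b in possible_bs:
        error = calculate_all_error(m, b, datapoints)
        if error < smallest_error:
            best_m = m
            best_b = b
            smallest_error = error

print(m, b, smallest_error)            # 1.0 2.0 0.0
print(best_m, best_b, smallest_error)  # 1.0 0.0 0.0
```

After the loops finish, m and b simply hold the last pair that was tried, while best_m and best_b hold the pair recorded when the smallest error was found.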

But if you constantly have to look at the solutions, you're going down a rabbit hole.

you need to start solving problems, take more time, print things you are not sure about, call your functions to test them.

yea you’re right. I think my fundamentals are not strong enough.

Yes, seems you could use more practice with python and data types

what do you mean by “but”?
Are they really different?

You wanted to find a data point with the smallest error. Did you find that?

I don’t think the problem is your knowledge of python, but rather about taking the time to reason about the problem itself.

I found the mistake :stuck_out_tongue: thanks ionatan. I need to slow down and understand, word for word, the code I’m trying to write.

That wasn’t a mistake.

Or rather, there’s a different mistake, a bad decision, allowing for different results. But getting those different results isn’t a mistake.

floats are approximations. If you want something exact, don’t use them. If you do use them, don’t make decisions based on exact comparisons. And if you do that anyway, treat the differing outcomes as equivalent.
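
A short sketch of what that looks like in practice, using math.isclose from the standard library. The value 4.999999999999999 is the error printed above; 5.0 is added here only for comparison, and treating the two as a tie is one possible choice, not something the project requires.

```python
import math

# 0.1 has no exact binary representation, so the product is slightly off.
m = 3 * 0.1
print(m)                     # 0.30000000000000004
print(m == 0.3)              # False
print(math.isclose(m, 0.3))  # True

# Two total errors that differ only by rounding noise.
error_a = 4.999999999999999
error_b = 5.0
print(error_a < error_b)               # True: an exact comparison picks a "winner"
print(math.isclose(error_a, error_b))  # True: within tolerance they are a tie
```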

This is my answer:

def calculate_error(m, b, point):
    x_point = point[0]
    y_point = point[1]
    point = [x_point, y_point]
    y = (m * x_point) + b
    difference = y - y_point
    return abs(difference)

After looking at the solutions, I see that I was on the right track, but ultimately wrong!

Can someone explain the first line of the solution and how I was supposed to know to unpack the tuple in this way?

Also, why is it that
x_point, y_point = point CORRECT ANSWER

point = x_point, y_point INCORRECT ANSWER

The latter is more intuitive to me.
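
A sketch of why the direction matters: in Python, the names on the left of = receive the value on the right. The function names unpack_point and pack_point are made up for this illustration.

```python
def unpack_point(point):
    # point already has a value (it was passed in), so it goes on the right.
    # x_point and y_point are new names, so they go on the left.
    x_point, y_point = point
    return x_point, y_point

def pack_point(point):
    # Reversed direction: this tries to build a new tuple from x_point and
    # y_point, which have not been given values yet.
    point = x_point, y_point
    return point

print(unpack_point((3, 4)))  # (3, 4)

reversed_failed = False
try:
    pack_point((3, 4))
except NameError as exc:
    reversed_failed = True
    print("NameError:", exc)
```

The reversed line is valid Python on its own; it packs two existing values into a tuple. It fails here only because x_point and y_point don't exist at that point in the function.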