Interesting result difference at Reggie's Linear Regression project

Hello! I’ve encountered a result difference that intrigued me, and I’d like to know the technical reason why this happens.

Over at Reggie’s Linear Regression part 2, the calculation returns slightly different results depending on how you build your possible_ms and possible_bs lists.

In my first try, I used the following code:

possible_ms = [(m/10) for m in range(-100, 101)]
possible_bs = [(b/10) for b in range(-200, 201, 1)]

When calculating the best_m, best_b and smallest_error, I got:
best_m = 0.4
best_b = 1.6
smallest_error = 5.0

Now, Part 3 immediately says I should’ve found m = 0.3, b = 1.7 and a total error of 5. My first thought was to compare my code with the solution code, which for a moment stupefied me, since my code was IDENTICAL to the solution code.

And then I saw the difference:

possible_ms = [(m*0.1) for m in range(-100, 101)]
possible_bs = [(b*0.1) for b in range(-200, 201)]

Notice the use of *0.1 instead of /10. In essence, these two operations should yield the same results. Mind you, I’m no mathematician, so I could be wrong there. However, printing the m, b and error with print(f'Found smaller error of {error} at m = {best_m} and b = {best_b}.') each time the code found a new smallest error revealed something interesting.

The last two smallest errors would print the following when using /10:

Found smaller error of 5.000000000000001 at m = 0.3 and b = 1.7

and

Found smaller error of 5.0 at m = 0.4 and b = 1.6

However, if I use *0.1 instead, it’ll print:

Found smaller error of 4.999999999999999 at m = 0.30000000000000004 and b = 1.7000000000000002

I gather from this that Python is returning slightly different ms and bs when calculating them with m/10 or m*0.1 etc. I’m failing to see, however, why that is the case. Can anyone give me an explanation for that? I’ll put the whole code in a Codebyte next to make your life easier.

def get_y(m, b, x):
    return (m * x) + b

def calculate_error(m, b, point):
    x_point = point[0]
    y_point = point[1]
    y_value = get_y(m, b, x_point)
    difference = y_value - y_point
    distance = abs(difference)
    return distance

datapoints = [(1, 2), (2, 0), (3, 4), (4, 4), (5, 3)]

def calculate_all_error(m, b, points):
    total_error = 0
    for point in points:
        total_error += calculate_error(m, b, point)
    return total_error

possible_ms = [(m * 0.1) for m in range(-100, 101, 1)]
possible_bs = [(b * 0.1) for b in range(-200, 201, 1)]

smallest_error = float('inf')
best_m = 0
best_b = 0

for m in possible_ms:
    for b in possible_bs:
        error = calculate_all_error(m, b, datapoints)
        if error < smallest_error:
            smallest_error = error
            best_m = m
            best_b = b
            print(f'Found smaller error of {error} at m = {best_m} and b = {best_b}.')  # printing to debug after I found the aberration

print(best_m, best_b, smallest_error)

The last thing I’d like to know is: which result is more accurate? The one using the lists built with /10 or the one with *0.1?
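For anyone who wants to see the discrepancy without running the whole grid search, here is a minimal check (just a sketch) that compares the two ways of building the lists value by value:

# Print every integer m in the range where m/10 and m*0.1
# end up as different floats; repr() shows the full value.
for m in range(-100, 101):
    if m / 10 != m * 0.1:
        print(m, repr(m / 10), repr(m * 0.1))

Any line it prints is a value of m where the two constructions disagree, for example 3/10 gives 0.3 while 3*0.1 gives 0.30000000000000004.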

Hello David! I ran into the same issue you are mentioning here. Furthermore, I’ve seen that the Jupyter notebook text uses the *0.1 approach, resulting in m = 0.3 and b = 1.7 as the best pair, while the Codecademy learning environment suggests the /10 approach, resulting in m = 0.4 and b = 1.6 as the best pair.

I guess the problem behind that is the one mentioned in this link concerning floating point resolution.

About your last question, one way to see which of the two methods is more reliable could be to calculate the y value for each x value in datapoints with both m = 0.3, b = 1.7 and m = 0.4, b = 1.6 and look at the error between real_y and calculated_y (as suggested in the previous link, the “decimal” module lets you see the numbers with higher precision):

from decimal import Decimal

def get_y(m, b, x):
    return m*x + b

datapoints = [(1, 2), (2, 0), (3, 4), (4, 4), (5, 3)]

total_error_case1 = 0  # m = 0.3, b = 1.7
total_error_case2 = 0  # m = 0.4, b = 1.6

for i in range(len(datapoints)):
    total_error_case1 += abs(datapoints[i][1] - get_y(0.3, 1.7, datapoints[i][0]))
    total_error_case2 += abs(datapoints[i][1] - get_y(0.4, 1.6, datapoints[i][0]))

print(Decimal.from_float(total_error_case1), Decimal.from_float(total_error_case2))

The total error seems slightly higher with m = 0.3, b = 1.7 (slightly is a euphemism here :smiley: ), thus (perhaps!) the “x/10” method may be more accurate.
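Another way to take the rounding out of the comparison entirely (just a sketch, using the standard fractions module) is to redo the same sum with exact rational arithmetic:

from fractions import Fraction

datapoints = [(1, 2), (2, 0), (3, 4), (4, 4), (5, 3)]

def exact_total_error(m, b, points):
    # m and b are exact fractions here, so no rounding happens anywhere
    return sum(abs(Fraction(y) - (m * x + b)) for x, y in points)

print(exact_total_error(Fraction(3, 10), Fraction(17, 10), datapoints))  # m = 0.3, b = 1.7
print(exact_total_error(Fraction(4, 10), Fraction(16, 10), datapoints))  # m = 0.4, b = 1.6

Interestingly, in exact arithmetic both pairs give a total error of exactly 5, so the tiny difference between them in the float version comes entirely from rounding, which fits the floating point explanation above.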

Cheers
Alberto

Hello Alberto! Thank you for your reply.

I had a suspicion that the error was due to the way computers handle floating point numbers. But I couldn’t exactly pinpoint why. The link you provided enlightened me!

It seems then that the computer cannot represent the decimal value 0.1 exactly; it stores the binary approximation 0.1000000000000000055511151231257827021181583404541015625 instead, so the calculation turns out slightly off.

That is, dividing by 10 rounds only once, so the result is as close to the intended value as a floating point number can get, but multiplying by 0.1 starts from an already slightly-off 0.1 and then rounds again, which can leave a slightly bigger error: the error compounds.
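A quick way to see this compounding in action (just a sketch, reusing the decimal module from your example) is to print the exact value of the same number built both ways:

from decimal import Decimal

print(Decimal(3 / 10))   # the closest double to 0.3, produced by a single rounding
print(Decimal(3 * 0.1))  # 0.1 is rounded first, then the product is rounded again
print(Decimal(0.1))      # the binary approximation of 0.1 quoted above

The /10 version lands on the float closest to 0.3, while the *0.1 version ends up one float further away from it.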

In that sense, I believe that you must be correct in your assertion that the “x/10” approach is more accurate!

Thank you again for adding to the discussion
David