Hello! I’ve encountered a result difference that intrigued me, and I’d like to know the technical reason of why this happens.
Over at Reggie’s Linear Regression part 2, the calculation returns a slight difference depending on how you build your possible_ms and possible_bs lists.
In my first try, I used the following code:
possible_ms = [(m/10) for m in range(-100, 101)]
possible_bs = [(b/10) for b in range(-200, 201, 1)]
When calculating the best_m, best_b and smallest_error, I got:
best_m = 0.4
best_b = 1.6
smallest_error = 5.0
Now, Part 3 immediately says I should’ve found m = 0.3, b = 1.7 and a total error of 5. My first thought was to compare my code with the solution code, which for a moment stupefied me, since my code was IDENTICAL to the solution code.
And then I saw the difference:
possible_ms = [(m*0.1) for m in range(-100, 101)]
possible_bs = [(b*0.1) for b in range(-200, 201)]
Notice the use of *0.1
instead of /10
. In essence, these two operations should yield the same results. Mind you, I’m no mathematician, so I could be wrong there. However, when printing the m, b and error using print(f'Found smaller error of {error} at m = {best_m} and b = {best_b}.')
each time the code would find a new smallest error revealed me something interesting.
The last two smallest errors would print the following when using /10
:
Found smaller error of 5.000000000000001 at m = 0.3 and b = 1.7
and
Found smaller error of 5.0 at m = 0.4 and b = 1.6
However, if I use *0.1
instead, it’ll print:
Found smaller error of 4.999999999999999 at m = 0.30000000000000004 and b = 1.7000000000000002
I gather from this that Python is returning slightly different ms and bs when calculating them with m/10
or m*0.1
etc. I’m failing to see, however, why that is the case. Can anyone give me an explanation for that? I’ll put the whole code in a Codebyte next to make your life easier.
The last thing I’d like to know is: which result if more accurate? The ones using the lists calculated with /10
or the ones with *0.1
?