# U.S. Medical Insurance Costs: Portfolio Project (Simple Linear Regression Issues)

Hello, everyone,

I'm currently working on the Python Portfolio Project. In addition to the recommended tasks, I tried to implement a simple linear regression to find out how BMI relates to insurance charges, with BMI as the independent variable (x) and insurance charges as the dependent variable (y).

First, I tried to write a function that calculates the slope of the estimated regression line (the 𝑏₁ coefficient) based on the following formula:

$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Here is my code:

```
import numpy as np

# bmis and charges_array are defined earlier in the project
bmi = np.array(bmis).reshape(-1, 1)
charges = charges_array

# Function to find parameter b1
def find_b1(bmi, charges):
    residuals_sum = 0
    sum_squared = 0
    for x in range(len(bmi)):
        residual_x = bmi[x] - np.mean(bmi)
    for y in range(len(charges)):
        residual_y = charges[y] - np.mean(charges)
    residuals_sum += (residual_x * residual_y)
    sum_squared += np.square(residual_x)
    b1 = residuals_sum / sum_squared
    return b1

print(find_b1(bmi, charges))

# Output
[-9960.4426389]

# Check with scikit-learn whether the results match
from sklearn.linear_model import LinearRegression
model = LinearRegression().fit(bmi, charges)
b1 = model.coef_
b1

# Output
array([393.8730308])
```

If you have any idea why the results differ or where I might have made a mistake, please give me a hint.
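
One possible cause, assuming the two `+=` lines in `find_b1` run after the loops rather than inside them: only the last (x, y) pair would ever reach the sums, so the function would return a single ratio instead of the slope. The sketch below is a minimal illustration of the paired-loop version. The helper name `find_b1_paired` and the toy data are made up for this example and are not part of the project or the insurance dataset.

```python
def find_b1_paired(xs, ys):
    """Slope b1 = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean) ** 2)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    # Compute the means once, outside the loop.
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    cross_sum = 0.0
    squared_sum = 0.0
    # A single loop over matched pairs; both sums are updated inside it.
    for x, y in zip(xs, ys):
        dx = x - x_mean
        dy = y - y_mean
        cross_sum += dx * dy
        squared_sum += dx * dx
    return cross_sum / squared_sum


# Toy data invented for this illustration (hypothetical values).
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]

slope = find_b1_paired(toy_x, toy_y)
print(slope)  # 0.6

# For contrast: using only the last pair gives a different number,
# which is the kind of mismatch the assumption above would produce.
x_mean = sum(toy_x) / len(toy_x)
y_mean = sum(toy_y) / len(toy_y)
last_only = (toy_y[-1] - y_mean) / (toy_x[-1] - x_mean)
print(last_only)  # 0.5
```

With the arrays from the post, the column-shaped `bmi` would likely need flattening first (for example with `bmi.ravel()`) so that each `x` is a plain number rather than a one-element array.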