# US Medical Insurance Cost Project - Please review my project!

This was my first-ever project in Data Science! I’m trying to change my career and was pursuing JavaScript full-stack pretty heavily, but more recently I realized that Data Science is a better fit for my personality type. I enjoyed the coding, but the documentation and markup were new to me. I think that as I get further along in this Codecademy Path I will learn more about how the analysis should go, and I can return to this project for improvements.
This project took me two days: one for the coding and one for the documentation.
Thanks in advance for reviewing my project!

Congrats on completing the project.

• You’re adept at writing functions to extract insights from the dataset.

• Good use of comments describing the results from the functions.

• One thing I’ll say is that the mean of charges is pulled upward by outliers (the mean is well above the median, and the max is far above the 75th percentile). So, perhaps a better stat to use is the median.

Ex:

```
df['charges'].mean().round(2)
>> 13270.42

vs:

df['charges'].median().round(2)
>> 9382.03

Or,
df.describe()
>>
               age          bmi     children       charges
count  1338.000000  1338.000000  1338.000000   1338.000000
mean     39.207025    30.663397     1.094918  13270.422265
std      14.049960     6.098187     1.205493  12110.011237
min      18.000000    15.960000     0.000000   1121.873900
25%      27.000000    26.296250     0.000000   4740.287150
50%      39.000000    30.400000     1.000000   9382.033000
75%      51.000000    34.693750     2.000000  16639.912515
max      64.000000    53.130000     5.000000  63770.428010

df[["age", "sex", 'charges']].groupby('sex').median().round(2)

         age  charges
sex
female  40.0  9412.96
male    39.0  9369.61

df[["age", "sex", 'charges']].groupby('sex').mean().round(2)

          age   charges
sex
female  39.50  12569.58
male    38.92  13956.75
```
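To illustrate why the median holds up better against outliers, here is a minimal sketch using made-up numbers (they are not taken from the insurance dataset). A single large value moves the mean a lot but barely shifts the median:

```python
from statistics import mean, median

# Hypothetical charges, invented for illustration only.
typical = [4000, 5000, 6000, 7000, 8000]
with_outlier = typical + [60000]  # add one very expensive case

mean_typical = mean(typical)
median_typical = median(typical)
mean_outlier = mean(with_outlier)
median_outlier = median(with_outlier)

# Compare how far each statistic moved after adding the outlier.
print("mean shift:  ", mean_outlier - mean_typical)
print("median shift:", median_outlier - median_typical)
```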

The same comparison for smokers shows a somewhat smaller gap between median and mean:

```
df[["smoker", 'charges']].groupby('smoker').median().round(2)

         charges
smoker
no       7345.41
yes     34456.35

vs:

df[["smoker", 'charges']].groupby('smoker').mean().round(2)

         charges
smoker
no       8434.27
yes     32050.23
```
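If it helps, both statistics can go side by side in one table with `groupby(...).agg(...)`. This is only a sketch: the small `df_demo` DataFrame is invented to stand in for the real `df`, and the `gap` column is a hypothetical name, not something from the project.

```python
import pandas as pd

# Hypothetical stand-in for the insurance DataFrame; values are invented.
df_demo = pd.DataFrame({
    "smoker": ["no", "no", "no", "yes", "yes", "yes"],
    "charges": [2000.0, 4000.0, 12000.0, 20000.0, 35000.0, 38000.0],
})

# Mean and median of charges per group, in a single table.
summary = df_demo.groupby("smoker")["charges"].agg(["mean", "median"]).round(2)

# Positive gap: mean sits above the median; negative gap: mean sits below it.
summary["gap"] = summary["mean"] - summary["median"]
print(summary)
```

On the real data, the same two lines would work with `df` in place of `df_demo`.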

Keep up the good work!
