US Medical Insurance project review

This was a fun exercise. Creating the graphs was definitely a challenge! I would like to learn more about them.


Congrats on completing the project.
It’s clear you understand how to write functions that sift through the data to gather insights. It’s also great that you use comments to describe your thought process when inspecting the CSV file. That’s helpful to anyone who isn’t familiar with the data or who might not be as technically inclined.

For my own curiosity, is there a reason you didn’t use built-in pandas methods to get basic stats, or things like df.head() to see what the data looks like (basic EDA stuff)?

Things like:

df.describe()
>>>	age	bmi	children	charges
count	1338.000000	1338.000000	1338.000000	1338.000000
mean	39.207025	30.663397	1.094918	13270.422265
std	14.049960	6.098187	1.205493	12110.011237
min	18.000000	15.960000	0.000000	1121.873900
25%	27.000000	26.296250	0.000000	4740.287150
50%	39.000000	30.400000	1.000000	9382.033000
75%	51.000000	34.693750	2.000000	16639.912515
max	64.000000	53.130000	5.000000	63770.42801


df['age'].mean()
>>39.2

df["sex"].value_counts()
>>male      676
female    662

df["region"].value_counts()
>>southeast    364
southwest    325
northwest    325
northeast    324

df['smoker'].value_counts()
>> no     1064
yes     274

etc.

You can even break out the dataframe and analyze it further by filtering with a boolean mask, like this for smokers:

smokers = df[df['smoker'] == 'yes']
smokers.head()

>>	age	sex	bmi	children	smoker	region	charges
0	19	female	27.90	0	yes	southwest	16884.9240
11	62	female	26.29	0	yes	southeast	27808.7251
14	27	male	42.13	0	yes	southeast	39611.7577
19	30	male	35.30	0	yes	southwest	36837.4670
23	34	female	31.92	1	yes	northeast	37701.8768
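
Once you have a subset like this, a groupby is one way to summarize it. Here is a minimal sketch that retypes only the five smoker rows shown above, so the numbers are not representative of the full dataset; in the real notebook you would run the groupby on the filtered dataframe instead:

```python
import pandas as pd

# The five smoker rows quoted above, retyped by hand for illustration.
# In the actual project these would come from the filtered dataframe.
smokers = pd.DataFrame(
    {
        "age": [19, 62, 27, 30, 34],
        "sex": ["female", "female", "male", "male", "female"],
        "bmi": [27.90, 26.29, 42.13, 35.30, 31.92],
        "children": [0, 0, 0, 0, 1],
        "smoker": ["yes"] * 5,
        "region": ["southwest", "southeast", "southeast", "southwest", "northeast"],
        "charges": [16884.9240, 27808.7251, 39611.7577, 36837.4670, 37701.8768],
    },
    index=[0, 11, 14, 19, 23],
)

# Mean charges and row count per sex within this small subset
summary = smokers.groupby("sex")["charges"].agg(["mean", "count"])
print(summary)
```

The same pattern should work with "region" or "children" as the grouping column.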

For plots you could use matplotlib, seaborn, or even plotly.
For example:

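import matplotlib.pyplot as plt
import seaborn as sns

# Average charges for smokers by sex, with one bar per region
plt.figure(figsize=(8, 6))
sns.barplot(x="sex", y="charges", hue="region", data=smokers)
plt.show()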

I would also recommend not printing out entire datasets, because it clutters up the notebook and people generally don’t like scrolling through all of that.
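
A few ways to peek at the data without dumping all of it are sketched below. The dataframe here is a made-up placeholder standing in for the insurance data (the values are not from the real CSV), so treat it as an illustration only:

```python
import pandas as pd

# Placeholder frame standing in for the insurance data; values are made up.
df = pd.DataFrame(
    {
        "age": list(range(18, 68)),
        "charges": [1000.0 + 250.0 * i for i in range(50)],
    }
)

print(df.shape)                       # (rows, columns) instead of the full table
print(df.head())                      # first 5 rows
print(df.sample(5, random_state=0))   # 5 random rows, reproducible

# Optionally cap how many rows pandas displays when a frame is shown
pd.set_option("display.max_rows", 10)
```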

Good work! Happy coding!


Thank you! This was super helpful!
