This was a fun exercise. Creating the graphs was definitely a challenge! I would like to learn more about the graphs.
Congrats on completing the project.
It’s clear you understand how to write functions to sift through the data and gather insights. It’s also great that you use comments to describe your thought process when inspecting the CSV file. That’s helpful to anyone who isn’t familiar with the data or who might not be as technically inclined.
For my own curiosity, I wondered if there was a reason why you didn’t use built-in pandas methods to get basic stats, or things like df.head() to see what the data looks like (basic EDA stuff)?
Things like:
df.describe()
>>>            age          bmi     children       charges
count  1338.000000  1338.000000  1338.000000   1338.000000
mean     39.207025    30.663397     1.094918  13270.422265
std      14.049960     6.098187     1.205493  12110.011237
min      18.000000    15.960000     0.000000   1121.873900
25%      27.000000    26.296250     0.000000   4740.287150
50%      39.000000    30.400000     1.000000   9382.033000
75%      51.000000    34.693750     2.000000  16639.912515
max      64.000000    53.130000     5.000000  63770.428010
df['age'].mean()
>>39.2
df["sex"].value_counts()
>>male 676
female 662
df["region"].value_counts()
>>southeast 364
southwest 325
northwest 325
northeast 324
df['smoker'].value_counts()
>> no 1064
yes 274
And so on.
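If it helps, here is a small sketch that pulls those calls together so you can run them in one go. It builds a tiny DataFrame from five rows of the dataset typed in by hand, so the numbers it prints will not match the full-file output above. The file name in the comment is only my guess at what the CSV is called.

```python
import pandas as pd

# Stand-in for the real file: five rows of the dataset typed in by hand.
# In the notebook you would load the CSV instead, e.g. pd.read_csv("insurance.csv")
# (that file name is an assumption).
df = pd.DataFrame(
    {
        "age": [19, 62, 27, 30, 34],
        "sex": ["female", "female", "male", "male", "female"],
        "bmi": [27.90, 26.29, 42.13, 35.30, 31.92],
        "children": [0, 0, 0, 0, 1],
        "smoker": ["yes", "yes", "yes", "yes", "yes"],
        "region": ["southwest", "southeast", "southeast", "southwest", "northeast"],
        "charges": [16884.9240, 27808.7251, 39611.7577, 36837.4670, 37701.8768],
    }
)

summary = df.describe()                # count, mean, std, min, quartiles, max per numeric column
mean_age = df["age"].mean()            # one statistic for one column
sex_counts = df["sex"].value_counts()  # how often each category appears

print(df.head())    # first rows, to see what the data looks like
print(summary)
print(mean_age)
print(sex_counts)
```

Note that describe() only summarizes the numeric columns by default, which is why value_counts() is handy for the text columns like sex, region, and smoker.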
You can even break out the dataframe and analyze it further by filtering rows with a boolean mask and .loc, like this for smokers:
smokers = df.loc[df['smoker'] == 'yes']
smokers.head()
>>  age     sex    bmi  children  smoker     region     charges
0    19  female  27.90         0     yes  southwest  16884.9240
11   62  female  26.29         0     yes  southeast  27808.7251
14   27    male  42.13         0     yes  southeast  39611.7577
19   30    male  35.30         0     yes  southwest  36837.4670
23   34  female  31.92         1     yes  northeast  37701.8768
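The smokers filter can be written a few different ways, and they should all give you the same rows. Here is a rough sketch on a toy frame with made-up rows (not the real data), just to show the mechanics side by side.

```python
import pandas as pd

# Toy frame with made-up rows (not the real data), just to show the filtering.
df = pd.DataFrame(
    {
        "age": [25, 40, 31, 52],
        "smoker": ["yes", "no", "yes", "no"],
        "charges": [20000.0, 5000.0, 30000.0, 9000.0],
    }
)

mask = df["smoker"] == "yes"         # boolean Series, one True/False per row

smokers_loc = df.loc[mask]           # label-based indexing with the boolean Series
smokers_plain = df[mask]             # shorthand for the same row filter
smokers_iloc = df.iloc[mask.values]  # position-based: pass the mask as a raw NumPy array

print(smokers_loc)
```

The original row labels (0 and 2 here) are kept after filtering, which is why the index in a filtered preview can skip numbers.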
For plots, you could use matplotlib together with seaborn, or even plotly.
Like:
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(8, 6))
sns.barplot(x="sex", y="charges", hue="region", data=smokers)
plt.show()
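By default, seaborn's barplot draws the mean of the y column for each group, so you can sanity-check a chart like that with a groupby. This is only a sketch: it uses the five smoker rows from the preview above rather than the full smokers frame, so each sex and region pair has a single row here.

```python
import pandas as pd

# The five smoker rows from the preview, typed in by hand (not the full smokers frame).
smokers = pd.DataFrame(
    {
        "sex": ["female", "female", "male", "male", "female"],
        "region": ["southwest", "southeast", "southeast", "southwest", "northeast"],
        "charges": [16884.9240, 27808.7251, 39611.7577, 36837.4670, 37701.8768],
    },
    index=[0, 11, 14, 19, 23],
)

# Mean charges per (sex, region) pair: one number per bar in the grouped bar chart.
by_sex_region = smokers.groupby(["sex", "region"])["charges"].mean()

# Mean charges per sex only, ignoring region.
by_sex = smokers.groupby("sex")["charges"].mean()

print(by_sex_region)
print(by_sex)
```

Seeing the numbers in a table next to the chart makes it easier to spot a bar that looks off.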
I would also recommend not printing out entire datasets, because it clutters up the notebook and people generally don’t like scrolling through all of that.
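A few ways to peek at a dataset without dumping every row, sketched on a made-up frame of 1,000 rows that stands in for a larger file:

```python
import pandas as pd

# Made-up frame of 1,000 rows standing in for a large dataset.
df = pd.DataFrame({"id": range(1000), "value": [i * 0.5 for i in range(1000)]})

print(df.shape)                      # (rows, columns): the size, without printing any rows
print(df.head(3))                    # first three rows
print(df.tail(3))                    # last three rows
print(df.sample(3, random_state=0))  # three random rows; the seed makes it repeatable
```

Any one of these usually tells a reader enough about the data to follow the rest of the notebook.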
Good work! Happy coding!
Thank you! This was super helpful!