Hello everyone! I would really appreciate a code review of my US medical project. Any criticism is welcome. Thank you!!
Congrats on completing the project.
Some thoughts:
- Rather than jumping right into the functions, it might be a good idea to run some basic descriptive stats on the data first (read up on EDA, exploratory data analysis). That way, anyone viewing the project can get a general idea of exactly what's in the data before you go on to analyze differences between the two populations you selected from it.
- Stuff like: how many smokers vs. non-smokers are in the data? How many men vs. women? What's the median charge, and how is that different from the mean? The mean is a good deal higher than the median, which might be an indication of outliers in the data pulling the mean upward.
All of these are available as built-in methods in pandas.
https://pandas.pydata.org/docs/user_guide/index.html#user-guide
Ex:
df.describe()
>>             age          bmi     children       charges
count  1338.000000  1338.000000  1338.000000   1338.000000
mean     39.207025    30.663397     1.094918  13270.422265
std      14.049960     6.098187     1.205493  12110.011237
min      18.000000    15.960000     0.000000   1121.873900
25%      27.000000    26.296250     0.000000   4740.287150
50%      39.000000    30.400000     1.000000   9382.033000
75%      51.000000    34.693750     2.000000  16639.912515
max      64.000000    53.130000     5.000000  63770.428010
or,
df['age'].mean().round(1)
>> 39.2
df['charges'].mean().round(2)
>> 13270.42
df['charges'].median().round(2)
>> 9382.03
df["sex"].value_counts()
>> male 676
female 662
df["smoker"].value_counts()
>> no 1064
yes 274
etc. etc.
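To make the outlier point a bit more concrete, here's a rough sketch of how you could check it. It isn't from your project: the tiny DataFrame below is made-up toy data (only the column names "smoker" and "charges" match the dataset), and the 1.5 * IQR rule is just one common rule of thumb for flagging outliers, not the only way to do it.

```python
import pandas as pd

# Hypothetical toy data standing in for the real CSV; the values are invented for illustration.
df = pd.DataFrame({
    "smoker": ["no", "no", "no", "no", "no", "no", "yes", "yes"],
    "charges": [2000.0, 2500.0, 3000.0, 3500.0, 4000.0, 4500.0, 5000.0, 60000.0],
})

# A mean far above the median hints at a right-skewed column.
mean_charge = df["charges"].mean()
median_charge = df["charges"].median()
print("mean:", mean_charge, "median:", median_charge)

# Rule of thumb: values above Q3 + 1.5 * IQR are flagged as high outliers.
q1 = df["charges"].quantile(0.25)
q3 = df["charges"].quantile(0.75)
upper_fence = q3 + 1.5 * (q3 - q1)
n_high = int((df["charges"] > upper_fence).sum())
print("rows above the upper fence:", n_high)

# Compare the two populations side by side.
by_smoker = df.groupby("smoker")["charges"].agg(["count", "mean", "median"])
print(by_smoker)
```

Swap the toy DataFrame for your real one and the same lines should work as long as the column names match.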
- Use text cells to lay out your thought process for the reader while you comb through the data. You’re ultimately telling a story here with the analysis.
Also, just a suggestion for the readme file: you don't need to mention how long the project took you, as that's irrelevant to anyone viewing it. I would also avoid mentioning ChatGPT, since there's no way to know how accurate its results are or what's actually in the model's training data.
Thank you @lisalisaj! I am looking into EDA, and I see what you mean. Hopefully I will get a better understanding of it and use it properly to evaluate my code. I look forward to another review from you. Thank you for your help!!