Hello there, fellow coders!
I have finished my first Portfolio Project, Medical Insurance, and was able to upload it to GitHub.
Overall, it was a fun project. I think it was nice that they allowed us to choose our own goals and then write the code to achieve them, so I would say the difficulty was “just right”. It also makes reading other people’s projects much more fun, because the projects are not all the same and different people have different approaches and ideas.
The project took around 3 hours in one sitting. I believe I finished it a week ago, but I’m only posting it now: I had some exams at university, so I forgot to post it, and I didn’t know how to upload it to GitHub in the first place. So here is the link.
Congrats on completing the project. You’re right: it’s fun to see different people’s approaches and ideas. This version is a perfect example. So, kudos on that!
Just a few thoughts:
Maybe consider adding some basic EDA at the top of the notebook as an introduction to the data for the reader: what variables (columns) are involved, the number of records, the breakdown of men (676) and women (662), people in each region, smokers vs. non-smokers, etc. That way, anyone reading the notebook will have a general idea of the dataset before you dive into it.
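Something like this could work as a starting point for that overview. It is only a rough sketch using the standard library: the sample rows are made up for illustration (the real notebook would read insurance.csv instead), the column names assume the usual layout of that file, and summarize is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; column names are assumed to match insurance.csv.
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
25,female,24.0,0,yes,southwest,15000.0
40,male,31.5,2,no,southeast,6000.0
52,male,28.0,1,yes,southeast,24000.0
33,female,22.5,0,no,northwest,4500.0
61,male,35.0,3,no,northeast,13000.0
"""


def summarize(rows, columns=("sex", "smoker", "region")):
    """Return the record count and a Counter of values for each chosen column."""
    counts = {col: Counter(row[col] for row in rows) for col in columns}
    return len(rows), counts


# In the notebook this would be: csv.DictReader(open("insurance.csv"))
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
total, counts = summarize(rows)

print("columns:", list(rows[0].keys()))
print("records:", total)
for col, counter in counts.items():
    print(col, dict(counter))
```

Putting these few lines at the top gives the reader the size and balance of the data before any of the deeper questions.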
I like the granular exploration/breakdown of smokers. But it might make more sense to show the basic numbers first (1338 total people in the dataset: 274 smokers and 1064 non-smokers; of the smokers, 159 are men vs. 115 women). Stuff like that.
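For the smoker numbers, a sketch along these lines might be enough. Again, the rows are made up for illustration, the "smoker" and "sex" column names are assumed from the usual insurance.csv layout, and smoker_breakdown is a hypothetical name.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; the real notebook would read insurance.csv.
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
25,female,24.0,0,yes,southwest,15000.0
40,male,31.5,2,no,southeast,6000.0
52,male,28.0,1,yes,southeast,24000.0
33,female,22.5,0,no,northwest,4500.0
61,male,35.0,3,no,northeast,13000.0
"""


def smoker_breakdown(rows):
    """Count smokers vs. non-smokers overall, then split each group by sex."""
    by_status = Counter(row["smoker"] for row in rows)
    by_status_and_sex = Counter((row["smoker"], row["sex"]) for row in rows)
    return by_status, by_status_and_sex


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_status, by_status_and_sex = smoker_breakdown(rows)

print("smokers:", by_status["yes"], "non-smokers:", by_status["no"])
for (status, sex), n in sorted(by_status_and_sex.items()):
    print(f"smoker={status}, sex={sex}: {n}")
```

Counting (smoker, sex) pairs in a single Counter keeps the overall totals and the per-sex split consistent with each other.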
I’m confused about this function: def region_highest_BMI(name): and the results, “(12175, ‘southeast’)”
Did you mean the region with the highest avg. BMI?
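If the average is what you were after, one possible approach is to sum the BMI values per region, divide by the number of people in each region, and then take the maximum. This is just a sketch, not a rewrite of your function: the rows are made up, the "region" and "bmi" column names are assumed, and region_highest_avg_bmi is a hypothetical name.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only. The single highest BMI (40.0) is in the
# southeast, but the highest average belongs to the northwest.
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
25,female,40.0,0,yes,southeast,15000.0
40,male,22.0,2,no,southeast,6000.0
52,male,33.0,1,yes,northwest,24000.0
33,female,35.0,0,no,northwest,4500.0
61,male,28.0,3,no,southwest,13000.0
"""


def region_highest_avg_bmi(rows):
    """Return (region, average BMI) for the region with the highest mean BMI."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["region"]] += float(row["bmi"])  # csv values are strings
        counts[row["region"]] += 1
    averages = {region: totals[region] / counts[region] for region in totals}
    best = max(averages, key=averages.get)
    return best, averages[best]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
best_region, best_avg = region_highest_avg_bmi(rows)
print(best_region, round(best_avg, 2))
```

The sample is built so that the region holding the single largest BMI is not the one with the largest average, which is exactly the difference between the two questions.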
I think it would be great if you added some conclusions at the end, since you stated your initial questions at the top of the notebook.
Hey, thanks a lot for the nice feedback!
Yeah, I recognize that I need to start adding some #comments explaining what each function does, and maybe how it works too. Documentation is very important for software engineers, and I should get used to explaining the steps I take.
I didn’t think of finding out the actual number of people in the dataset and checking whether it was balanced, but I see other coders did that, and it is a very important step that I left out. So thank you for pointing it out, and thank you also for writing out the actual numbers for me!
As for the function you are asking about, my idea was to find out which region had the highest BMI on average, but all I did was find which region has the highest BMI in general, which is not helpful at all, so I’ll need to improve this function later.
Thank you, really, for the nice feedback. Looking forward to seeing your project!