First portfolio project: U.S. Medical Insurance - feedback welcome

Ok, so as the title says, this is my first portfolio project. I’m doing the Data Science track and have not started learning about Pandas etc yet, so there’s no fancy stuff or visualisations here.

https://github.com/newsocialsandra/us-medical-costs/blob/main/us-medical-insurance-costs.ipynb

This was a super challenging project for me :sweat_smile:

I haven’t written any Python code for a while and my previous knowledge of statistics/analytics is from college, more than 10 years ago. So I’ve been feeling a lot like that old “I don’t know what I’m doing” dog meme.

Do you have any suggestions for me? I have a couple of small things I would like to improve in my code, noted far down in the notebook.

But I’m also thinking: could I have used classes for this, instead of just functions? Would that have been better? Why?

And is there anything else I’m missing here?

I like how you’ve done the hypothesis and project goals sections.

One suggestion I would make is that it would have been interesting to see the averages for the dataset as a whole as well as by region. You could then compare each region to the national average. Also, like you mentioned at the end, formatting the numbers and maybe adding currency symbols would make them easier to read.
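
In case it helps, here is a minimal sketch of that comparison in plain Python (no Pandas needed). The sample rows and the names `average`, `format_usd`, and `regional_vs_overall` are made up for illustration, and I'm assuming each row has a `region` and a `charges` value, so adjust it to however your notebook stores the data.

```python
from collections import defaultdict

# Hypothetical sample rows, NOT real data from the notebook.
sample_records = [
    {"region": "southwest", "charges": 1000.0},
    {"region": "southwest", "charges": 3000.0},
    {"region": "northeast", "charges": 2000.0},
    {"region": "northeast", "charges": 6000.0},
]


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def format_usd(amount):
    """Format a number with a dollar sign, thousands separators and 2 decimals."""
    return f"${amount:,.2f}"


def regional_vs_overall(records):
    """Return the overall average and each region's (average, difference from overall)."""
    charges_by_region = defaultdict(list)
    for row in records:
        charges_by_region[row["region"]].append(row["charges"])

    overall = average([row["charges"] for row in records])
    comparison = {}
    for region, charges in charges_by_region.items():
        region_avg = average(charges)
        comparison[region] = (region_avg, region_avg - overall)
    return overall, comparison


overall, comparison = regional_vs_overall(sample_records)
print(f"Overall average: {format_usd(overall)}")
for region, (region_avg, diff) in sorted(comparison.items()):
    # The "+" in the format spec shows an explicit sign for the difference.
    print(f"{region}: {format_usd(region_avg)} ({diff:+,.2f} vs overall)")
```

Keeping the formatting in its own small function means you can reuse it for every number you print.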

Thanks for your feedback, I really appreciate it! I think it’s a great suggestion to include the averages for the dataset as a whole, I’ll have a look at it.

Nice, I really liked your BMI and smoker per region analysis and the insight you drew about why the insurance cost might be higher. Definitely something I wish I had drawn on in my own analysis!

On top of that, I like how each question is followed by the code and clearly answered at the end.

Great work!

Thanks, I’m happy to hear you liked it!