US Medical Insurance Project - Review appreciated!

Hello, and thank you for clicking on my post!

I found this project challenging but not impossible, and I spent three days on it (one or two hours a day). I suspect the analysis I attempted will be much easier once I build more skills and learn to use libraries like pandas.

Here is my project: https://github.com/eHemink/us_medical_insurance_costs/blob/main/us-medical-insurance-costs%20(1).ipynb

I’d greatly appreciate any comments and feedback!

Congrats on completing the project.

  • Good use of comments; it’s easy to follow along with your thought process during EDA.

  • I would print the counts of smokers vs. non-smokers in the dataset (274 vs. 1064, respectively) so readers have a baseline to go on.

  • You didn’t do any hypothesis testing, so I’d be careful about using the word “correlation” here (with regard to BMI, for example). BMI is also a controversial number and not an accurate measure of overall health, because it leaves out a number of other factors about an individual.

Overall, you have a solid grasp of writing functions to pull insights out of the data, and your work is very easy to follow.
Good work!