Review for my portfolio project: Medical Insurance

I’ll share my project.

Congrats on completing the project.

Some things to consider:

  • Add a brief intro at the top of the notebook, including the data source, some questions you initially had about the data, etc. You might also want to include a README file for the repo.

  • Rather than looking at the mean (average) of the costs, look at the median. If you look at the min and max, you will see that there are outliers in the dataset that pull the mean upward, well above the median (the 50% row).
    Example:

df['charges'].describe().round(2)
>>>
count     1338.00
mean     13270.42
std      12110.01
min       1121.87
25%       4740.29
50%       9382.03
75%      16639.91
max      63770.43
  • You cannot conclude that there is a correlation between smokers’ cost of insurance and region of the country. That conclusion would require statistical significance testing, which you did not do.
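
To make the last point concrete, here is a minimal sketch of one way to check significance: a permutation test on the difference in medians between two groups. It is only an illustration, not the required method. The function name `permutation_p_value` and the two lists of charges are hypothetical (the numbers are made up, not taken from your dataset), and it uses only the standard library.

```python
import random
from statistics import median


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in medians."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(median(group_a) - median(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign the group labels
        diff = abs(median(pooled[:n_a]) - median(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical smoker charges for two regions (made-up values for illustration).
region_a = [31000.0, 34500.0, 29800.0, 42100.0, 38700.0, 33300.0]
region_b = [30200.0, 36900.0, 28400.0, 40500.0, 35100.0, 32600.0]

p_value = permutation_p_value(region_a, region_b)
print(f"estimated p-value: {p_value:.3f}")
```

A small p-value (commonly below 0.05) would suggest the regional difference is unlikely to be due to chance alone. With the real data you would pass in the smokers' `charges` for each region, filtered on whatever your smoker and region columns are called. For more than two regions, a test such as `scipy.stats.kruskal` is one option.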

You did a good job with the technical part (functions, etc.) and with the comments and notes you made while sifting through the data in your analysis, which help tell the story of the data. You’ll revisit this data a few more times in the course as you learn more about Python and the libraries used for data science and analysis.