I would like to share my take on the US Medical Insurance Cost project.
It is my first standalone data analysis project.
I was interested in finding out how the region influences the insurance cost. In my analysis I tried to find the least expensive and the most expensive regions in terms of medical costs.
I found the project quite interesting, although it would be great to have a better description of the data set. For example, it is still not clear to me what the charges are: is it the money the person paid for the insurance, or the actual medical costs the insurance company paid to hospitals for the person's medical services? What do you think?
The results I got from the analysis are:
For non-smokers the result of the analysis is quite conclusive:
A non-smoker of a given age, BMI and number of children will have the lowest insurance charges in the Southeast. The difference between the Southeast and the Southwest is not that big, though, and is in fact difficult to establish because the Southeast is skewed by its much higher average BMI. The most expensive region is the Northwest, with charges 14% higher on average.
For smokers the results are not decisive, because the BMI values have a bigger influence. It looks like the same average BMI skew in the southern regions, combined with the smoking status, drives the regional prices there, and it is not feasible to describe that influence quantitatively with the given data set.
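In case anyone wants to check the regional comparison themselves, here is a minimal sketch of the kind of summary I mean: mean charges next to mean BMI per region, for one smoking status at a time. It is a simplification, not my full analysis (it does not hold age or number of children fixed). The column names `region`, `smoker`, `bmi` and `charges`, the `"yes"`/`"no"` values, and the helper name `regional_summary` are assumptions for illustration, and the sample rows are made up, not taken from the real data set.

```python
from collections import defaultdict
from statistics import mean


def regional_summary(rows, smoker):
    """Mean charges and mean BMI per region for one smoking status.

    `rows` is a list of dicts; the keys used here are assumed column names.
    """
    groups = defaultdict(list)
    for row in rows:
        if row["smoker"] == smoker:
            groups[row["region"]].append(row)
    return {
        region: {
            "n": len(members),
            "mean_charges": mean(r["charges"] for r in members),
            "mean_bmi": mean(r["bmi"] for r in members),
        }
        for region, members in groups.items()
    }


# Made-up rows purely for illustration; NOT values from the real data set.
sample_rows = [
    {"region": "southeast", "smoker": "no", "bmi": 34.0, "charges": 4000.0},
    {"region": "southeast", "smoker": "no", "bmi": 32.0, "charges": 6000.0},
    {"region": "northwest", "smoker": "no", "bmi": 28.0, "charges": 5500.0},
    {"region": "northwest", "smoker": "no", "bmi": 30.0, "charges": 6500.0},
    {"region": "southeast", "smoker": "yes", "bmi": 35.0, "charges": 40000.0},
]

summary = regional_summary(sample_rows, smoker="no")
for region, stats in sorted(summary.items()):
    print(region, stats)
```

Seeing the mean BMI right beside the mean charges makes it easier to spot when a regional difference might really be a BMI difference. If you load the real file with `csv.DictReader`, remember that every value comes back as a string, so `bmi` and `charges` would need converting with `float()` first.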
I would like to hear any comments or feedback. Let me know if you found similar or different patterns and conclusions.