Does anyone know where the dataset used in US Medical Insurance Costs was taken from?

I would like background information on it as well.

I thought it came from Kaggle?


One thing about it that’s always bothered me, though, is that BMI is included in the data. It’s not an accurate indicator of one’s health, and many feel it can be discriminatory. Couple that with people’s inherent biases, and it leads to assumptions about the people in the data set.

I wasn’t aware of BMI’s inaccuracy! Also, BMI was the only variable I couldn’t draw any viable conclusions from. Which is funny, since there seems to be a bias that people who weigh more pay more for insurance.
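
For anyone who wants to check that "weigh more, pay more" idea against the data rather than assume it, here is a minimal sketch that computes the Pearson correlation between BMI and cost. It assumes the file has header columns named `bmi` and `charges`; both names are assumptions, so adjust them to whatever your copy of the dataset uses.

```python
import csv
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def bmi_charges_correlation(lines):
    """Correlate the 'bmi' and 'charges' columns of CSV text.

    `lines` is any iterable of CSV lines (an open file works).
    The column names 'bmi' and 'charges' are assumptions about the header row.
    """
    rows = list(csv.DictReader(lines))
    bmi = [float(r["bmi"]) for r in rows]
    charges = [float(r["charges"]) for r in rows]
    return pearson(bmi, charges)
```

With a local copy of the file (the name `insurance.csv` is hypothetical), you could call `bmi_charges_correlation(open("insurance.csv", newline=""))`. A value near 0 would mean little linear relationship, and a value near 1 a strong positive one. Either way, correlation alone doesn't say why costs differ.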

Yeah, unfortunately. BMI is not an accurate indicator of one’s health. It’s weight divided by height squared, so it can’t tell muscle from fat, and muscle is denser than fat. A simple Google search will pull up articles from the medical community that debate the number.
Insurance costs in the U.S. are also based on the median income of zip codes. (It’s a racket, really :frowning: ). The cost of a procedure in one zip code will be different in another…
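
To make the BMI point concrete: the standard formula is weight in kilograms divided by the square of height in metres. A minimal sketch (the function name `bmi` is just for illustration):

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2
```

The function only ever sees weight and height, so a muscular person and a sedentary person with the same two measurements get exactly the same number. That is the core of the objection above.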

I thought I recently read somewhere that BMI doesn’t give an accurate picture at the individual level, but it might be useful for extracting information about a larger population.

If you have a hypothetical country where the average BMI is really high, then surely the population as a whole tends to be overweight.

Having data and interpreting the data correctly are two different things.

I’m saying, based on medical articles I’ve read and what I’ve heard from doctors, that it’s not an accurate measure of body fat but a biased one. People are automatically written off as unhealthy when the opposite could be true. It shouldn’t be part of the insurance cost calculation. But apparently that’s a difficult thing to change (on a societal level).

But now we’re straying from the original question. So, apologies.

But then you are measuring at the individual level, for which BMI is no good. Then you are not interpreting the data correctly.

A little bit, sorry.

If you’re interested in further researching health insurance (coverage & related issues) in the U.S., I highly recommend taking a look at the ACS (American Community Survey) from the U.S. Census:
ACS Health Insurance tables

You can download files by year as one-year or five-year estimates. You can also look at coverage and income/poverty levels. It does require some cleaning/reorganizing if you want to save it as a CSV file and load it into Jupyter or Colab.
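
As a rough idea of what that cleaning might look like, here is a sketch that tidies the header labels and turns number-like cells (including ones with thousands separators such as `1,234`) into floats. It assumes a plain CSV with a single header row; real downloads may be laid out differently, and the helper names below are made up for illustration.

```python
import csv
import re


def clean_header(name):
    """Lower-case a column label and replace runs of non-alphanumerics with '_'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def to_number(cell):
    """Convert cells like '1,234' or '12.5' to float; leave other text unchanged."""
    text = cell.strip().replace(",", "")
    try:
        return float(text)
    except ValueError:
        return cell.strip()


def load_clean_rows(lines):
    """Parse CSV lines (e.g. an open file) into dicts with tidy keys and numeric values.

    Assumes the first line is the only header row.
    """
    reader = csv.reader(lines)
    header = [clean_header(h) for h in next(reader)]
    return [dict(zip(header, (to_number(c) for c in row))) for row in reader]
```

In a notebook, the resulting list of dicts can be passed straight to `pandas.DataFrame(...)` if you prefer working with a DataFrame.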

What is the ACS?

Another excellent resource for health-related data (and analysis) is KFF (Kaiser Family Foundation).

