I started the Data Science Path because I want to get some programming experience while I aim for a role in the IT sector. I spent the past week working on this project and figuring out where I wanted to take it. I definitely feel like I could have done more, but I didn’t want to spend too long working on this. I would greatly appreciate seeing any insight that you may have! I’m also very new to GitHub, so please let me know if you see something that doesn’t look quite right.
Congratulations on completing the project. You put a lot of work into this.
A few thoughts:
It might be a good idea to comment out the print() calls so people don’t have to scroll through a wall of text (like you mentioned).
On the ages not represented in the data: insurance is mostly tied to employment (not counting Medicare or the ACA), and most kids stay on their parents’ plans until age ~24, so you’re not going to see someone who’s under 18 with their own insurance.
It’s good that you used a histogram to visualize the charges, because you’ll see that there are outliers, and those pull the mean upward. So it’s better to look at the median.
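To make the mean vs. median point concrete, here is a tiny sketch. The numbers are made up for illustration and are not taken from your dataset; the variable names are just placeholders.

```python
from statistics import mean, median

# Made-up charges for illustration only (NOT from the insurance dataset).
# Five modest values plus one large outlier.
charges = [1200.0, 1500.0, 1800.0, 2100.0, 2400.0, 48000.0]

mean_charge = mean(charges)      # pulled upward by the single outlier
median_charge = median(charges)  # middle of the sorted values, barely affected

print(f"mean:   {mean_charge:.2f}")
print(f"median: {median_charge:.2f}")
```

One large value is enough to drag the mean far above what most people in the list pay, while the median stays near the typical value.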
When you say, ‘People in the Southeast tend to pay more. But is that down to the people who live there or is the insurance cost putting them at a disadvantage?’, you’re only showing the number of people represented from each region, not what they’re paying. (Unless you left something out?)
Ex: if you look at the median charges paid by region, people in the dataset in the NE pay the most:
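If it helps, this is roughly how a per-region median could be computed with just the standard library. It is a sketch, not your code: I'm assuming each row is a dict with `region` and `charges` keys (as you might get from `csv.DictReader`), the function name `median_charges_by_region` is hypothetical, and the sample rows are made up, so they say nothing about the real regions.

```python
from collections import defaultdict
from statistics import median

# Made-up rows for illustration only (NOT from the insurance dataset).
# Assumed shape: one dict per person with "region" and "charges" keys.
rows = [
    {"region": "southeast", "charges": "100.0"},
    {"region": "southeast", "charges": "200.0"},
    {"region": "southeast", "charges": "300.0"},
    {"region": "northeast", "charges": "400.0"},
    {"region": "northeast", "charges": "500.0"},
]


def median_charges_by_region(rows):
    """Group charges by region and return {region: median charge}."""
    grouped = defaultdict(list)
    for row in rows:
        # CSV readers return strings, so convert before doing math.
        grouped[row["region"]].append(float(row["charges"]))
    return {region: median(values) for region, values in grouped.items()}


def count_by_region(rows):
    """Return {region: number of rows}, i.e. how many people, not what they pay."""
    counts = defaultdict(int)
    for row in rows:
        counts[row["region"]] += 1
    return dict(counts)


medians = median_charges_by_region(rows)
counts = count_by_region(rows)
print(counts)
print(medians)
```

In this toy data the region with the most people is not the one with the highest median, which is why a head count alone can't tell you who pays more.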