Here’s a link to my Jupyter notebook for the Python Medical Insurance project. Since I have no experience with statistics or with Python data analysis packages like pandas and NumPy, the analysis is pretty basic. My twist is that everything is broken down by geographic region, to see whether any attributes differ significantly between regions (as it turns out, they don’t).
I’m lying a little about not using any data analysis packages: I implemented a very simple correlation coefficient between age and insurance costs by following some directions found here. Full disclosure: a lot of it still went over my head, and I’m counting on the later material in the Data Science career path to give me a grounding in quantitative analysis and statistics. Hopefully I won’t be disappointed.
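For anyone curious what a correlation coefficient like that involves, here is a minimal sketch of a Pearson correlation in plain Python. It is not the code from the notebook: the function name `pearson_r` and the sample ages and costs are made up for illustration, and the real dataset would be loaded from the project's CSV instead.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items each")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")

    return cov / math.sqrt(var_x * var_y)


# Hypothetical sample values, NOT taken from the insurance dataset.
ages = [19, 28, 33, 45, 60]
costs = [1800.0, 3200.0, 4100.0, 7300.0, 12500.0]
print(round(pearson_r(ages, costs), 3))
```

The result always falls between -1 and 1: values near 1 mean costs tend to rise with age, values near -1 mean they tend to fall, and values near 0 mean there is little linear relationship.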
Anyway, if anyone happens to look at this, comments are welcome. It’s not much to look at, but you might find something you like. I’d say this represents 5 hours of work or so.