Here is my attempt at the Medical Insurance Project

Hey guys,

Here is my attempt. Let me know what you think. This is my first project here on Codecademy.

Thank you for posting your project! :partying_face:
Is there a link to where this project is located?

I’m assuming this is on the DS path and that by this point you’ve gone through Pandas, Matplotlib and Seaborn? About how long did this take you, and how do you feel about it now that you’ve completed it? Is there anything else you’d do with the data, or other questions you’d explore further? I’m just curious. I like to know what people think after completing projects. :slight_smile:

  • It looks like you understand the material and have a solid grasp of how to write functions and use Tabulate and Plotly to pull observations out of the data. (Both libraries are new to me; I’ve used Pandas, Seaborn, Matplotlib, NumPy and SciPy to navigate data frames and analyze data.)
  • It’s good that you included brief explanations before the code cells so anyone reading it could understand your thought processes.
  • Also liked the brief intro and the inclusion of summary observations!

Other things to consider…
There are some pandas methods you could use to get basic info from the df, like
df.info()
df.columns
df.shape
df.isnull().sum()
df['column name'].describe() (which gives you descriptive stats of a column)
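
To make the list above concrete, here is a minimal sketch on a tiny made-up dataframe. The column names (age, bmi, smoker, region, charges) are assumed to match the insurance data, and the rows are invented purely for illustration; in the actual project you would load the CSV with pd.read_csv instead.

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumed, not taken from the project.
df = pd.DataFrame({
    "age": [19, 33, 45, 52],
    "bmi": [27.9, 22.7, None, 30.8],   # one missing value on purpose
    "smoker": ["yes", "no", "no", "yes"],
    "region": ["southwest", "northwest", "southeast", "southwest"],
    "charges": [16884.92, 21984.47, 8240.59, 28950.47],
})

df.info()                          # dtypes and non-null counts per column
print(list(df.columns))            # column names
n_rows, n_cols = df.shape          # (number of rows, number of columns)
missing = df.isnull().sum()        # count of missing values per column
bmi_stats = df["bmi"].describe()   # count, mean, std, min, quartiles, max

print(n_rows, n_cols)
print(missing)
print(bmi_stats)
```

Note that describe() skips missing values, so the count it reports can be lower than the number of rows.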

You could also divide up the data into dataframes by region by using a boolean mask with .loc:

southwest = df.loc[df['region'] == 'southwest']
southwest.head(10)

ex:

southwest['bmi'].describe()
count    325.000000
mean      30.596615
std        5.691836
min       17.400000
25%       26.900000
50%       30.300000
75%       34.600000
max       47.600000
Name: bmi, dtype: float64
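
If you want the same summary for every region without building four separate dataframes, groupby is one possible shortcut. This is only a sketch: the values below are made up, and the column names region and bmi are assumed to match the dataset.

```python
import pandas as pd

# Made-up values for illustration only.
df = pd.DataFrame({
    "region": ["southwest", "southeast", "southwest", "northwest", "southeast"],
    "bmi": [27.9, 33.0, 30.8, 22.7, 28.9],
})

# One region at a time, selected with a boolean mask.
southwest = df.loc[df["region"] == "southwest"]
print(southwest["bmi"].describe())

# All regions at once: one row of descriptive stats per region.
bmi_by_region = df.groupby("region")["bmi"].describe()
print(bmi_by_region)
```

The grouped result is itself a dataframe indexed by region, so something like bmi_by_region.loc["southwest", "mean"] pulls out a single number.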

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.describe.html
You could then make some basic Seaborn plots comparing regions based on smoking, BMI, costs, etc. The possibilities are pretty broad for EDA here.
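
As one example of such a plot, here is a rough sketch of a box plot of charges per region, split by smoking status. The data is made up and the column names (region, smoker, charges) are assumptions about the dataset; a box plot is just one choice among many.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows for illustration only.
df = pd.DataFrame({
    "region": ["southwest", "southwest", "southeast", "southeast",
               "northwest", "northwest", "northeast", "northeast"],
    "smoker": ["yes", "no", "yes", "no", "yes", "no", "yes", "no"],
    "charges": [32000.0, 6500.0, 35000.0, 8000.0,
                30000.0, 7200.0, 29500.0, 9100.0],
})

fig, ax = plt.subplots(figsize=(8, 4))
# One box per region, with smokers and non-smokers side by side.
sns.boxplot(data=df, x="region", y="charges", hue="smoker", ax=ax)
ax.set_title("Charges by region and smoking status (made-up data)")
fig.tight_layout()
```

In a notebook the figure shows up under the cell; swapping y="charges" for y="bmi" gives the same comparison for BMI.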

Good work!! :dancer:t2:

Thanks for the input. I have made it through about 30% of the DS path so far, and this was my first project. I would have liked to add some more graphs and dive into the data in a little more detail; I probably would have done some more BMI/age analysis. You mentioned a link to the project: are you talking about within Codecademy? I am familiar with df.info(), etc., but need to brush up on .loc for Pandas. In total it probably took me 5 hours or so. Did my GitHub pull up fine? I was having a little trouble with it at first.

Thanks!

I meant where in the DS path this project is located. I’m a bit fuzzy on where it is. :woman_facepalming:t2:
Yep, all looks good on GitHub!