U.S. Medical Insurance Cost feedback

Dear peers,
I just finished my first portfolio project on Codecademy and I would like to share the results of my work with you. I have published it on GitHub (let me know if you can access it):

https://github.com/andzej535/python_portfolio_project.git

As I already have some experience with Python and the pandas library, I decided to use it to analyze the dataset. The project took me around two 8-hour working days to finish (not continuous work).

First, I defined some questions that could help me approach the task.
Second, I analyzed the dataset and plotted several histograms and pie charts to explore the columns.
Then I tried to answer these questions with the data, producing figures where possible.

I found this project quite easy; the most difficult part was actually scoping it. After a brief analysis I found out that some of the questions I initially posed were not so interesting.

I consider this version the first iteration and would gladly receive some feedback from you :partying_face:

If you are interested in how I approached some of the questions, let me know!

Cheers,
Jakub

Hi @andz3j,

I think your repo is currently set to private. It doesn’t seem to be accessible through your page.

Thank you for the information! I made the repository public, so if you find a moment you can check it again :slight_smile:

Apologies for not being able to view this sooner. Congratulations on finishing up. You seem to have some reasonable conclusions and the data is presented in a neat way.

For a bit of feedback: You have some interesting analysis in code comments. It may be worth adding the most interesting points into more standard text as opposed to comments in the code and reiterating the important ones in the conclusion. It’s very easy to overlook coding comments.

As a rough idea: at the start (your introduction) you’d ask a question, and in the summary at the end you’d answer it.
So what are the impactful dependencies in this dataset? Does age affect charges, and so on? Note any relations that you find (ideally with statistics or a graph to back up the claim) and also note any interesting dependencies that turn out not to exist. If BMI, for example, does not affect charges, then it’s worth saying that too.
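
One way to put a number on such a dependency is a correlation coefficient. The following is only a minimal sketch using the standard library: `pearson_r` is a hypothetical helper name, and the ages and charges are made-up illustrative values, not taken from the actual dataset.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either sequence is constant.
    return covariance / sqrt(spread_x * spread_y)


# Made-up illustrative values, NOT from the real insurance data.
ages = [19, 27, 35, 44, 52, 61]
charges = [1800.0, 3100.0, 5200.0, 7400.0, 9900.0, 13200.0]

print(f"age vs. charges: r = {pearson_r(ages, charges):.2f}")
```

A value close to 1 or -1 suggests a strong linear relation, and a value close to 0 suggests none, which is just as worth reporting.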

There are a few places where you use very loose estimates in your analysis (e.g. ~$4000). Whilst you may want to describe it that way in the summary, there’s no reason not to show an actual calculation as well, at least in the analysis section and ideally with errors to back up the statements. If you concluded that something varies by $4000 and could point back to the part of your analysis where the value is actually calculated, e.g. $4000 +/- 600, then you’d be backing up your own claims.
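
As a rough sketch of what such a calculation could look like: the difference between two group means, with the standard error of that difference as the "+/-" part. The function name and the two lists of charges are made up for illustration, and the standard error of a difference of independent means is only one possible choice of error estimate.

```python
from math import sqrt
from statistics import mean, stdev


def mean_difference_with_error(group_a, group_b):
    """Return (difference of means, standard error of that difference).

    Assumes the two groups are independent samples with at least 2 values each.
    """
    difference = mean(group_a) - mean(group_b)
    # Standard error of a difference of two independent sample means.
    error = sqrt(stdev(group_a) ** 2 / len(group_a)
                 + stdev(group_b) ** 2 / len(group_b))
    return difference, error


# Made-up illustrative charges, NOT from the real insurance data.
charges_group_a = [9200.0, 11800.0, 10400.0, 12900.0, 9900.0]
charges_group_b = [6100.0, 7300.0, 5800.0, 6900.0, 7600.0]

difference, error = mean_difference_with_error(charges_group_a, charges_group_b)
print(f"difference in mean charges: {difference:.0f} +/- {error:.0f}")
```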

Thanks for your feedback!
I made some changes to the Jupyter notebook:

  • gathered observations in one cell
  • more precise numerical results
  • added analysis of BMI
  • updated conclusions

The project was really fun to do. Initially I ignored the BMI analysis, but after doing it I found that over half of the group is obese (not such healthy people)!

Data science is a really interesting and rewarding field ;>

I am Judy Ping McCormick from Melbourne. I started learning to code 45 days ago.

When I first started attacking this US medical insurance costs project, it was quite daunting. But after I finished, I was fired up and kept on doing more analysis, e.g. comparing BMI for female smokers and non-smokers, to see if it is a myth that smoking makes you slim.

However, I do have challenges with ‘class’. I would like to create classes for male, female, smoker and non-smoker, and for the different regions, and I need feedback/help on this front. Below is my work; I would like to hear your opinion.

Hello Judy,
I went through your project and I see you found results similar to mine, especially that the main contributor to insurance policy cost is smoking status. Something you could look into is how age influences the policy cost.

Regarding the class implementation, you need to ask yourself what kind of methods you will need. I would guess that, most importantly, you want to aggregate results in groups. Defining the class attributes would be fairly easy: you could do that in the cell where you read the csv file. Instead of appending records to a list, you could use that for loop to instantiate objects.

Making classes as detailed as you intend might be difficult as you have little data; it would be wiser to focus on a single class ‘Person’. But again, you need to look at your needs, especially what kind of methods could help you analyze the data more efficiently. From what I saw in your project, you reach fairly good observations and conclusions using standard concepts like lists and loops :slight_smile:
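
To make the idea concrete, here is a minimal sketch of a ‘Person’ class instantiated inside the loop that reads the csv, plus one grouping helper. Everything in it is an assumption for illustration: the column names, the inline sample rows (made up, not real data), and the helper names `load_people` and `average_charge_by`.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    sex: str
    bmi: float
    smoker: str
    region: str
    charges: float


# Made-up rows standing in for the real csv file; column names are assumed.
SAMPLE_CSV = """\
age,sex,bmi,smoker,region,charges
19,female,27.9,yes,southwest,16000.0
33,male,22.7,no,northwest,2000.0
45,female,31.2,yes,southeast,20000.0
52,male,29.4,no,northeast,4000.0
"""


def load_people(csv_file):
    """Create one Person per csv row instead of appending raw records to a list."""
    return [
        Person(
            age=int(row["age"]),
            sex=row["sex"],
            bmi=float(row["bmi"]),
            smoker=row["smoker"],
            region=row["region"],
            charges=float(row["charges"]),
        )
        for row in csv.DictReader(csv_file)
    ]


def average_charge_by(people, attribute):
    """Group people by the given attribute and average their charges."""
    groups = {}
    for person in people:
        groups.setdefault(getattr(person, attribute), []).append(person.charges)
    return {key: sum(values) / len(values) for key, values in groups.items()}


people = load_people(io.StringIO(SAMPLE_CSV))
print(average_charge_by(people, "smoker"))
print(average_charge_by(people, "sex"))
```

With a real file you would pass `open("your_file.csv", newline="")` to `load_people` instead of the in-memory string. A single class with a grouping helper covers smoker, sex and region without needing a separate class for each.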

To wrap up: before jumping into defining a class, check where you could use functions (which blocks of code are repetitive?).

On a technical note, I advise you to proofread your text as there are some spelling mistakes. Also, it would help the reader if you gathered your observations in one place at the end of the text, followed by conclusions.

However, Judy, what left me a bit bewildered is that you replied to my post by sharing your own work and asking for feedback, but without providing any comments on my work. I consider that quite rude :frowning: I don’t doubt your good intentions, but I needed to speak my mind :slight_smile:

Happy coding!

Oh Sorry! I am still new to this and was trying to post my own post asking for feedback on my project. Still learning my way around Codecademy’s forums. Sincere apologies.

By the way, it was my first response on Codecademy, and it was you calling me ‘rude’, which is quite funny in a way. Life is too short to be serious. Thank you for the feedback.

Yeah, my bad, I got too uptight :< Sincere apologies for my reaction! I completely ignored the fact that you might not be familiar with the forum. Even though I have already come across as a stiff dude, I still hope you can use my feedback and come back with more questions whenever you need; I will be happy to help!

Thank you. You are my first and only buddy in the world of coding. I only started 45 days ago. Thank you for being frank.

And how are you finding learning to code so far? :slight_smile:

Thank you for asking. I am really enjoying it. I just finished the “Chocolate Scraping” project. I am completing “Data Science” and trying to start a new career. How about you?

Same here, I am trying to finish the Data Science course. I already have a background in Python and pandas, so I am mostly aiming to learn something about machine learning algorithms.

Excellent. Machine learning is a job for the future.