About the Portfolio Project: U.S. Medical Insurance Costs category

Wow, I like the way you organize each patient’s data into a dict. That makes me want to rethink how my own code handles the data from the original CSV file! Thanks! :green_heart:

Cool, @chi8.
Thanks for the feedback and let me know if I can help you with any project!


Hello! I have just completed the U.S. Medical Insurance Costs Project and have submitted my analysis here for review. The project was very interesting to make, but some parts are a little more complicated than they could be. Simple solutions didn’t work for me, so I used the ones that work. I hope that at the end of the course I will come back to this project to make it lighter. If you already have any tips, let me know.
Here is my solution for this project GitHub - MayaViko/medical_insurance: Portfolio project

Dear coders : )

I’ve spent a lot of time on this project and quite enjoyed the process. Not only the coding part, but also the SCOPING of the project, and the exploring of what I want to know from this data and what bias there might be.

Below are my project Goals:

  • Find out the average age of the patients in the dataset.
  • Analyze where a majority of the individuals are from.
  • Figure out what the average age is for someone who has at least one child.
  • Find out what might cause higher charges of insurance cost.
  • Look at the different costs in each category, including sex, children status, smoking status, regions, and age groups
  • Count the share of patients in each category to see whether there might be bias in the dataset.

A code review would be very much appreciated! Thank you! :grinning:
Here is my project link via Github


I’ve just had a look at your code. :slight_smile:
I think for this part of the code, after the else statement,
count_in_regions[i] should probably equal 1 instead of 0. : )

Maybe you could check by adding up the number of patients from each region and comparing that with the total number of patients.

#Number of individuals in each region

def number_of_patients_in_locations(region_list):
    count_in_regions = {}
    for i in region_list:
        if i in count_in_regions:
            count_in_regions[i] += 1
        else:
            count_in_regions[i] = 0
    return count_in_regions

pations_in_location = number_of_patients_in_locations(region)
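
A minimal sketch of the fix suggested above, assuming `region` is a plain list of region strings. The sample list here is made up for illustration and is not the real dataset. The first time a region is seen it should start at 1, and the counts should then add up to the number of patients.

```python
def number_of_patients_in_locations(region_list):
    count_in_regions = {}
    for i in region_list:
        if i in count_in_regions:
            count_in_regions[i] += 1
        else:
            # The first sighting already counts as one patient
            count_in_regions[i] = 1
    return count_in_regions

# Hypothetical sample, not the real dataset
region = ["southwest", "southeast", "southeast", "northwest", "northeast", "southeast"]
patients_in_location = number_of_patients_in_locations(region)

# Sanity check: the per-region counts should sum to the total number of patients
assert sum(patients_in_location.values()) == len(region)
print(patients_in_location)
```

With the original `= 0`, every region would come out one short, so the sum check would fail by the number of distinct regions.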

Happy coding!

hello, my name is Basel, not like the plant (Basil).

Anyway, I know that this is not the right place for such a question but I am still confused as to how to reach out to other people to team up and do the project.

Can anybody elaborate, please?


Just finished my portfolio project on U.S Medical Insurance Costs. Check it out and feel free to let me know what I can do to improve and further enhance the code. Click here → U.S Medical Insurance Costs - DG

Hello everyone, here is my first try at my project. I know it looks a bit basic compared to what others produced, but this is the best I could do for now at least. I might come back and add a few things to it in the future. Looking forward to your feedback. Many thanks.

Hi all, here is my first attempt at this project. Appreciate any feedback you can offer!


Dear coders ,

I found it really interesting to analyze this dataset using the Python skills that I have developed in this course. The main tasks have been completed. Here is my project link via Github. Please feel free to give feedback or comments. Thanks in advance.

Hello everyone, this is my first project ever : GitHub - Rafik-Sebia/U.S.-Medical-Insurance-Costs-Project: For this project, we will be investigating a medical insurance costs dataset in a .csv file using fundamental Python skills.
I know it’s not as fancy as those who used libraries like pandas and matplotlib, but I didn’t learn them yet, so if someone has some feedback, it would be nice, thank you

Hello all, below is a link to my GitHub repository for the U.S. Medical Insurance Project. I used Jupyter Notebook to complete the project and uploaded the files to GitHub. The project took me a few days, mainly trying to get my feet wet and understand how GitHub works. Any feedback is welcome! The code ran successfully, but I am open to criticism. Thanks, and be well.


Hello all,

Below is a link to my Github with my attempt at the US Medical Insurance Portfolio Project. I tested every piece of code and it works, but positive and constructive feedback is more than welcome.

Thanks in advance!

hello all, i hope everyone is doing well, this is my attempt, i am pretty sure there are areas where i need to improve so your feedback would be much appreciated

Hello guys, this is my 2nd take on this project, after learning Pandas and Matplotlib : GitHub - Rafik-Sebia/2nd-take-U.S.-Medical-Insurance-Project: The second take on the US Medical Insurance Portfolio Project

any feedback would be nice, thanks

It has taken me a long while to share my solution to the project. I finally overcame that fear and I am putting it out for everyone to see. I am open to feedback and criticism.


here is my code. interesting how female non-smokers pay more than male non-smokers…
on the surface it looks like the average insurance cost is higher for men than women, but that’s weighted by the fact more men are smokers and therefore pay a premium for that. however, this average conceals the fact that non-smoking women will pay more than non-smoking men.
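
A rough sketch of how that comparison could be checked, assuming the records are dicts with `sex`, `smoker`, and `charges` keys as in the CSV. The helper name `average_charges_by` is hypothetical, and the four rows below are invented purely to show the mechanics; they are not taken from the dataset.

```python
from collections import defaultdict

def average_charges_by(records, *keys):
    """Average 'charges' for each combination of the given keys."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum of charges, row count]
    for record in records:
        group = tuple(record[k] for k in keys)
        totals[group][0] += float(record["charges"])
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}

# Invented rows for illustration only
records = [
    {"sex": "male", "smoker": "yes", "charges": "30000"},
    {"sex": "male", "smoker": "no", "charges": "8000"},
    {"sex": "female", "smoker": "no", "charges": "9000"},
    {"sex": "female", "smoker": "no", "charges": "9000"},
]

by_sex = average_charges_by(records, "sex")
by_sex_and_smoker = average_charges_by(records, "sex", "smoker")
print(by_sex)
print(by_sex_and_smoker)
```

In this toy data the male average is higher overall, yet the non-smoking male average is lower than the non-smoking female one, which is the same kind of pattern described above.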


I am new to Codecademy, joined about 2 months ago ))
I have just completed my first off-platform project within the Data Scientist course path. Feel free to have a look at it and share any advice on how to improve it.

Hello! I need some help with a part of my code. I tried to go a little further in the analysis and ended up stuck. It might be something simple that is right under my nose, but I can’t find it.
I have this function that I want to pass a list of the four regions and a list of dictionaries containing some records. The function should return a new dictionary with four keys (one for each region), with each record’s insurance cost appended to its region. However, when I use the function, I get the exact same costs for all four regions.
I know there are easier ways of doing this. I just wonder why this way does not work.
I’d appreciate it if someone could give me a hand with this one.

# I reduced the size of the records list for the example's sake
records = [
    {'age': '31', 'sex': 'female', 'bmi': '25.74', 'children': '0', 'smoker': 'no', 'region': 'southeast', 'charges': '3756.6216'},
    {'age': '46', 'sex': 'female', 'bmi': '33.44', 'children': '1', 'smoker': 'no', 'region': 'southeast', 'charges': '8240.5896'},
    {'age': '37', 'sex': 'female', 'bmi': '27.74', 'children': '3', 'smoker': 'no', 'region': 'northwest', 'charges': '7281.5056'},
    {'age': '60', 'sex': 'female', 'bmi': '25.84', 'children': '0', 'smoker': 'no', 'region': 'northwest', 'charges': '28923.13692'},
    {'age': '56', 'sex': 'female', 'bmi': '39.82', 'children': '0', 'smoker': 'no', 'region': 'southeast', 'charges': '11090.7178'},
]

regions = ['northeast', 'northwest', 'southeast', 'southwest']

def charges_regions(regions, records):
    charges_per_region = dict.fromkeys(regions, [])
    for record in records:
        for region in charges_per_region.keys():
            if record["region"] == region:
                charges_per_region[region].append(float(record["charges"]))
                break
    return charges_per_region

print(charges_regions(regions, records))
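
The likely cause: `dict.fromkeys(regions, [])` stores the same list object under every key, so an append made through one key shows up under all of them. Below is a minimal sketch of one way around it, building a separate list per region with a dict comprehension. The two records are trimmed down from the example above to just the fields the function reads.

```python
def charges_regions(regions, records):
    # The comprehension evaluates [] once per key, so each region gets its own list
    charges_per_region = {region: [] for region in regions}
    for record in records:
        region = record["region"]
        if region in charges_per_region:
            charges_per_region[region].append(float(record["charges"]))
    return charges_per_region

regions = ['northeast', 'northwest', 'southeast', 'southwest']
records = [
    {'region': 'southeast', 'charges': '3756.6216'},
    {'region': 'northwest', 'charges': '7281.5056'},
]

# Demonstrates the original problem: every key points at one shared list
shared = dict.fromkeys(regions, [])
print(shared['northeast'] is shared['southwest'])

result = charges_regions(regions, records)
print(result)
```

Looking the region up directly also removes the need for the inner loop and the `break`. `dict.fromkeys` is still fine when the default value is immutable, such as `0` or `None`.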

Hello everyone,
Not sure if this is where I should post this, but I would appreciate a review of my code. I will attach the accompanying csv as well, although I am pretty sure most of you have a copy already.

I look forward to what you have to say,