Data Science Path, finished Python fundamentals - Group Project: U.S. Medical Insurance Costs

Hi there,

I am currently working on the ‘US Medical Insurance Cost Portfolio Project’ after finishing Python Fundamentals within the Data Science path.

Unfortunately I am stuck and would appreciate some advice.

I am working on the project on my own since I guess I wouldn’t be a good group member at the moment due to time restrictions. But, yes, I do appreciate the promotion of group work in the course in general.

While working on the project, I started by error-checking the given CSV file. Since there are no ‘errors’ in the file, I added some records with missing or ‘inadequate’ data, for example records with empty fields or with characters in fields where there should be numbers (‘xxx’ in age, for example); see the attached screenshot.


Afterwards I tried to extract these records with the Python tools covered so far. It gave me some headaches, but I am almost through.

I created the individual lists for the CSV columns and combined them into one nested list. I also created a dictionary of the data (with consecutive numbers as keys). I extracted the records containing empty fields or ‘inadequate’ data.
I checked whether the age and bmi values can be converted to int and float respectively, and then converted them.
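To illustrate the kind of check described above, here is a minimal sketch and not the code from my notebook. The helper name `to_number`, the sample rows, and the column order (age, sex, bmi) are assumptions made up for this example.

```python
def to_number(value, cast):
    """Return cast(value), or None if the value is empty or not convertible."""
    try:
        return cast(value.strip())
    except (ValueError, AttributeError):
        # ValueError: empty string or text such as 'xxx'
        # AttributeError: value is not a string (for example None)
        return None


# Hypothetical sample rows in the assumed order: age, sex, bmi
raw_records = [
    ["19", "female", "27.9"],
    ["xxx", "male", "33.77"],   # characters where a number is expected
    ["28", "male", ""],         # empty field
    ["33", "male", "22.705"],
]

clean_records = []      # rows whose age and bmi were converted
rejected_records = []   # rows with missing or improper data

for record in raw_records:
    age = to_number(record[0], int)
    bmi = to_number(record[2], float)
    if age is None or bmi is None:
        rejected_records.append(record)
    else:
        clean_records.append([age, record[1], bmi])

print(clean_records)
print(rejected_records)
```

The sketch builds two new lists instead of changing the list it loops over, since removing items from a list while iterating over it can cause elements to be skipped.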

There is still one issue:
I find a few records in my data_list_revised_2 where the age and bmi values are still strings.
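One way to pin down which records are affected is a small type check over the nested list. This is only a sketch: the function name `find_unconverted`, the sample data, and the column positions (age at index 0, bmi at index 2) are assumptions for illustration and would have to be adapted to the real data_list_revised_2.

```python
def find_unconverted(records, age_index=0, bmi_index=2):
    """Return (position, record) pairs whose age or bmi is still a str."""
    return [
        (position, record)
        for position, record in enumerate(records)
        if isinstance(record[age_index], str)
        or isinstance(record[bmi_index], str)
    ]


# Hypothetical stand-in for the revised list; the second row was not converted
sample = [
    [19, "female", 27.9],
    ["18", "male", "33.77"],
    [28, "male", 33.0],
]

leftovers = find_unconverted(sample)
for position, record in leftovers:
    print(position, record)
```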

See my Jupyter notebook at [GitHub - njemba/Group-Project-U.S.-Medical-Insurance-Costs], specifically the lists at index 2, 10 and 18 in the output of the code block starting with:

# extracts all records with fields containing improper data and appends these to a seperate list
def data_list_revised_wrongs():

Does anybody have an idea what the reason might be?
I would highly appreciate some feedback.


Good morning,
In the meantime I identified the mistake I had made myself. Thanks to all.