I am currently working on the ‘US Medical Insurance Cost Portfolio Project’ after finishing Python Fundamentals within the Data Science path.
Unfortunately I am stuck and would appreciate some advice.
I am working on the project on my own, since I wouldn’t be a good group member at the moment due to time restrictions. But, yes, I do appreciate the promotion of group work in the course in general.
While working on the project I started concentrating on error checking the given csv file. Since there are no ‘errors’ in the file, I added some records with missing or ‘inadequate’ data, for example records with empty fields or with characters in fields where there should be numbers (‘xxx’ in age, for example). See the attached screenshot.
Afterwards I tried to extract these records with the Python tools covered so far. It gave me some headaches, but I am almost through.
I created the individual lists for the csv columns and combined them into one nested list. I also created a dictionary of the data (with consecutive numbers as keys). I extracted the records containing empty fields or ‘inadequate’ data.
For age and bmi, I checked whether each value can be converted to int and float respectively, and converted the values that can.
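To make the setup concrete, here is a minimal sketch of the structures described above. The sample rows, the in-memory file and the variable names are assumptions for illustration only; they are not copied from my notebook.

```python
import csv
import io

# Hypothetical sample standing in for the project's csv file,
# with two deliberately bad records ('xxx' in age, empty bmi).
SAMPLE = """age,sex,bmi,children,smoker,region,charges
19,female,27.9,0,yes,southwest,16884.924
xxx,male,33.77,1,no,southeast,1725.5523
28,male,,3,no,southeast,4449.462
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# One list per column (only two shown here).
ages = [row["age"] for row in rows]
bmis = [row["bmi"] for row in rows]

# Nested list: one inner list per record, all values still strings.
data_list = [list(row.values()) for row in rows]

# Dictionary with consecutive numbers as keys.
data_dict = {i: record for i, record in enumerate(data_list)}
```

At this stage every value is a string, including the empty bmi field, which the csv module reads as `''`.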
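The conversion check works roughly like the sketch below. The helper name `try_convert` and the two-column sample records are hypothetical, so this shows the idea and not the exact code in my notebook.

```python
def try_convert(value, caster):
    """Return (True, converted) if caster(value) works, else (False, value)."""
    try:
        return True, caster(value.strip())
    except (ValueError, AttributeError):
        # ValueError: text such as 'xxx' or ''; AttributeError: value is not a str.
        return False, value


# Hypothetical records holding only [age, bmi] as strings.
records = [["19", "27.9"], ["xxx", "33.77"], ["28", ""]]

clean, rejected = [], []
for age, bmi in records:
    age_ok, age_val = try_convert(age, int)
    bmi_ok, bmi_val = try_convert(bmi, float)
    if age_ok and bmi_ok:
        clean.append([age_val, bmi_val])   # converted values
    else:
        rejected.append([age, bmi])        # original strings, untouched
```

This sketch builds two new lists instead of deleting from the list being looped over, because removing items from a list during iteration can cause elements to be skipped.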
Still there is one issue:
A few records in my data_list_revised_2 still have age and bmi stored as strings.
See my Jupyter notebook at [GitHub - njemba/Group-Project-U.S.-Medical-Insurance-Costs]: the lists at index 2, 10 and 18 in the output window of the code block starting with:
# extracts all records with fields containing improper data and appends these to a seperate list
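To pin down which records are affected, a check along these lines could be run against the revised list. The function name `leftover_strings`, the column positions and the sample data are assumptions for illustration; the real list has more columns.

```python
def leftover_strings(records, age_idx=0, bmi_idx=2):
    """Return the positions of records whose age or bmi is still a str."""
    return [
        i for i, rec in enumerate(records)
        if isinstance(rec[age_idx], str) or isinstance(rec[bmi_idx], str)
    ]


# Hypothetical sample: the middle record was never converted.
sample = [
    [19, "female", 27.9],
    ["23", "male", "31.2"],
    [28, "male", 33.0],
]
print(leftover_strings(sample))
```

It only reports where unconverted values remain; it does not explain why they were missed.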
Does anybody have an idea what the reason might be?
I would highly appreciate some feedback,