Cleaning Datasets with Python: Python Strings: Medical Insurance Project

Codecademy Exercise Link:
https://www.codecademy.com/journeys/data-scientist-aly/paths/dsalycj-22-data-science-foundations/tracks/dsalycj-22-python-fundamentals-for-data-science-part-ii/modules/dsf-python-strings-df5478c1-b7d4-4ede-b5df-191256f19859/projects/ds-python-strings-project

I am having trouble cleaning this data set using Python. I tried using the answer that the exercise hints give me, but it says record_clean is not defined. I clearly defined that variable inside the for loop. Why is it not defined?

medical_data = \
"""Marina Allison   ,27   ,   31.1 , 
#7010.0   ;Markus Valdez   ,   30, 
22.4,   #4050.0 ;Connie Ballard ,43 
,   25.3 , #12060.0 ;Darnell Weber   
,   35   , 20.6   , #7500.0;
Sylvie Charles   ,22, 22.1 
,#3022.0   ;   Vinay Padilla,24,   
26.9 ,#4620.0 ;Meredith Santiago, 51   , 
29.3 ,#16330.0;   Andre Mccarty, 
19,22.7 , #2900.0 ; 
Lorena Hodson ,65, 33.1 , #19370.0; 
Isaac Vu ,34, 24.8,   #7045.0"""

# Add your code here

# print(medical_data)
updated_medical_data = medical_data.replace('#','$')

# print(updated_medical_data)
num_records = 0 

for i in updated_medical_data:
  if i == '$':
    num_records += 1
print(f" \n There are {num_records} medical records in the data.")

medical_data_split = updated_medical_data.split(";")
print(medical_data_split)

medical_records = [] 

medical_records_clean = []

for record in medical_records:
  record_clean = [] 
  for item in record:
    record_clean.append(item.strip())
medical_records_clean.append(record_clean)

Hi @strikeouts27

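You are trying to create a list ‘record_clean’ for every ‘record’ in the ‘medical_records’ list. At first glance, ‘medical_records’ is empty, so the loop body never runs and ‘record_clean’ is never created. The final append sits outside the loop, so it still runs and fails with the "not defined" error. Are you sure the loop is not supposed to iterate over ‘medical_data_split’?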

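medical_data_split = updated_medical_data.split(";")
print(medical_data_split)

medical_records = [] 
medical_records_clean = []

for record in medical_data_split:
  # Each record is still one string, so split it into fields first
  fields = record.split(",")
  medical_records.append(fields)
  record_clean = [] 
  for item in fields:
    record_clean.append(item.strip())
  # Indented so it runs once per record, inside the outer loop
  medical_records_clean.append(record_clean)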
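
The "not defined" error can also be reproduced on its own. This is a minimal sketch that relies only on how a for loop behaves over an empty list. The variable names mirror the post, and `error_message` is a name added here purely for illustration:

```python
medical_records = []  # empty, as in the original post

for record in medical_records:
    # Never executed: there is nothing to iterate over
    record_clean = []

try:
    # Outside the loop, so it runs even though the loop body never did
    print(record_clean)
    error_message = None
except NameError as exc:
    # Hypothetical helper variable, kept only to show the message
    error_message = str(exc)

print(error_message)
```

An assignment inside a loop body only happens if the loop runs at least once, which is why the name does not exist afterwards.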

Cleaning datasets in Python, as in the Medical Insurance Project, relies on string manipulation to prepare data for analysis. The process includes splitting strings into individual data points, removing unnecessary whitespace, and converting data types. Careful cleaning makes the later analysis more accurate and reliable, which matters in the healthcare domain.
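
The three steps named above (splitting, stripping, converting types) might be combined roughly as follows. This is only a sketch, not the exercise's official solution: the function name `clean_records`, the dictionary keys, and the rule of skipping records that do not have exactly four fields are all assumptions made for illustration. The sample uses the first two records from the post.

```python
# Small sample in the same layout as the exercise data (values copied from the post)
sample_data = """Marina Allison   ,27   ,   31.1 , 
#7010.0   ;Markus Valdez   ,   30, 
22.4,   #4050.0 """


def clean_records(raw):
    """Hypothetical helper: split into records and fields, strip, convert types."""
    cleaned = []
    for record in raw.split(";"):
        # Split one record into fields and trim surrounding whitespace/newlines
        fields = [item.strip() for item in record.split(",")]
        if len(fields) != 4:
            continue  # assumption: skip empty or malformed records
        name, age, bmi, cost = fields
        cleaned.append({
            "name": name,
            "age": int(age),
            "bmi": float(bmi),
            # Drop a leading '#' or '$' before converting the cost to a number
            "insurance_cost": float(cost.lstrip("#$")),
        })
    return cleaned


records = clean_records(sample_data)
for row in records:
    print(row)
```

Converting the age, BMI, and cost to numbers at this stage means later calculations, such as averages, do not need to handle stray symbols or whitespace.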