What happens if there are empty or missing values in a CSV file?

Question

In Pandas, what happens if there are empty or missing values in a CSV file and we try to read it?

Answer

If the CSV file contains missing values, then when we read the file, Pandas will populate the missing cells with NaN. NaN is short for “Not a Number”, and is used to signify missing values.

If needed, we can replace these NaN values with an actual value, like 0 or an empty string '', using the fillna() method. Or, we can drop any rows that contain an empty value, using dropna().
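As a rough sketch of how fillna() and dropna() behave, here is a small hand-built DataFrame. The rows are illustrative only (the third one is made up so that dropna() has something left to keep), and both methods return a new DataFrame rather than changing the original.

```python
import numpy as np
import pandas as pd

# Illustrative data: two rows with a missing value, one complete row.
df = pd.DataFrame({
    "name": [np.nan, "Birthday Cake", "Sundae"],
    "flavor": ["chocolate", np.nan, "vanilla"],
    "topping": ["chocolate shavings", "gold sprinkles", "cherry"],
})

filled = df.fillna("")   # replace every NaN with an empty string
dropped = df.dropna()    # keep only the rows that have no missing values

print(filled)
print(dropped)
```

Since neither call modifies df, assign the result to a variable if you want to keep it.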

For example, if we had a CSV file containing the following:

name,flavor,topping
,chocolate,chocolate shavings
Birthday Cake,,gold sprinkles

The second row is missing a value in the first column, and the third row is missing a value in the second column.

When we read this file using Pandas read_csv, it will load it like so, filling in the missing values with NaN.

name             flavor        topping
NaN              chocolate     chocolate shavings
Birthday Cake    NaN           gold sprinkles
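
To try this without creating a file, the same CSV text can be fed to read_csv through io.StringIO. This is only a sketch for experimenting; with a real file you would pass its path instead.

```python
import io
import pandas as pd

# The CSV content from the example above, held in a string instead of a file.
csv_text = """name,flavor,topping
,chocolate,chocolate shavings
Birthday Cake,,gold sprinkles
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df)         # the empty fields show up as NaN
print(df.isna())  # True wherever a value is missing
```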

I want to start applying my lessons from this point in the curriculum. I have saved a file to my hard drive, "11 11 CSV file.csv". I placed it on the desktop for simplicity.

I tried to load it in both Codecademy’s IDE and the one I downloaded with Python.

In addition to the line of code pd.read_csv('11 11 CSV file.csv'), what else do I need to do to begin working with this data?

I am operating on a Mac. If there was a recommended IDE for Mac I would be curious about your suggestion for this as well.

My guess is that the CSV file you want to use has to be in the same directory you are working in, or else you need to give its full path. Otherwise Pandas will not be able to find it.

Personally, I use Visual Studio Code IDE and would certainly recommend it.

After importing pandas and creating the CSV file, you can load the file from basically anywhere. Just specify the full path in the argument you pass.

pd.read_csv('C:\\Users\\Akashdeep\\Desktop\\Test\\test.csv')

Here I created a folder named Test on my Desktop and put my test.csv file inside it.
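
For anyone on a Mac, or anyone who would rather not type backslashes, here is a sketch of the same idea using pathlib. A temporary folder stands in for the Desktop so the snippet runs anywhere; on a Mac the Desktop is usually Path.home() / "Desktop", but that is an assumption about your setup. The sample row written to the file is made up.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for a folder such as the Desktop (assumption: swap in your own folder).
folder = Path(tempfile.mkdtemp())
csv_path = folder / "11 11 CSV file.csv"

# Write a tiny made-up file so there is something to read.
csv_path.write_text("name,flavor\nBirthday Cake,vanilla\n")

print(Path.cwd())         # relative file names are looked up in this directory
print(csv_path.exists())  # quick check that the path points at a real file

# Assign the result to a variable so you can keep working with the data.
df = pd.read_csv(csv_path)
print(df.head())
```

If read_csv raises FileNotFoundError, comparing Path.cwd() with the file's actual location is usually the quickest way to see why.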

I was also interested in a case when there were not enough commas and tried the following CSV file:

name,flavor,topping
chocolate,chocolate shavings
Birthday Cake,gold sprinkles

The result seems to have NaN added to the end of the rows:

name                 flavor                topping
chocolate            chocolate shavings    NaN
Birthday Cake        gold sprinkles        NaN
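
This can be reproduced with a short sketch that reads the same text through io.StringIO instead of a file:

```python
import io
import pandas as pd

# Each data row has only two fields, while the header names three columns.
csv_text = """name,flavor,topping
chocolate,chocolate shavings
Birthday Cake,gold sprinkles
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df["topping"].isna())  # the last column is missing in every row
```

Note that Pandas fills columns from left to right, so "chocolate" ends up under name rather than flavor. It cannot tell which field was left out.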

What is the difference between .read_csv() and .to_csv()?

.to_csv() saves your DataFrame as a .csv file. .read_csv() reads an existing file into a DataFrame for use in your Python session.
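
A quick round trip shows the two working together. This is just a sketch: the file name cakes.csv and the temporary folder are placeholders, and index=False is passed so the row index is not written as an extra column.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": [np.nan, "Birthday Cake"],
    "flavor": ["chocolate", np.nan],
})

# Placeholder location; use any path you like.
out_path = Path(tempfile.mkdtemp()) / "cakes.csv"

df.to_csv(out_path, index=False)  # DataFrame -> file
print(out_path.read_text())       # NaN values are written as empty fields

loaded = pd.read_csv(out_path)    # file -> DataFrame
print(loaded)                     # the empty fields come back as NaN
```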