What happens if there are empty or missing values in a CSV file?



Question

In Pandas, what happens if there are empty or missing values in a CSV file and we try to read it?

Answer

If the CSV file contains missing values, Pandas fills the corresponding cells with NaN when it reads the file. NaN is short for “Not a Number” and is used to signify missing values.

If needed, we can replace these NaN values with an actual value, like 0 or an empty string '', using the fillna() method. Or, we can drop any rows that contain an empty value, using dropna().
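
As a minimal sketch of both options, the snippet below builds a small DataFrame by hand rather than reading a file. The column names and the third, complete row are made up for illustration. Note that fillna() and dropna() each return a new DataFrame by default and leave the original unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration; np.nan marks the missing cells.
df = pd.DataFrame({
    "name": [np.nan, "Birthday Cake", "Vanilla Dream"],
    "flavor": ["chocolate", np.nan, "vanilla"],
})

# fillna() returns a copy with every NaN replaced by the given value.
filled = df.fillna("")

# dropna() returns a copy without any row that contains a NaN.
dropped = df.dropna()

print(filled)
print(dropped)
```

Here only the third row survives dropna(), since the first two rows each contain a missing value.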

For example, if we had a CSV file containing the following:

name,flavor,topping
,chocolate,chocolate shavings
Birthday Cake,,gold sprinkles

Counting the header as the first row, the second row is missing a value in the first column (name), and the third row is missing a value in the second column (flavor).

When we read this file with Pandas' read_csv(), it loads the data like so, filling in the missing values with NaN:

name             flavor        topping
NaN              chocolate     chocolate shavings
Birthday Cake    NaN           gold sprinkles
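
To try this out without creating a file, one option is to pass the same CSV text to read_csv() through io.StringIO, which here stands in for a file on disk (an assumption made for this sketch). The isna() method can then confirm which cells were read as missing.

```python
import io

import pandas as pd

# The same CSV content as above, held in a string instead of a file.
csv_text = """name,flavor,topping
,chocolate,chocolate shavings
Birthday Cake,,gold sprinkles
"""

# io.StringIO lets read_csv() treat the string like an open file.
df = pd.read_csv(io.StringIO(csv_text))

# isna() returns True for every cell that was read as NaN;
# summing per column counts the missing values in each one.
missing_per_column = df.isna().sum()

print(df)
print(missing_per_column)
```

With a real file, the call would instead take the file's path as its first argument.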