Question
In Pandas, what happens if there are empty or missing values in a CSV file and we try to read it?
Answer
If the CSV file contains missing values, Pandas populates the missing cells with NaN when it reads the file. NaN is short for "Not a Number" and is used to signify missing values.
If needed, we can replace these NaN values with an actual value, like 0 or an empty string '', using the fillna() method. Or, we can drop any rows that contain a missing value, using dropna().
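As a minimal sketch of both options, the snippet below reads a small in-memory CSV and then applies fillna() and dropna(). The column names item and price and the sample values are made up for illustration, and io.StringIO stands in for a real file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV contents (not from the lesson), kept in memory
# so the example does not depend on a file on disk.
csv_text = "item,price\napple,1.5\n,2.0\ncherry,\n"

df = pd.read_csv(io.StringIO(csv_text))

# Replace NaN per column: empty string for text, 0 for numbers.
filled = df.fillna({"item": "", "price": 0})

# Alternatively, drop every row that has at least one NaN.
dropped = df.dropna()

print(filled)
print(dropped)
```

Neither method changes df itself. Each returns a new DataFrame, so the result has to be assigned to a variable to be kept.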
For example, if we had a CSV file containing the following:
name,flavor,topping
,chocolate,chocolate shavings
Birthday Cake,,gold sprinkles
Counting the header as the first row, the second row is missing a value in the first column (name), and the third row is missing a value in the second column (flavor).
When we read this file using Pandas read_csv(), it will load it like so, filling in the missing values with NaN.
name           flavor     topping
NaN            chocolate  chocolate shavings
Birthday Cake  NaN        gold sprinkles
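To try this out, here is a small sketch that loads the same three lines and checks where the NaN values ended up. It assumes the CSV text is held in a string and passed through io.StringIO rather than read from a file. The variable names are illustrative only.

```python
import io

import pandas as pd

# The same CSV contents as in the example above, as an in-memory string.
csv_text = (
    "name,flavor,topping\n"
    ",chocolate,chocolate shavings\n"
    "Birthday Cake,,gold sprinkles\n"
)

df = pd.read_csv(io.StringIO(csv_text))
print(df)

# isna() marks each missing cell with True; summing counts them per column.
missing = df.isna()
print(missing.sum())
```

Checking isna().sum() right after loading is a quick way to see which columns need fillna() or dropna() before doing any further work.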