What information is returned from df.info()?

Question

What information is returned when we run df.info()?

Answer

Calling .info() on a Pandas dataframe prints a summary of it, including the data type of each column and the memory usage of the whole dataframe.

To see this better, let’s observe the information being printed out.
First, it will print out the data type of the object, which is a Pandas dataframe.
<class 'pandas.core.frame.DataFrame'>

Next, it will display the number of entries, or rows, in the dataframe, and the range of values of the row indexes. Here the first index is 0 and the last index is 219.
RangeIndex: 220 entries, 0 to 219

After this, it prints out the information for each data column: the column name, the number of non-null values in that column, and the data type.
id 220 non-null int64

After this, it will display overall how many columns there are per datatype in the dataframe.
dtypes: float64(1), int64(2), object(2)

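And finally, it shows the memory usage of the dataframe. The + means the figure is a lower bound, because the memory held by the values in object columns is not fully counted.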
memory usage: 8.7+ KB
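
Each piece of that summary can also be looked up on its own. Here is a minimal sketch, assuming a small made-up dataframe (the `id` and `rating` columns and their values are hypothetical, not the dataset from the lesson):

```python
import pandas as pd

# Hypothetical three-row frame, made up for illustration only.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "rating": [7.5, None, 8.1],
})

n_rows = len(df)                             # the "entries" count
non_null = df.count()                        # non-null values per column
dtype_counts = df.dtypes.value_counts()      # number of columns per dtype
total_bytes = int(df.memory_usage().sum())   # shallow memory usage in bytes

print(type(df))
print(n_rows, "entries")
print(non_null)
print(dtype_counts)
print(total_bytes, "bytes")
```

The missing value in `rating` is why its non-null count is lower than the number of rows.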

What does int64 signify?
Does it mean that, at most, 64 integers can be entered?

A 64-bit integer (-9223372036854775808 to 9223372036854775807).

Basically, 64 bits (each either a 1 or a 0) are used to represent the number.

  • 1 bit for positive or negative
  • 63 bits for the actual number

So the range is -2^63 to 2^63 - 1.
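
To check those limits yourself, here is a small sketch using NumPy's `iinfo`, which reports the bit width and bounds of an integer type (the variable name `limits` is just for illustration):

```python
import numpy as np

# Ask NumPy for the properties of the 64-bit signed integer type.
limits = np.iinfo(np.int64)

print(limits.bits)  # number of bits used per value
print(limits.min)   # smallest representable value
print(limits.max)   # largest representable value
```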

Thanks for this post, it’s really useful!
I just have an additional question: what does the “None” at the end refer to?

Why do the columns that have strings as the datatype show up as ‘object’ instead?

It’s just because you can’t store the actual data in a simple array like you can for other data types (instead you have a series of references to string objects), so you lose some of the fast vectorisation options and each string is treated as a separate object.
There’s some nice discussion here that goes into more detail-
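
A quick way to see this is to compare a numeric series with a series of strings. This is only a sketch: the object dtype is requested explicitly here, on the assumption that some newer Pandas versions may pick a dedicated string dtype by default, and the sample values are made up.

```python
import pandas as pd

numbers = pd.Series([1, 2, 3])
# Object dtype requested explicitly (an assumption to keep the result
# the same across Pandas versions).
words = pd.Series(["a", "bb", "ccc"], dtype=object)

print(numbers.dtype)  # a fixed-width numeric dtype
print(words.dtype)    # object

# Every element of the object series is an ordinary Python string.
element_types = {type(value) for value in words}
print(element_types)
```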

I would also like to know why it prints None at the end of the info.

Why does the imdb_rating column look out of place in the terminal?

The “None” at the end appears only if we print the result of .info(), because the method has no return value.
The summary we see is printed by the method itself, so it is displayed in the terminal window with or without the print statement.
Since the method returns nothing, printing its result shows None.

a = df.info()
print("""The dataframe info is above, because we called it previously. 
The None value is below, because we try to print the variable containing this method.""")
print(a)
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12 entries, 0 to 11
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   col1    12 non-null     object
 1   col2    12 non-null     object
 2   col3    12 non-null     object
 3   col4    12 non-null     object
dtypes: object(4)
memory usage: 512.0+ bytes
The dataframe info is above, because we called it previously. 
The None value is below, because we try to print the variable containing this method.
None
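
If you need the summary as text rather than printed to the terminal, .info() accepts a `buf` argument to write into. Here is a minimal sketch, assuming a small made-up dataframe (the column names and values are hypothetical):

```python
import io

import pandas as pd

# Hypothetical data, made up for illustration only.
df = pd.DataFrame({"col1": ["a", "b"], "col2": ["c", "d"]})

buffer = io.StringIO()
result = df.info(buf=buffer)  # writes the summary into buffer, not the terminal
summary = buffer.getvalue()   # the summary as an ordinary string

print(result)   # still None, because .info() has no return value
print(summary)
```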

Thank you! By far Codecademy is the best place to learn IT and coding
