I need help figuring out the difference between df.columns and df[columns]. Second, why can't I use the number I got from isnull().sum() to calculate the percentage?

https://www.codecademy.com/paths/data-science/tracks/data-processing-pandas/modules/dspath-multiple-tables-pandas/projects/multi-tables-proj

df.columns returns the column labels of your dataframe as a pandas Index, which behaves much like a list. df[...] is a way of selecting data from your dataframe. df[columns] will only return something other than an error if you have defined a variable called columns (e.g. columns = ['a', 'b', 'c']) whose labels exist in df, in which case it returns a dataframe with only the columns 'a', 'b' and 'c'.
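Here is a minimal sketch of the difference. The dataframe and its column names are made up for illustration and are not taken from the project:

```python
import pandas as pd

# Hypothetical data, only here to show the two expressions side by side
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8]})

# df.columns is an attribute holding the column labels (a pandas Index)
labels = df.columns
print(labels.tolist())

# df[columns] selects data; `columns` must be a variable you defined yourself
columns = ["a", "b", "c"]
subset = df[columns]
print(subset)
```

If columns had never been assigned, df[columns] would raise a NameError before pandas even looked at it.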

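df.isnull() returns a new dataframe filled with booleans (True or False) in place of each cell in df: True if the value is null, False if it is not. Calling .sum() on this returned dataframe does work, because pandas counts True as 1 and False as 0, but the result is a Series with one null count per column rather than a single number. If your percentage calculation failed, a likely cause is treating that Series as one number. To get the percentage of nulls per column, divide the counts by the number of rows: df.isnull().sum() / len(df) * 100. If you want one total count for the whole dataframe, call .sum() a second time: df.isnull().sum().sum().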
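As a rough sketch of the null-percentage calculation, assuming a small made-up dataframe (the price and qty columns are hypothetical, not from the project):

```python
import numpy as np
import pandas as pd

# Hypothetical data with a few missing values
df = pd.DataFrame({
    "price": [1.0, np.nan, 3.0, np.nan],
    "qty": [1, 2, None, 4],
})

# One null count per column (a Series, not a single number)
null_counts = df.isnull().sum()

# Percentage of nulls per column: counts divided by the number of rows
null_pct = null_counts / len(df) * 100

# Equivalent shortcut: the mean of a boolean column is the fraction of True values
same_pct = df.isnull().mean() * 100

print(null_counts)
print(null_pct)
```

To pull out the number for a single column, index the Series by label, e.g. null_pct["price"].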