Are there basic rules for when different types of brackets should be used, and how they should be compounded? For example:
df[(df.age < 30) |
(df.name == 'Martha Jones')]
and
df[df.name.isin(['Martha Jones',
'Rose Tyler',
'Amy Pond'])]
The above examples use compounds of () and [] brackets, and I’m unsure of the underlying rules that determine the order in which different types of brackets should be applied in the context of Pandas dataframes.
Thanks!
Understandably, it can be confusing. Yep, there are rules, and they’re covered in the documentation, which can seem a bit “soupy” on first read but becomes clearer with practice. I will add that I think the cheatsheet for the lesson is missing a lot of information.
df[(df.age < 30) |
(df.name == 'Martha Jones')]
Here, you are filtering the rows of a df based on the values in its age and name columns.
Directions: " For instance, suppose we wanted to select all rows where the customer’s age was under 30 or the customer’s name was “Martha Jones”"
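When you want to select from a pandas df, you use square brackets, and the logical expression goes inside them: df[(df.age *then insert logical expression here*)]. Each comparison gets its own parentheses because Python evaluates | before < and ==.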
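To make the bracket roles concrete, here is a minimal sketch that builds the filter one step at a time. The sample data and the variable names (under_30, is_martha, mask) are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "name": ["Martha Jones", "Rose Tyler", "Amy Pond", "Donna Noble"],
    "age": [42, 28, 25, 35],
})

# Each parenthesized comparison produces a boolean Series, one value per row.
under_30 = (df["age"] < 30)
is_martha = (df["name"] == "Martha Jones")

# | combines the two boolean Series element-wise (logical OR).
mask = under_30 | is_martha

# Square brackets with a boolean Series keep only the rows where it is True.
result = df[mask]
print(result)
```

Written on one line, this is the same df[(...) | (...)] pattern from the exercise: parentheses group each comparison, and the outer square brackets do the selecting.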
BUT, you will notice that here you’re using a “.” to access the column inside the parentheses. In this case it’s sort of shorthand (for lack of a better word that I can think of) for the bracket form: df.age refers to the same column as df['age'].
The documentation tells us this:
https://pandas.pydata.org/docs/user_guide/indexing.html#attribute-access
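As a rough illustration of that attribute-access rule, the sketch below compares the dot form with the bracket form. The DataFrame and the "first name" column are invented for the example.

```python
import pandas as pd

# Hypothetical data; "first name" is included because it contains a space.
df = pd.DataFrame({
    "age": [42, 28],
    "first name": ["Martha", "Rose"],
})

# Dot access and bracket access return the same column.
same = df.age.equals(df["age"])
print(same)

# Dot access only works when the column name is a valid Python identifier,
# so a name containing a space has to use the bracket form.
first_names = df["first name"]
print(first_names.tolist())
```

The bracket form works for any column name, so it is the safer default when a name contains spaces or clashes with an existing DataFrame method.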
For the second example:
df[df.name.isin(['Martha Jones',
'Rose Tyler',
'Amy Pond'])]
.isin() is a method (& methods are called with parens) that checks each value in a column against a set of values. In your example, you’re passing a list [ ] as the argument. (You’re looking for rows whose name is one of those three names.)
The structure for that is:
DataFrame.isin(values)
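The values that you pass can be an iterable, Series, dictionary, or DataFrame. (Your example calls .isin() on a single column, which is a Series; Series.isin works the same way but expects a list-like or a set.)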
That info is here:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html
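Here is a small sketch of the second example broken into steps, assuming made-up sample data. The names list and the mask and selected variables are just for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "name": ["Martha Jones", "Rose Tyler", "Amy Pond", "Donna Noble"],
    "age": [42, 28, 25, 35],
})

# Square brackets here build an ordinary Python list.
names = ["Martha Jones", "Rose Tyler", "Amy Pond"]

# Parentheses call the method; the list is its argument.
# The result is a boolean Series: True where the name is in the list.
mask = df["name"].isin(names)

# The outer square brackets select the rows where the mask is True.
selected = df[mask]
print(selected)
```

So each bracket type does one job: [ ] around the names makes a list, ( ) calls .isin(), and the outer df[ ] selects the matching rows.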
Happy coding! 