https://www.codecademy.com/practice/projects/this-is-jeopardy
I’m doing this off-platform with the larger data set. I’m working on “Explore from Here” suggestion 2:
Is there a connection between the round and the category? Are you more likely to find certain categories, like "Literature" in Single Jeopardy or Double Jeopardy?
Here are the relevant parts of my code so far:
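import pandas as pd
# None means "no limit" on column width; -1 is deprecated in newer pandas versions
pd.set_option('display.max_colwidth', None)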
df = pd.read_csv('jeopardy.csv')
df.columns = ['show', 'date', 'round', 'cat', 'val', 'quest', 'ans']
cat_round = df.groupby(['cat', 'round']).quest.count().reset_index()
cat_round_pivot = cat_round.pivot(
    columns='round',
    index='cat',
    values='quest'
).reset_index()
print(cat_round_pivot)
The output for my print command is good:
round                         cat  Double Jeopardy!  Final Jeopardy!  Jeopardy!  Tiebreaker
0      A JIM CARREY FILM FESTIVAL               NaN              NaN        5.0         NaN
1                             "!"               NaN              NaN        5.0         NaN
2                         "-ARES"               5.0              NaN        NaN         NaN
3             "-ICIAN" EXPEDITION               NaN              NaN        5.0         NaN
4                   "...OD" WORDS               5.0              NaN        NaN         NaN
etc.
So, to answer the question, I need to find the row in the table where cat == "Literature". It should be simple to do this, but nothing I’ve tried works. Here’s what I’ve tried:
litPiv = cat_round_pivot(cat_round_pivot['cat'] == 'Literature').reset_index()
//TypeError: 'DataFrame' object is not callable
litPiv = cat_round_pivot.xs(('Literature')).reset_index()
//KeyError: 'Literature'
print(cat_round_pivot[cat_round_pivot.index == "Literature"])
//KeyError: False
print(cat_round_pivot[cat_round_pivot.index["Literature"]])
//IndexError: only integers, slices (':'), ellipsis ('...'), numpy.newaxis ('None') and integer or boolean arrays are valid indices
So: How do I search a pivot table for a specific row?
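For reference, here is a minimal sketch of two ways such a row lookup could work. It uses a tiny made-up table in place of jeopardy.csv (the rows and counts are placeholders, not real data) and assumes the category label is typed exactly as it appears in the printed output above, for example A JIM CARREY FILM FESTIVAL.

```python
import pandas as pd

# Made-up stand-in for jeopardy.csv: placeholder rows, not real data
df = pd.DataFrame({
    'cat': ['A JIM CARREY FILM FESTIVAL', 'A JIM CARREY FILM FESTIVAL', '"-ARES"'],
    'round': ['Jeopardy!', 'Jeopardy!', 'Double Jeopardy!'],
    'quest': ['q1', 'q2', 'q3'],
})

cat_round = df.groupby(['cat', 'round']).quest.count().reset_index()
label = 'A JIM CARREY FILM FESTIVAL'

# Option 1: after reset_index(), 'cat' is an ordinary column, so filter with a
# boolean mask inside square brackets (parentheses would try to call the DataFrame).
cat_round_pivot = cat_round.pivot(columns='round', index='cat', values='quest').reset_index()
row = cat_round_pivot[cat_round_pivot['cat'] == label]
print(row)

# Option 2: skip reset_index() so 'cat' stays as the index, then look the label up with .loc.
pivot_indexed = cat_round.pivot(columns='round', index='cat', values='quest')
by_label = pivot_indexed.loc[label]
print(by_label)
```

With the mask approach the result is still a DataFrame (possibly empty if nothing matches), while .loc on a missing label raises a KeyError, so the mask is the more forgiving choice when the exact spelling of a category is uncertain.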
Note: I also tried answering the question without the pivot table by filtering the grouped data directly (but I still want to know how to use the pivot table):
print(cat_round[cat_round.cat == 'Literature'])
but no results came up:
Empty DataFrame
Columns: [cat, round, quest]
Index: []
This result seems unlikely, as I’ve watched Jeopardy before and think there must have been a category called Literature in the 30 years or so the data set covers.
Update: Putting ‘LITERATURE’ (in all caps) gave the expected result:
cat round quest
16377 LITERATURE Double Jeopardy! 381
16378 LITERATURE Final Jeopardy! 10
16379 LITERATURE Jeopardy! 105
However, when I try it with a mixed-case category I know for sure is in there, the mixed-case spelling matches:
print(cat_round[cat_round.cat == 'eBay'])
cat round quest
31660 eBay Double Jeopardy! 5
So I’m still a bit mystified. Any help/explanation would be welcome!
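One possible explanation is that == on strings is case-sensitive, and the data set seems to store most categories in capitals while a few, such as eBay, keep their own capitalization. Below is a minimal sketch of a case-insensitive lookup under that assumption. The small table reuses the counts printed above, and find_category is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Small stand-in for cat_round, using the counts shown in the output above
cat_round = pd.DataFrame({
    'cat': ['LITERATURE', 'LITERATURE', 'LITERATURE', 'eBay'],
    'round': ['Double Jeopardy!', 'Final Jeopardy!', 'Jeopardy!', 'Double Jeopardy!'],
    'quest': [381, 10, 105, 5],
})

def find_category(frame, name):
    """Hypothetical helper: rows whose 'cat' equals name, ignoring case."""
    # casefold() normalizes case on both sides before comparing
    mask = frame['cat'].str.casefold() == name.casefold()
    return frame[mask]

print(find_category(cat_round, 'Literature'))
print(find_category(cat_round, 'ebay'))
```

Normalizing both sides at comparison time leaves the stored category names untouched, so the original spelling still shows up in the printed result.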