Pandas DataFrame: define function to filter rows based on column contents (This Is Jeopardy)

Hi, I’m writing a function to filter rows in a pandas DataFrame, jeopardy_data, based on the contents of its ‘ Questions’ column (renamed to ‘question’ in my project).

I wrote a function that works for single strings:

def question_keyword_filter(dataframe, string):
	return dataframe[dataframe['question']\
	.str.contains('{}'.format(string), case=False)]

but it fails when a list is passed as the ‘string’ argument. Specifically, it returns all the rows in the DataFrame.
I attempted to modify it like this:

def question_keyword_filter(dataframe, string):
	searchfor = '{}'.format(string)
	return dataframe[dataframe['question']\
	.str.contains('|'.join(searchfor), case=False)]

But this doesn’t work at all…
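
For anyone wondering why both attempts above misbehave, here is a small sketch of the patterns they actually build. The keyword list and the sample sentence are made up for illustration. The point is that '{}'.format(...) turns a list into its printed form, and str.contains treats its argument as a regular expression by default, so the square brackets end up being read as a character class.

```python
import re

keywords = ['king', 'england']  # hypothetical example input

# First attempt: format() turns the list into its printed form.
pattern_one = '{}'.format(keywords)
print(pattern_one)  # ['king', 'england']

# As a regex, [...] is a character class: it matches any single character
# listed inside the brackets (letters, quote, comma, space), so nearly
# every question matches.
matches_unrelated = re.search(pattern_one, 'What is a zebra?') is not None
print(matches_unrelated)  # True

# Second attempt: joining a *string* puts '|' between its characters.
pattern_two = '|'.join('{}'.format('king'))
print(pattern_two)  # k|i|n|g

# Joining the list itself gives the intended "either keyword" pattern.
pattern_three = '|'.join(keywords)
print(pattern_three)  # king|england
```

So the join idea was on the right track; it just has to be applied to the list, not to the formatted string.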

Does the Series.str.contains() method work for lists? Is the solution simply to define my local variable ‘searchfor’ differently for Series.str.contains('|'.join(searchfor))? Or could I write the function so that it iterates through a list if the ‘string’ argument is a list?

I’m just going to copy the solution code so I can keep moving for now, but I’m curious if there is a legitimate way to modify the function in my approach so that it can take lists of strings as input.

Exercise:
https://www.codecademy.com/paths/data-science/tracks/dscp-data-manipulation-with-pandas/modules/dacp-data-manipulation-challenge-projects/projects/this-is-jeopardy

Found a bit of a bizarre solution eventually.

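import numpy as np

def question_keyword_filter(dataframe, string):
	if isinstance(string, list):
		# One boolean mask per keyword. np.vstack expects a sequence such as
		# a list; recent NumPy versions reject a bare generator.
		search_masks = [dataframe['question'].str.contains(word, case=False) for word in string]
		# Keep a row only if every keyword matched it (the 'AND' condition).
		combined_mask = np.vstack(search_masks).all(axis=0)
		return dataframe[combined_mask]
	else:
		return dataframe[dataframe['question']\
		.str.contains('{}'.format(string), case=False)]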

I’m very happy with this code. The function works perfectly for the ‘AND’ condition when given a list of strings as the ‘string’ argument. For the ‘OR’ condition, fixing the ‘searchfor’ local variable was all that was needed:

def question_keyword_filter(dataframe, string):
	if isinstance(string, list):
		searchfor = string
		return dataframe[dataframe['question']\
		.str.contains('|'.join(searchfor), case=False)]
	else:
		return dataframe[dataframe['question']\
		.str.contains('{}'.format(string), case=False)]
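
In case it helps, the two versions above could probably be folded into one function. This is only a sketch under my own assumptions: the match_all parameter, the use of re.escape (so keywords containing characters like '(' or '.' are matched literally), na=False (so missing questions count as non-matches), and the tiny sample DataFrame are all additions for illustration, not part of the project.

```python
import re

import pandas as pd


def question_keyword_filter(dataframe, keywords, match_all=True):
    # Accept a single string by wrapping it in a list.
    if isinstance(keywords, str):
        keywords = [keywords]
    # Build one boolean mask per keyword.
    masks = [
        dataframe['question'].str.contains(re.escape(word), case=False, na=False)
        for word in keywords
    ]
    # Put the masks side by side, then combine them row by row.
    combined = pd.concat(masks, axis=1)
    mask = combined.all(axis=1) if match_all else combined.any(axis=1)
    return dataframe[mask]


# Made-up sample data, just to exercise the function.
jeopardy_data = pd.DataFrame({'question': [
    'This King of England had six wives',
    'The king of the jungle',
    "England's capital city",
    None,
]})

both = question_keyword_filter(jeopardy_data, ['king', 'england'])
either = question_keyword_filter(jeopardy_data, ['king', 'england'], match_all=False)
print(len(both), len(either))  # 1 3
```

Note that an empty keyword list would make pd.concat raise an error, so a real version might want to handle that case explicitly.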

I’m leaving this here in case anyone else looks for a similar solution while working on this project. <3