Pandas DataFrame [[ ]] versus Series [ ]

In the DecisionTree exercise “Find the Flag” we are asked to extract our labels from the flags DataFrame by using:
labels = flags[["Landmass"]]

I am aware that using double square brackets returns a DataFrame instead of a pandas Series. But I am not sure whether we need a DataFrame in this case, as we only want to extract a single column.

Is there an advantage to working with a DataFrame even when it is only a single column?

I would otherwise go for: labels = flags["Landmass"]
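
To make the difference concrete, here is a small sketch of the two forms. The flags DataFrame below is a made-up stand-in (the values and the "Red" column are only assumptions for illustration), not the exercise's real data:

```python
import pandas as pd

# Hypothetical stand-in for the exercise's flags data
flags = pd.DataFrame({
    "Landmass": [1, 3, 4, 5],
    "Red": [1, 0, 1, 1],
})

as_frame = flags[["Landmass"]]  # list of column names -> DataFrame (2D)
as_series = flags["Landmass"]   # single column name -> Series (1D)

print(type(as_frame).__name__, as_frame.shape)
print(type(as_series).__name__, as_series.shape)
```

The first form keeps a second dimension, shape (n_rows, 1), while the second is one-dimensional, shape (n_rows,).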

Thanks in advance.


I think either one is fine. train_test_split() does not change the dimensionality of the arrays it splits, so the choice only affects the second parameter (y) of the .fit() and .score() methods. Both methods accept either a 1D or a 2D array for that parameter:

y : array-like of shape (n_samples,) or (n_samples, n_outputs)
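
If you want to check this yourself, here is a rough sketch that fits a DecisionTreeClassifier with both kinds of labels. The data is invented for illustration (the real flags dataset has many more columns), and random_state=1 is just an arbitrary choice so that both runs use the same split:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for the flags dataset
flags = pd.DataFrame({
    "Red":      [1, 0, 1, 1, 0, 1, 0, 0],
    "Green":    [0, 1, 1, 0, 1, 0, 1, 0],
    "Landmass": [1, 2, 1, 1, 2, 1, 2, 2],
})
data = flags[["Red", "Green"]]

candidates = [
    ("DataFrame", flags[["Landmass"]]),  # 2D labels, shape (n_samples, 1)
    ("Series", flags["Landmass"]),       # 1D labels, shape (n_samples,)
]

scores = {}
label_dims = {}
for name, labels in candidates:
    # Same random_state, so both runs get the same train/test rows
    X_train, X_test, y_train, y_test = train_test_split(
        data, labels, random_state=1
    )
    tree = DecisionTreeClassifier(random_state=1)
    tree.fit(X_train, y_train)
    scores[name] = tree.score(X_test, y_test)
    label_dims[name] = y_train.ndim
    print(name, y_train.shape, scores[name])
```

The labels stay 2D or 1D after the split, and the tree trains and scores in both cases. Other estimators may be stricter about a column-shaped y, so the Series form is probably the safer habit.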