A question regarding Seaborn and using DataFrames

Hello everyone,

I have a question about Seaborn and using a DataFrame. Below is the code from the exercise I am doing. My question is about the very last part of the code, under "# Add your code below":

import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# Take in the data from the CSVs as NumPy arrays:
set_one = np.genfromtxt("dataset1.csv", delimiter=",")
set_two = np.genfromtxt("dataset2.csv", delimiter=",")
set_three = np.genfromtxt("dataset3.csv", delimiter=",")
set_four = np.genfromtxt("dataset4.csv", delimiter=",")

# Creating a Pandas DataFrame:
n = 500
df = pd.DataFrame({
    "label": ["set_one"] * n + ["set_two"] * n + ["set_three"] * n + ["set_four"] * n,
    "value": np.concatenate([set_one, set_two, set_three, set_four])
})

# Setting styles:
sns.set_style("darkgrid")
sns.set_palette("pastel")

# Add your code below:

sns.kdeplot(set_one, shade=True)
sns.kdeplot(set_two, shade=True)
sns.kdeplot(set_three, shade=True)
sns.kdeplot(set_four, shade=True)
plt.show()

Here's my question: how does Seaborn know that it has to fetch the columns from df? I actually got the exercise wrong because I tried to specify the DataFrame by writing df.column_name:

sns.kdeplot(df.set_one, shade=True)
sns.kdeplot(df.set_two, shade=True)
sns.kdeplot(df.set_three, shade=True)
sns.kdeplot(df.set_four, shade=True)
plt.show()

@frankjimenez93270473,

Seaborn doesn't know to do that. In this example, the accepted code takes the NumPy arrays that you defined at the top and uses those as the datasets; it is not magically pulling the columns out of the Pandas DataFrame.
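
To make that concrete, here is a minimal sketch. It assumes random data in place of the four CSV files (the distributions and the seed are made up for illustration, not taken from the exercise). It shows that the arrays are ordinary variables that exist independently of df, and that df itself only has the two columns named in its constructor:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for dataset1.csv ... dataset4.csv (random data, not the real files).
rng = np.random.default_rng(0)
n = 500
set_one = rng.normal(0, 1, n)
set_two = rng.normal(2, 1, n)
set_three = rng.normal(4, 1, n)
set_four = rng.normal(6, 1, n)

# Same construction as in the exercise: one long "value" column plus a "label" column.
df = pd.DataFrame({
    "label": ["set_one"] * n + ["set_two"] * n + ["set_three"] * n + ["set_four"] * n,
    "value": np.concatenate([set_one, set_two, set_three, set_four]),
})

# The DataFrame has exactly two columns; the set names are values inside "label".
print(list(df.columns))        # ['label', 'value']
print(hasattr(df, "set_one"))  # False

# The arrays are plain NumPy arrays that kdeplot can take directly.
print(type(set_one).__name__, set_one.shape)  # ndarray (500,)
```

So when the accepted solution calls sns.kdeplot(set_one, ...), it never touches df at all.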

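Passing DataFrame columns to kdeplot is perfectly valid in general, but note that df as built above has only two columns, label and value, so df.set_one does not exist and raises an AttributeError. To plot from the DataFrame you would select the rows for each label and pass their value column instead. It is also likely that Codecademy set up a test to check the code, and that test specifically looks for the NumPy arrays being passed to kdeplot. In practice, either the arrays or correctly selected DataFrame columns would be fine.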