FAQ: Data Cleaning with Pandas - Reshaping your Data

This community-built FAQ covers the “Reshaping your Data” exercise from the lesson “Data Cleaning with Pandas”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Practical Data Cleaning

FAQs on the exercise Reshaping your Data

There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.


3 posts were split to a new topic: Incorrect hint in the Reshaping your Data exercise in Data Cleaning with Pandas course

For the last part of this exercise, the solution is

print(students.exam.value_counts())

but that results in this error:

Traceback (most recent call last):
  File "script.py", line 10, in <module>
    print(students.exam.value_counts())
  File "/var/codecademy/runner_contexts/python/cc_python3_shared.py", line 97, in series_str_with_print
    return self._internal_str()
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py", line 5179, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute '_internal_str'

What does the error mean?


I struggled to understand this exercise for a few minutes before checking the solution code. It would be helpful to clarify that pd.melt(frame=df_name…) does not alter the specified dataframe in place, even though it takes the dataframe to melt as an argument. It returns a new dataframe, so the result has to be assigned back to the original name, i.e. df_name = pd.melt(frame=df_name…).

This distinction was not immediately clear either from the lesson’s instructions or from the hint provided for the second step. The second step does state:

Use pd.melt() to create a new table (still called students) that follows this structure.

However, it feels like a stretch to interpret that instruction as calling for the assignment when it was not discussed or illustrated anywhere else in the lesson. Hope that helps anyone else that might be confused here.
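
For anyone who wants to see the difference, here is a minimal sketch. The small students table is a hypothetical stand-in for the exercise data (the names and scores are invented), and it uses only one id column to keep things short.

```python
import pandas as pd

# Hypothetical stand-in for the exercise's students table (values invented).
students = pd.DataFrame({
    "full_name": ["Ann Lee", "Bo Kim"],
    "fractions": [80, 70],
    "probability": [90, 60],
})

# Calling pd.melt() without assigning the result leaves students unchanged.
pd.melt(frame=students, id_vars=["full_name"],
        value_vars=["fractions", "probability"],
        var_name="exam", value_name="score")
columns_before = list(students.columns)

# Assigning the returned dataframe back to the same name keeps the reshaped table.
students = pd.melt(frame=students, id_vars=["full_name"],
                   value_vars=["fractions", "probability"],
                   var_name="exam", value_name="score")
columns_after = list(students.columns)

print(columns_before)  # still the wide columns
print(columns_after)   # now the long-format columns
```

The first print shows the original wide columns, and the second shows full_name, exam and score, because only the assigned call replaced the table.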


There is something wrong with the second step.
I wrote the code below and the system does not mark it as correct. I have now looked at the solution code, and it is the same as mine.

students = pd.melt(frame=students, id_vars=['full_name', 'gender_age', 'grade'], value_vars=['fractions', 'probability'], value_name='score', var_name='exam')

My browser is Safari 13.0.5.

It seems to me that df.melt(…) and df.pivot(…) are related. Is one the inverse of the other? Is there a similarity between them?
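
One way to explore this question is a small round trip. This is only a sketch on a hypothetical two-student table (invented values, a single id column), not the exercise data. It melts a wide table to long format and then pivots it back.

```python
import pandas as pd

# Hypothetical wide table (values invented for illustration).
wide = pd.DataFrame({
    "full_name": ["Ann Lee", "Bo Kim"],
    "fractions": [80, 70],
    "probability": [90, 60],
})

# Wide -> long: one row per (student, exam) observation.
long_form = pd.melt(frame=wide, id_vars=["full_name"],
                    value_vars=["fractions", "probability"],
                    var_name="exam", value_name="score")

# Long -> wide: spread the exam names back out into columns.
restored = long_form.pivot(index="full_name", columns="exam",
                           values="score").reset_index()
restored.columns.name = None  # drop the leftover "exam" label on the column axis

print(long_form)
print(restored)
```

In this small case the pivot recovers the original table, so the two behave like inverses here. That does not hold in general: pivot raises a ValueError when an index/column pair appears more than once, and it sorts rows and columns, so the original ordering may not come back.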

I’m unsure why I get an error on Step 3, both in Visual Studio Code and here, yet I still get the green check mark here.

import pandas as pd
from students import students

# Print out the columns of students.
for column in students.columns:
    print(column)

# There is a column for the scores on the fractions exam,
# and a column for the scores on the probabilities exam.
# We want to make each row an observation, so we want to transform this table to look like:
# Use pd.melt() to create a new table (still called students) that follows this structure.
students = pd.melt(frame=students, id_vars=['full_name', 'gender_age', 'grade'], value_vars=[
    'fractions', 'probability'], value_name="score", var_name="exam")

# Print the .head() and the .columns of students.
# Also, print out the .value_counts() of the column exam.
print(students.head())
print(students.columns)
print(students.value_count("exam"))

Traceback (most recent call last):
  File "script.py", line 19, in <module>
    print(students.value_count("exam"))
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py", line 5179, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'value_count'

Print the .head() and the .columns of students.
Also, print out the .value_counts() of the column exam.
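
Going by the traceback, the AttributeError is about the name value_count: the pandas method is spelled value_counts (with a trailing s), and the instruction asks for it on the exam column rather than on the whole dataframe. A minimal sketch of that call follows, using a hypothetical already-melted table with invented values. It does not explain why the checker still shows a green check mark.

```python
import pandas as pd

# Hypothetical melted table standing in for students (values invented).
students = pd.DataFrame({
    "full_name": ["Ann Lee", "Bo Kim", "Ann Lee", "Bo Kim"],
    "exam": ["fractions", "fractions", "probability", "probability"],
    "score": [80, 70, 90, 60],
})

# Select the exam column (a Series), then count how often each value occurs.
exam_counts = students["exam"].value_counts()
print(exam_counts)
```

students.exam.value_counts() is the same call written with attribute access, which matches the solution quoted earlier in this thread.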