This is my first response post so please take what I say with a pinch of salt!
I think:
The .count() method was already applied in the earlier groupby() statement, so the DataFrame stored as shoe_counts has a column named ‘id’ whose values were generated by id.count().
Therefore there is no need to apply .count() again.
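To make this concrete, here is a minimal sketch of what that earlier groupby() step might look like. The table name orders and the rows in it are invented for illustration, so treat them as assumptions rather than the exercise’s actual data.

```python
import pandas as pd

# Toy data, invented for illustration; the real exercise's table will differ.
orders = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "shoe_type": ["clogs", "clogs", "boots", "boots", "boots"],
    "shoe_color": ["black", "black", "black", "brown", "brown"],
})

# Count the orders in each (shoe_type, shoe_color) group.
# After this step the 'id' column holds counts, not the original order ids.
shoe_counts = (
    orders.groupby(["shoe_type", "shoe_color"])["id"]
    .count()
    .reset_index()
)

print(shoe_counts)
```

Because the ‘id’ column of shoe_counts already contains the counts, it can be passed straight to pivot() as the values column.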
This may seem like a dumb question, but how are you supposed to know which columns go into the ‘columns’, ‘index’ and ‘values’ arguments of a pivot table? The lesson doesn’t do a very good job of explaining this, and I don’t understand why “shoe_type” and “shoe_color” are not interchangeable when creating the new DataFrame.
They are interchangeable. It’s just a matter of how you want your data to be displayed; some layouts are simply more useful or clearer than others.
Because our main focus is on shoe types and how well they’re selling depending on their color, it’s better if shoe_type is used as our reference column (the index) and shoe_color supplies the columns you compare. Since we’re used to reading from left to right, this layout makes the data easier to compare and understand.
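As a rough illustration of the two layouts, the sketch below pivots the same counts both ways. The rows in shoe_counts are made up for the example and are not the exercise’s data.

```python
import pandas as pd

# Hypothetical counts, standing in for the exercise's shoe_counts table.
shoe_counts = pd.DataFrame({
    "shoe_type": ["boots", "boots", "clogs"],
    "shoe_color": ["black", "brown", "black"],
    "id": [1, 2, 2],
})

# One row per shoe type, one column per color.
by_type = shoe_counts.pivot(index="shoe_type", columns="shoe_color", values="id")

# Same numbers, swapped: one row per color, one column per shoe type.
by_color = shoe_counts.pivot(index="shoe_color", columns="shoe_type", values="id")

print(by_type)
print(by_color)
```

Both calls are valid. The second table is simply the first one turned on its side, so the choice comes down to which variable you want to read down the rows.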
Actually, I think it isn’t the same id. You’re selecting id from the latter table, to which count has already been applied, and it doesn’t have anything to do with the id in the previous table.
What does the reset_index() do at the end of shoe_counts_pivot? I noticed that without it, shoe_type (before black) wasn’t output. Could somebody please explain?
I guess you could say reset_index() moves the leftmost column (the index) over into the regular columns so that its name is visible. In this case, the ‘shoe_type’ name was hidden because it was the index, and when we use reset_index() it becomes an ordinary column and we can see it. That’s how I understood it.
Hi there!
The shoe_type column becomes our index, yes. But when we print the DataFrame without the reset_index() method, it’s not obvious that the index is supposed to be a shoe type. By using the reset_index() method, we turn the index into a regular shoe_type column so that its name shows up alongside the others.
Plus, having a clear numerical index can be very helpful if the pivot table has hundreds of elements!
Try printing the DataFrame with and without the reset_index() method to better understand it! I hope this was helpful.
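Here is a small sketch of that comparison. The counts are invented for the example; only the column names come from the exercise.

```python
import pandas as pd

# Hypothetical counts, standing in for the exercise's shoe_counts table.
shoe_counts = pd.DataFrame({
    "shoe_type": ["boots", "boots", "clogs"],
    "shoe_color": ["black", "brown", "black"],
    "id": [1, 2, 2],
})

# Without reset_index(): shoe_type is the index, not a regular column.
pivot = shoe_counts.pivot(index="shoe_type", columns="shoe_color", values="id")
print(pivot.index.name)     # name of the index
print(list(pivot.columns))  # shoe_type is not among the columns

# With reset_index(): shoe_type becomes a regular column and the rows
# get a default numeric index starting at 0.
flat = pivot.reset_index()
print(list(flat.columns))
print(list(flat.index))
```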
To add to this, the numerical index is useful when we select rows by label with .loc, because after reset_index() the labels 0, 1, 2, … line up with the row positions that .iloc and slices like [x:y] use. You don’t want to get rid of it, as it serves a useful purpose.
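A quick sketch of those selection methods, again on made-up counts. Note that .iloc and integer slices select by position, so they also work on the pivot before reset_index(); what the numeric index adds is that .loc labels match those positions.

```python
import pandas as pd

# Hypothetical counts, standing in for the exercise's shoe_counts table.
shoe_counts = pd.DataFrame({
    "shoe_type": ["boots", "boots", "clogs"],
    "shoe_color": ["black", "brown", "black"],
    "id": [1, 2, 2],
})

pivot = shoe_counts.pivot(index="shoe_type", columns="shoe_color", values="id")
flat = pivot.reset_index()

first_row = flat.iloc[0]                   # row at position 0
first_slice = flat[0:1]                    # positional slice: just the first row
second_type = flat.loc[1, "shoe_type"]     # label 1 is also position 1 here

# On the un-reset pivot, .loc needs the shoe type as the label instead.
clogs_row = pivot.loc["clogs"]

print(first_row)
print(first_slice)
print(second_type)
print(clogs_row)
```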