Kindly requesting a walkthrough for the museums and nature centers project in R

Hi, I have been stuck on step 10 for almost a month now and cannot figure out what is wrong with my code. Steps 7-10 are giving me the most difficulty; there is no walkthrough, and I cannot find the answer in the Codecademy forums. I am not sure whether there is a bug in the exercise, but even after resetting it I cannot progress past steps 7-10 because the stacked bar plot does not update.

This is the intermediate R course using ggplot2

Steps 7-10 are:

  7. Our data also contains information on each museum’s region, representing groups of states. Create a stacked bar plot using museums_df showing the count of museums by region (Region.Code..AAM.), mapping Is.Museum to the fill aesthetic. Convert Region.Code..AAM. to a factor (e.g. factor(Region.Code..AAM.)) so ggplot2 plots its levels as discrete rather than continuous values. Call this plot museum_stacked.

  8. Our plot is hard to read – right now, we don’t know what the region numbers correspond to. Use scale_x_discrete() to rename the numeric labels to text according to the following table.
    Similarly, add a scale_fill_discrete() layer to relabel the “TRUE” and “FALSE” labels in our legend to “Museum” and “Non-Museum”.

Based on the plot we created, which region has the most museums?

  9. Rather than seeing counts, perhaps we’re more interested in the percentage of museums vs non-museums by region. Transform the plot we just created to a stacked bar plot showing values out of 100% by passing position = "fill" to our geom_bar() layer. Apply the scales::percent_format() function to transform our y axis labels into percentage values.

How does the distribution of museum types vary by region?

  10. Our graph looks pretty good! However, our axes titles are a little non-descript. Using the labs() layer, let’s title this plot “Museum Types by Region”, relabel the x axis title as “Region”, relabel the y axis title as “Percentage of Total”, and relabel the fill legend title as “Type”.

Now, someone can take a look at this plot and immediately understand what is being described. There were a lot of steps here, but now our plot is clear and professional. Give yourself a pat on the back, and feel free to take a 10 minute coffee break before the next section!

This is the code I have on step 10:

museum_stacked <- ggplot(museums_df,
    aes(x=factor(Region.Code..AAM.), 
    fill=Is.Museum),
    position = "fill") +
  geom_bar( 
  scale_x_discrete(
    labels = c(
      "1" = "New England", 
      "2" = "Mid-Atlantic", 
      "3" = "Southeastern", 
      "4" = "Midwest", 
      "5" = "Mountain Plains", 
      "6" = "Western")) + 
  scale_fill_discrete(
    labels = c(
      "FALSE" = "Non-Museum", 
      "TRUE" = "Museum"))
  scale_y_continuous(
    scales::percent_format()) + 
  labs(title = "Museum Types by Region", x = "Region", y = "Percentage of Total", fill = "Type"))
museum_stacked

this is the output graph:

project:
https://www.codecademy.com/paths/analyze-data-with-r/tracks/data-visualization-in-r-skill-path/modules/intermediate-data-visualization-with-ggplot-2/projects/data-visualization-in-r-museums

(admittedly, I don’t know R)
But, is this a matter of clearing plots so the newest one will show?
(Something like plt.clf() in matplotlib and seaborn)

Maybe @sophsommer3 could better answer your question? She definitely knows R!
:slight_smile:

Hi,

I will definitely reach out for help. Thanks!

My question is not so much about clearing plots as about updating them: for some reason my changes are not reflected in the output.
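
One thing that may be worth ruling out when a plot refuses to update: if a chunk contains a syntax error, R never runs the assignment in it, so the plot object cannot change. Below is a minimal sketch using base R's parse() on a made-up snippet. The names f, g, h and k are invented placeholders, not real ggplot2 functions; the snippet simply puts two calls inside one pair of parentheses with no + or comma joining them.

```r
# Placeholder code: f(), g(), h() and k() are invented names for illustration.
# h(1) and k(2) sit inside g( ... ) with nothing (no + or comma) between them.
snippet <- "p <- f(g(
  h(1)
  k(2)))"

# parse() only checks the syntax; it does not run the code.
check <- tryCatch(
  {
    parse(text = snippet)
    "parses fine"
  },
  error = function(e) conditionMessage(e)
)

# Prints the parser's error message instead of "parses fine".
print(check)
```

If the message points at an unexpected symbol, a missing + between layers or a parenthesis closed in the wrong place is a likely culprit.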

Naw, don’t DM her. By tagging her I meant that she’d reply here. :slight_smile:
Conversations/threads like these benefit the community b/c someone else might have encountered the same issue that you are having with this project.

I should learn R (it’s on my to-do list) so I can help out more people here! :smiley:


Sorry, I just saw this! Here is solution code for that exercise:

# Create and print stacked bar plot
museum_stacked <- 
  ggplot(museums_df, 
    aes(
      x = factor(Region.Code..AAM.), 
      fill = Is.Museum)) + 
  geom_bar(position = "fill") + 
  scale_x_discrete(
    labels = c(
      "1"="New England",
      "2"="Mid-Atlantic", 
      "3"="Southeastern",
      "4"="Midwest", 
      "5"="Mountain Plains", 
      "6"="Western")) + 
  scale_y_continuous(
    labels = scales::percent_format()) +
  scale_fill_discrete(
    labels = c(
      "TRUE" = "Museum", 
      "FALSE" = "Non-Museum")) + 
  labs(
    title = "Museum Types by Region",
    x = "Region",
    y = "Percentage of Total",
    fill = "Type"
  )
  
museum_stacked
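
For anyone comparing this with the code in the original post, the differences appear to be: position = "fill" is an argument of geom_bar() rather than ggplot(); geom_bar( was left open, so the later layers ended up inside its parentheses; there was no + after scale_fill_discrete(); and the formatter has to be passed as labels =, since the first positional argument of scale_y_continuous() is the axis name. Here is a minimal sketch that runs on its own, assuming a tiny invented data frame (toy_df) in place of museums_df. The column names match the thread, but the values and the three-region subset are made up for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for museums_df: same column names, invented values.
toy_df <- data.frame(
  Region.Code..AAM. = c(1, 1, 1, 2, 2, 3),
  Is.Museum = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)
)

museum_stacked <-
  ggplot(toy_df, aes(x = factor(Region.Code..AAM.), fill = Is.Museum)) +
  # position belongs to the geom, and the call is closed before the next layer
  geom_bar(position = "fill") +
  scale_x_discrete(
    labels = c("1" = "New England", "2" = "Mid-Atlantic", "3" = "Southeastern")) +
  # every layer except the last one ends with +
  scale_fill_discrete(
    labels = c("TRUE" = "Museum", "FALSE" = "Non-Museum")) +
  # the formatter is passed by name as labels
  scale_y_continuous(labels = scales::percent_format()) +
  labs(
    title = "Museum Types by Region",
    x = "Region",
    y = "Percentage of Total",
    fill = "Type")

museum_stacked
```

With position = "fill" in place, each region's bar should reach the top of the y axis regardless of how many rows that region has.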

I also just added a full solutions file to the project as solutions.Rmd in the workspace. If you temporarily copy your work somewhere else (e.g., a Google Doc, so that you don’t lose it), then go to “Get Unstuck” and “Reset Exercise”, you should be able to see it! (Then you can copy your work back into the workspace.) Let me know if that works, and apologies that we don’t have a walkthrough video yet!


Thank you so much @sophsommer3! I really appreciate your help! It worked and I was able to move on.
