Hello! I am following the Analyze Data with R skill path and am currently a bit stuck on the coronavirus off-platform project and its extra challenges. I've managed to solve the third extra challenge, where you are supposed to create a side-by-side boxplot, but my code looks a bit ugly and long. Could you please suggest a more concise way of arriving at the same result?
Here is what I managed to do:
First, I selected only the columns I needed:
confirmed_day_by_day <- confirmed %>%
  group_by(`Country/Region`) %>%
  select(-Lat, -Long, -`Province/State`)
Then I transposed the data:
transposed_conf_day_by_day <- confirmed_day_by_day %>%
  t() %>%
  as.data.frame()
# row_to_names() comes from the janitor package
transposed_conf_day_by_day <- transposed_conf_day_by_day %>%
  row_to_names(row_number = 1) %>%
  apply(MARGIN = 2, as.numeric)
Then I combined columns with the same name into one:
transposed_conf_day_by_day <- t(rowsum(t(transposed_conf_day_by_day),
                                       group = colnames(transposed_conf_day_by_day),
                                       na.rm = TRUE))
transposed_conf_day_by_day <- as.data.frame(transposed_conf_day_by_day)
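To show what this step does, here is a minimal sketch on a toy matrix. The numbers and the names `toy` and `united` are made up for illustration and are not from the project data. It only assumes base R:

```r
# Toy matrix (made-up numbers): rows are days, columns are countries.
# "China" appears twice, standing in for two provinces of the same country.
toy <- matrix(
  c(1, 2, 3,      # first "China" column
    10, 20, 30,   # "Italy" column
    5, 6, 7),     # second "China" column
  nrow = 3,
  dimnames = list(NULL, c("China", "Italy", "China"))
)

# rowsum() adds up rows that share a group label, so transpose first
# (countries become rows), sum by country name, then transpose back.
united <- t(rowsum(t(toy), group = colnames(toy), na.rm = TRUE))

print(united)
```

After this, `united` has one column per country, and the two "China" columns are added together day by day.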
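Next, I selected only the values for Italy:
italy_day_by_day_conf <- transposed_conf_day_by_day %>%
  # assumed step: date is a day counter (1, 2, ...), since apply() above drops the row names
  mutate(date = row_number()) %>%
  select(Italy, date) %>%
  rename(confirmed = Italy)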
I did the same with the recovered and deaths tables, then joined all the Italy tables into one:
italy_day_by_day <- italy_day_by_day_conf %>%
  full_join(italy_day_by_day_recov, by = "date") %>%
  full_join(italy_day_by_day_death, by = "date")
Then I reshaped the data into long format and created the graph:
# list the three count columns explicitly so that date is not gathered with them
italy_day_by_day_long <- italy_day_by_day %>%
  gather(event, total, confirmed, recovered, death)
ggplot(italy_day_by_day_long, aes(x = date, y = total, fill = event)) +
  geom_bar(stat = 'identity', position = 'dodge') +
  labs(x = 'Days From January 20, 2020', y = 'Total', title = 'COVID-19 in Italy')
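In case it helps to see the reshaping step in isolation, here is a small sketch on a toy table. The values and the names `toy_wide` and `toy_long` are made up for illustration, and it assumes the tidyr package is installed:

```r
library(tidyr)

# Toy wide table (made-up numbers), one row per day.
toy_wide <- data.frame(
  date = 1:3,
  confirmed = c(10, 20, 30),
  recovered = c(0, 1, 5),
  death = c(0, 0, 2)
)

# Stack the three count columns into event/total pairs; date is kept as is.
toy_long <- gather(toy_wide, event, total, confirmed, recovered, death)

print(toy_long)
```

Each day then appears once per event, which is the shape that `fill = event` with `position = 'dodge'` needs. The tidyr documentation marks `gather()` as superseded by `pivot_longer()`, so that may be one place where the code could be modernized.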
That's it. But it feels like there should be a better way of completing this task. I would be very grateful for your help!
Any feedback is also welcome!
Thanks and have a great day!