[DS] Rollercoaster Project extended

Hi all,

I’ve been working on this Codecademy project (roller coasters) for a while and have been extending it, which is why I’m posting it here.

I haven’t stayed completely true to the original project and mostly went my own way, so I would love some feedback on it, in particular on the two things I’m struggling with so far.

I wanted to plot the rollercoasters featured in all six years of the df in a single plot. I’ve iterated over the years and found which rollercoasters are present in all years. I’m looking for a streamlined way of creating a subsection of the df to just pick these rollercoasters without adding them manually:

df_wood = winners_wood[(winners_wood['Name'] == 'Boulder Dash')|(winners_wood['Name'] == 'El Toro')\
                       |(winners_wood['Name'] == 'Phoenix')|(winners_wood['Name'] == 'Thunderhead')\
                       |(winners_wood['Name'] == 'Ravine Flyer II')|(winners_wood['Name'] == 'Outlaw Run')]

I thought about a lambda function that compares if a rollercoaster is present in all six years of the df, but haven’t figured out anything that works (I usually end up with a series of all rollercoasters).
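One possible way to do this without a lambda, as a minimal sketch: count the distinct years each rollercoaster appears in, keep the names whose count equals the total number of years, and filter with `isin`. The toy frame and the column name `'Year of Rank'` are assumptions standing in for the real `winners_wood`; adjust them to whatever the df actually uses.

```python
import pandas as pd

# Toy stand-in for winners_wood; the 'Year of Rank' column name is an assumption.
winners_wood = pd.DataFrame({
    "Name": ["El Toro", "El Toro", "El Toro",
             "Phoenix", "Phoenix", "Phoenix",
             "Voyage", "Voyage"],
    "Year of Rank": [2013, 2014, 2015,
                     2013, 2014, 2015,
                     2013, 2015],
    "Rank": [2, 2, 1, 3, 4, 2, 1, 3],
})

# Total number of distinct years in the df (six in the real data, three here).
n_years = winners_wood["Year of Rank"].nunique()

# Number of distinct years in which each rollercoaster appears.
years_per_coaster = winners_wood.groupby("Name")["Year of Rank"].nunique()

# Names that appear in every year.
in_all_years = years_per_coaster[years_per_coaster == n_years].index

# Subsection of the df containing only those rollercoasters.
df_wood = winners_wood[winners_wood["Name"].isin(in_all_years)]
```

If two different parks can have a rollercoaster with the same name, grouping by both the name and the park column would probably be safer than grouping by name alone.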

Further down the notebook I’ve selected the rollercoasters of the ‘Efteling’ (a theme park in the Netherlands) and did a few violin plots of the height, speed and length of the rollercoasters in that theme park, as well as of the distribution of all rollercoasters in the csv.

I’ve managed to plot these against each other in a KDE (I’m not sure if I did this correctly), but I would like to create violin plots of these values (height, speed, length) for both global and Efteling in the same plot, split with hue so that both halves share one violin (Efteling on the left, global on the right). I’m struggling to get the x-values right for that one, as I’ve essentially dumped the violin plots the same way as with a KDE.

(as in the total_bill graph split for male/female in the documentation here:
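A rough sketch of how the split violins could be set up, assuming seaborn: reshape the data to long form so that the measure name becomes the x-value and a group label (Efteling or global) becomes the hue. The synthetic data and the column names `park`, `height`, `speed`, `length`, `group`, `measure` and `value` are all assumptions for illustration, not the names in the actual csv.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the rollercoaster csv; all column names are assumptions.
rng = np.random.default_rng(0)
coasters = pd.DataFrame({
    "park": ["Efteling"] * 6 + ["Other"] * 30,
    "height": rng.normal(30, 10, 36),
    "speed": rng.normal(80, 20, 36),
    "length": rng.normal(900, 300, 36),
})

# Label the two groups: Efteling only, and every rollercoaster in the csv.
efteling = coasters[coasters["park"] == "Efteling"].assign(group="Efteling")
everything = coasters.assign(group="Global")

# Long form: one row per (group, measure, value).
long = pd.concat([efteling, everything], ignore_index=True).melt(
    id_vars="group",
    value_vars=["height", "speed", "length"],
    var_name="measure",
    value_name="value",
)

# One violin per measure; split=True puts the two hue levels in the same violin.
ax = sns.violinplot(data=long, x="measure", y="value", hue="group", split=True)
```

Since height, speed and length are in different units, a shared y-axis will likely squash some of the violins; drawing each measure on its own axis, or rescaling the values first, may read better.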

Any feedback is appreciated. Don’t hold back!

Thanks in advance,