I'm pretty lost on this one. I imported the data and now I'm trying to select all the rows where the month is 'Apr', yet it's only giving me 3 rows from April. The dtype is object, but I have no idea why it is doing this. When I try other months, those months don't pop up at all!
Ah, it’s good (IMO) that you switched to using Pandas with this project.
Have you tried
Example: for a project I worked on that looked at HR numbers over a span of years, I wanted to create a separate df from the original df for the rows where the Year column was 1998 (for each of the years, actually), so I did this:
AL_NL_batting98 = AL_NLbatting98_2008.iloc[(AL_NLbatting98_2008['Year'] == 1998).values]
and then I used .describe() on the column I was interested in:
AL_NL_batting98[['HR']].describe()
The pattern is:
new_df = current_df.iloc[(current_df['column_name'] == value_you_are_looking_for).values]
which will tell you how large the new df is.
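For anyone following along, here is a minimal sketch of that filtering pattern on made-up data. The DataFrame `batting` and its numbers are hypothetical stand-ins, not the real batting table. It uses `.loc` with a boolean mask, which is a common alternative to `.iloc[...values]` and should select the same rows, and it checks the size with `.shape`, which is my assumption about one reasonable way to see how large the new df is.

```python
import pandas as pd

# Hypothetical stand-in data; the real batting table has more columns.
batting = pd.DataFrame({
    "Year": [1998, 1998, 1999, 2000],
    "HR": [70, 66, 65, 50],
})

# Boolean mask: True where the Year column equals 1998.
mask = batting["Year"] == 1998

# .loc accepts the boolean Series directly, so .values is not needed.
batting_1998 = batting.loc[mask]

# .shape is (rows, columns), which shows how large the new DataFrame is.
n_rows, n_cols = batting_1998.shape
print(n_rows, n_cols)

# Summary statistics for the column of interest.
print(batting_1998[["HR"]].describe())
```

If the row count from `.shape` is smaller than expected, that usually points at the values in the column not matching exactly, rather than at the filtering syntax.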
Hope that helps.
Also, Go Warriors!
Thank you Lisa! I'm gonna give this one a try tomorrow and hopefully it solves all my problems!!
So I tried your iloc solution and it's giving me the same issue, only returning 2 rows. I'm stumped.
I fixed it!!! So apparently in the Excel file there were empty spaces that weren't showing up while viewing it? Once I deleted those, it seems to work now.
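A rough sketch of how the same whitespace problem could be spotted and fixed in pandas instead of by hand in Excel. The data here is made up to imitate the issue, and the `PTS` column is a hypothetical placeholder; only the `Date` column name comes from this thread.

```python
import pandas as pd

# Hypothetical data imitating an Excel export with stray spaces in the month.
games = pd.DataFrame({
    "Date": ["Apr", "Apr ", " Apr", "May ", "Apr"],
    "PTS": [30, 25, 41, 28, 33],
})

# Before cleaning, only cells that are exactly 'Apr' match.
before = games[games["Date"] == "Apr"]

# Printing the unique values as a list shows the quotes, so hidden spaces stand out.
print(games["Date"].unique().tolist())

# Strip leading and trailing whitespace from every value, then filter again.
games["Date"] = games["Date"].str.strip()
after = games[games["Date"] == "Apr"]

print(len(before), len(after))
```

Stripping in code has the advantage that it still works after the file is downloaded again, whereas edits made by hand in Excel would have to be repeated.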
How many years’ worth of data is this? 2009-2021? The Date column only features the month and not the entire date, correct?
So, I think this is the issue. You need to have a specific date…for each of the games in every month in every season, rather than just the month. How many games did he play in April of each year (minus the time he was out b/c he broke his hand in 2019)?
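To illustrate why full dates help, here is a small sketch that assumes the `Date` column holds complete dates as text. The dates and the `PTS` values are invented for the example and do not come from the real data set. With full dates, pandas can pull out the month and the year separately, so April games can be counted per season.

```python
import pandas as pd

# Hypothetical game log with full dates (made-up values for illustration).
games = pd.DataFrame({
    "Date": ["2018-04-01", "2018-04-03", "2018-10-16", "2019-04-02"],
    "PTS": [30, 25, 41, 28],
})

# Convert the text dates to real datetime values.
games["Date"] = pd.to_datetime(games["Date"])

# Keep only games played in April (month number 4), in any year.
april = games[games["Date"].dt.month == 4]

# Count April games per calendar year.
april_games_per_year = april.groupby(april["Date"].dt.year).size()
print(april_games_per_year)
```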
Where did you get your data from? I usually like to rely on sports reference dot com. You can download their data in Excel format, and you can even remove columns you don't want before you download it. Then you can clean it up in Excel and import the CSV file into either Jupyter Notebook or Google Colab.