https://github.com/marqidox/presentation/blob/main/presentation.ipynb
Check it out!
Cool! Good on you for finding a dataset and doing some EDA!
A histogram might be useful here too, to show the distribution of the prices and user ratings (mean, etc.). You already used .describe().
Also, it’s nice to see Orwell in the top 5. (I think.)
This would be using the statistics from .describe()?
You can create one using matplotlib & numpy:
No no. I meant using the mean that I got from .describe() (among other stats)?
plt.hist does that for you. Check out the link.
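To make the plt.hist suggestion concrete, here is a minimal sketch using matplotlib and numpy. The ratings array is a synthetic stand-in (an assumption, not the real dataset). Note that the histogram is built by binning the raw values directly; it does not read the summary numbers that .describe() printed.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for a ratings column; these values are synthetic.
rng = np.random.default_rng(0)
ratings = rng.normal(loc=4.6, scale=0.2, size=200).clip(3.0, 5.0)

fig, ax = plt.subplots()
# hist() bins the raw values itself and returns the counts and bin edges.
counts, bin_edges, _ = ax.hist(ratings, bins="auto", rwidth=0.9, color="blue")
ax.set_xlabel("User Rating")
ax.set_ylabel("Frequency")
ax.set_title("User Rating Histogram (synthetic data)")
```

Call plt.show() afterwards to display the figure outside a notebook.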
I actually found that DataFrame.hist() works too, from the link you sent! Thanks
import matplotlib.pyplot as plt

# top50Books is the DataFrame already loaded in the notebook.
# Histogram of user ratings
top50Books['User Rating'].hist(
    grid=True,
    bins='auto',
    rwidth=0.9,
    color='blue'
)
plt.xlabel('User Rating')
plt.ylabel('Frequency')
plt.title('User Rating Histogram')
plt.show()

# Histogram of prices
top50Books['Price'].hist(
    grid=True,
    bins='auto',
    rwidth=0.9,
    color='red'
)
plt.xlabel('Price')
plt.ylabel('Frequency')
plt.title('Price Histogram')
plt.show()
Nice!!
There’s always more than one way to accomplish something! (which I think is neat).
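On the earlier question about using the mean from .describe(): one possible way to combine the two is to draw the histogram from the raw values and then mark the .describe() statistics on top of it. This is only a sketch under assumptions: the prices Series below is synthetic, standing in for the real Price column.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the Price column; these values are synthetic.
rng = np.random.default_rng(1)
prices = pd.Series(rng.gamma(shape=2.0, scale=6.0, size=300).round(), name="Price")

# describe() returns count, mean, std, min, 25%, 50%, 75%, max.
stats = prices.describe()

# Series.hist() returns the matplotlib Axes it drew on.
ax = prices.hist(grid=True, bins="auto", rwidth=0.9, color="red")

# Mark the mean and the median (the 50% row) as vertical lines.
ax.axvline(stats["mean"], color="black", linestyle="--",
           label=f"mean = {stats['mean']:.2f}")
ax.axvline(stats["50%"], color="gray", linestyle=":",
           label=f"median = {stats['50%']:.2f}")
ax.set_xlabel("Price")
ax.set_ylabel("Frequency")
ax.set_title("Price Histogram with describe() statistics (synthetic data)")
ax.legend()
```

With the real data, swapping prices for top50Books['Price'] should work the same way, assuming that column is numeric.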
So looking at the two, they are unimodal histograms. What does that tell you about a set of values?
One is right-skewed and one is left-skewed.
https://blog.prepscholar.com/skewed-right
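Skew can also be checked numerically rather than only by eye. A rough sketch, assuming synthetic samples (not the book data): pandas' Series.skew() is positive for a right-skewed sample and negative for a left-skewed one, and the mean gets pulled toward the long tail relative to the median.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic samples, not the book data: lognormal draws have a long right tail.
right_skewed = pd.Series(rng.lognormal(mean=0.0, sigma=0.75, size=1000))
# Mirroring the same values flips the long tail to the left.
left_skewed = 10.0 - right_skewed

for name, s in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    # Positive skew: mean above median. Negative skew: mean below median.
    print(f"{name}: skew={s.skew():.2f}, mean={s.mean():.2f}, median={s.median():.2f}")
```

The same calls, top50Books['Price'].skew() and top50Books['User Rating'].skew(), would give a quick check on which histogram leans which way.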