During one of the lessons I created a pivot table in which the int values became float.
I researched it online for quite some time, and my guess is that one of the values became NaN, which turned them all into float (correct me if I am wrong, please).
So I tried to solve it with fillna, but that didn't change the dtype at all.
Here is the code:
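To illustrate the guess above, here is a minimal sketch on made-up numbers (not the lesson's orders.csv). It assumes the missing value shows up the same way it would in a pivot, i.e. as NaN in an otherwise integer column. NaN is itself a float, so pandas upcasts the column, and fillna only replaces the value without casting back:

```python
import pandas as pd

# Made-up counts, invented for illustration only
counts = pd.Series([3, 5, 2], dtype="int64")
print(counts.dtype)    # int64

# Asking for a label that does not exist introduces a NaN,
# and the whole Series is upcast to float64 to hold it
with_gap = counts.reindex([0, 1, 2, 3])
print(with_gap.dtype)  # float64

# fillna replaces the NaN but leaves the dtype as float64
filled = with_gap.fillna(0)
print(filled.dtype)    # float64

# Only an explicit cast brings the integers back
as_int = filled.astype("int64")
print(as_int.dtype)    # int64
```

So fillna on its own is expected to leave the floats in place; it needs to be followed by an astype.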
shoe_counts = orders.groupby(['shoe_type', 'shoe_color']).id.count().reset_index()
shoe_counts_pivot = shoe_counts.pivot(
I looked online for how to change it back and found "astype(int)", which was shown with plenty of DataFrames but never with a pivot. So I tried putting astype(int) everywhere I thought might make sense, and separately tried shoe_counts_pivot["id"] as well, hoping that Python would remember that the values came from that column (which was just silly, but worth a try haha).
In the end I just changed the type of each column separately, but with many columns this would be pretty painful.
I used this:
shoe_counts_pivot[["brown", "black", "navy", "red", "white"]] = shoe_counts_pivot[["brown", "black", "navy", "red", "white"]].astype(int)
I am pretty sure this can be solved much more easily than the way I did it, and I am wondering if anyone could help me with this.
Also, as a bonus request: if I did the above (the astype(int) with all the colors) as a loop, which is ridiculous I know, but I am curious, how would I do that?
Knowing the pivot table looks like this:
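For the loop variant, a minimal sketch on a made-up pivot that reuses the five color names from the post. The shoe types and all the numbers are invented, and it assumes any NaN has already been filled, since astype to an integer type raises an error on NaN:

```python
import pandas as pd

# Made-up stand-in for shoe_counts_pivot; every value here is invented
shoe_counts_pivot = pd.DataFrame({
    "shoe_type": ["clogs", "sandals"],
    "black": [4.0, 3.0],
    "brown": [2.0, 0.0],
    "navy": [1.0, 5.0],
    "red": [0.0, 1.0],
    "white": [6.0, 2.0],
})

color_columns = ["brown", "black", "navy", "red", "white"]

# Cast one column per iteration; shoe_type is left untouched
for color in color_columns:
    shoe_counts_pivot[color] = shoe_counts_pivot[color].astype("int64")

print(shoe_counts_pivot.dtypes)
```

The loop does the same job as the single line with the list of columns, just one column at a time, so the one-liner is usually the tidier choice.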
I don't think there are any NaNs in this df example. So I would suggest not changing the data type and leaving it as float.
This was my code:
import numpy as np
import pandas as pd
orders = pd.read_csv('orders.csv')
shoe_counts = orders\
shoe_counts_pivot = shoe_counts.pivot(columns='shoe_color', index='shoe_type', values='id').reset_index()
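If integer counts are wanted anyway, one possible approach is to cast before reset_index, while every column of the pivot is still numeric. This is only a sketch: the small shoe_counts frame below is made up because orders.csv is not available here, and filling with 0 assumes that a missing combination simply means no orders of that type and color.

```python
import pandas as pd

# Made-up stand-in for the grouped counts; values are invented
shoe_counts = pd.DataFrame({
    "shoe_type": ["sandals", "sandals", "clogs"],
    "shoe_color": ["black", "red", "black"],
    "id": [3, 1, 4],
})

shoe_counts_pivot = (
    shoe_counts
    .pivot(columns="shoe_color", index="shoe_type", values="id")
    .fillna(0)         # clogs/red does not exist, so it is NaN here
    .astype("int64")   # all columns are counts at this point
    .reset_index()     # shoe_type becomes a regular column afterwards
)

print(shoe_counts_pivot)
print(shoe_counts_pivot.dtypes)
```

Casting before reset_index avoids having to name the color columns, because the text column shoe_type is still the index at that point.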
When working with data, to see the various datatypes in the DF, you would first run df.info(), which returns something along these lines:
RangeIndex: 13487 entries, 0 to 13486
Data columns (total 21 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Year    13487 non-null  int64
 1   League  13487 non-null  object
 2   Name    13487 non-null  object
 3   Age     13465 non-null  float64
 4   Team    13465 non-null  object
 5   G       13487 non-null  int64
 6   PA      13487 non-null  int64
 7   AB      13487 non-null  int64
 8   R       13487 non-null  int64
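In the output above, Age is float64 and also has fewer non-null entries than there are rows. A minimal sketch of the same check on a tiny made-up frame (the column names are borrowed from the output, the values are invented):

```python
import numpy as np
import pandas as pd

# Made-up data; only the column names echo the output above
df = pd.DataFrame({
    "Year": [2001, 2002, 2003],
    "Name": ["a", "b", "c"],
    "Age": [31.0, np.nan, 27.0],
})

df.info()          # full summary: non-null counts and dtypes
print(df.dtypes)   # just the dtype of each column

# Names of the columns that are currently float64
float_cols = df.select_dtypes(include="float64").columns.tolist()
print(float_cols)  # ['Age']

# How many values are missing in each column
print(df.isna().sum())
```

Comparing the non-null count with the number of rows is a quick way to spot which float columns actually contain missing values.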
Good on you for researching how to make data type changes. You’re close, but if you want to change a data type, the syntax is:
df['colname'] = df['colname'].astype(int) (Or whatever type you’re changing it to).
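The same idea extends to several columns at once, since DataFrame.astype also accepts a dict mapping column names to types. A minimal sketch on made-up data (the frame and its values are invented, and it assumes the float columns contain no NaN):

```python
import pandas as pd

# Made-up frame with whole-number floats; values are invented
df = pd.DataFrame({
    "shoe_type": ["clogs", "sandals"],
    "black": [4.0, 3.0],
    "red": [0.0, 1.0],
})

# Option 1: name the columns explicitly
explicit = df.astype({"black": "int64", "red": "int64"})

# Option 2: pick up every float64 column without naming it
float_cols = df.select_dtypes(include="float64").columns
automatic = df.astype({col: "int64" for col in float_cols})

print(explicit.dtypes)
print(automatic.dtypes)
```

Option 2 avoids typing out the column list, at the cost of also converting any float column that was meant to stay float.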
Ah. I guess I missed that.
Regardless… I wouldn't change the floats to int because of Python and math.