Pandas pivot changes int to float, how to set it back to int?

During one of the lessons I created a pivot table in which the int values became float.
I researched this online for quite some time, and my guess is that one of the values became NaN, which turned them all into float (correct me if I am wrong, please).
So I tried to solve it with fillna, but that did not change the dtype.
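
The guess above can be checked with a small sketch. The toy data here is hypothetical (it is not the lesson's orders.csv); it only assumes that one shoe_type/shoe_color combination is missing, so the pivot has to put NaN in that cell.

```python
import pandas as pd

# Hypothetical stand-in for the grouped counts; (wedges, black) is missing on purpose.
counts = pd.DataFrame({
    "shoe_type": ["clogs", "clogs", "wedges"],
    "shoe_color": ["black", "brown", "brown"],
    "id": [4, 2, 3],
})

pivoted = counts.pivot(index="shoe_type", columns="shoe_color", values="id")

print(counts["id"].dtype)   # an integer dtype before pivoting
print(pivoted.dtypes)       # the column holding NaN is float64 after pivoting

# fillna replaces the NaN with 0.0 but keeps the float dtype
filled = pivoted.fillna(0)
print(filled.dtypes)
```

NaN is itself a float value, so an integer column cannot hold it without being upcast, and fillna only swaps the value without converting the column back.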

I paste the code here:

shoe_counts = orders.groupby(['shoe_type', 'shoe_color']).id.count().reset_index()
shoe_counts_pivot = shoe_counts.pivot(
    columns="shoe_color",
    index="shoe_type",
    values="id"
).reset_index().fillna(0)

I searched for how to change it back and found "astype(int)", which was shown with a lot of DataFrames but never with a pivot. So I tried putting astype(int) in every place I thought could make sense, and separately tried it with shoe_counts_pivot["id"] as well, hoping that pandas would remember that the values came from that column (which was just silly but worth a try haha).
In the end I just changed the type of each column separately, but with many columns this would be pretty painful.

I used this:

shoe_counts_pivot[["brown", "black", "navy", "red", "white"]] = shoe_counts_pivot[["brown", "black", "navy", "red", "white"]].astype(int)

I am pretty sure this can be solved much more easily than the way I did it, and I am wondering if anyone could help me with this :sweat_smile:

Also, as a bonus question: if I wanted to do the above (the astype(int) on all the colors) as a loop, which is ridiculous, I know, but I am curious, how would I do that?
Knowing the pivot table looks like this:

I don’t think there are any NaNs in this df example. So, I would suggest not changing the data type and leaving it as float.

This was my code:

import codecademylib
import numpy as np
import pandas as pd

orders = pd.read_csv('orders.csv')

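# Count orders per (shoe_type, shoe_color) pair, as in the original post
shoe_counts = orders\
    .groupby(['shoe_type', 'shoe_color'])\
    .id.count()\
    .reset_index()

# One row per shoe_type, one column per shoe_color
shoe_counts_pivot = shoe_counts.pivot(columns='shoe_color', index='shoe_type', values='id').reset_index()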


When working with data, to see the datatypes of the columns in a DataFrame, you would first run df.info(), which returns something along these lines:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13487 entries, 0 to 13486
Data columns (total 21 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   Year     13487 non-null  int64  
 1   League   13487 non-null  object 
 2   Name     13487 non-null  object 
 3   Age      13465 non-null  float64
 4   Team     13465 non-null  object 
 5   G        13487 non-null  int64  
 6   PA       13487 non-null  int64  
 7   AB       13487 non-null  int64  
 8   R        13487 non-null  int64  

Good on you for researching how to make data type changes. You’re close, but if you want to change a data type, the syntax is:

df['colname'] = df['colname'].astype(int) (Or whatever type you’re changing it to).
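
For several columns at once, the same astype call can be applied without typing every column name. This is a minimal sketch on a hypothetical frame shaped like the pivot after .reset_index().fillna(0); the names float_cols, looped and converted are made up for illustration. It assumes every float column holds whole numbers with no NaN left, because astype(int) raises an error on NaN.

```python
import pandas as pd

# Hypothetical pivot result after .reset_index().fillna(0)
shoe_counts_pivot = pd.DataFrame({
    "shoe_type": ["clogs", "wedges"],
    "black": [4.0, 0.0],
    "brown": [2.0, 3.0],
})

# Pick up the float columns automatically instead of listing the colors by hand
float_cols = shoe_counts_pivot.select_dtypes(include="float").columns

# Option 1: the loop asked about in the original post
looped = shoe_counts_pivot.copy()
for col in float_cols:
    looped[col] = looped[col].astype(int)

# Option 2: no loop; astype also accepts a dict of column name -> dtype
converted = shoe_counts_pivot.astype({col: int for col in float_cols})

print(converted.dtypes)
```

Both options leave the shoe_type column untouched, since it is not a float column.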



Hi there and thank you for your reply!
With all respect, I do want to point out that there is a NaN value. After I ran your code as well, the pivot table is as follows:
As you can see here, wedges-black is a NaN value.
Also, if you print the orders table, you can see that there really are no black wedges in the original data either:
Because I used fillna on my pivot table, mine has 0 at that position, as you saw in my original post. I wanted to achieve that look with less code for converting the floats to int, since I had to write out all my column names to switch them back.
Obviously, having float instead of int is not going to change any calculations, but integers look much nicer, so I wanted to find an easier way than mine to convert the values in this pivot table to int.
And I do really appreciate that you’re trying to help me, so thank you!! :smile:
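
One way to get the 0 and keep the integers with less code is to never create the NaN in the first place. This is a sketch rather than the lesson's intended solution: it swaps pivot for unstack with fill_value=0 on the grouped counts, and the orders data below is a hypothetical stand-in for orders.csv.

```python
import pandas as pd

# Hypothetical stand-in for orders.csv; there are no black wedges in it
orders = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "shoe_type": ["clogs", "clogs", "clogs", "wedges", "wedges"],
    "shoe_color": ["black", "black", "brown", "brown", "brown"],
})

shoe_counts_pivot = (
    orders.groupby(["shoe_type", "shoe_color"])["id"]
    .count()
    .unstack(fill_value=0)   # missing combinations are filled with 0 directly
    .reset_index()
)

print(shoe_counts_pivot)
print(shoe_counts_pivot.dtypes)
```

Because the missing cell is filled with an integer 0 while the table is being reshaped, no NaN ever appears and the count columns stay integer, so no astype call is needed afterwards.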


Ah. I guess I missed that.
Regardless…I wouldn’t change the floats to int b/c of Python and math. :slight_smile:
