Kiva Data with Seaborn Error

Hi, I’m having issues with the Kiva Visualization Project and Seaborn.

The project runs in a Jupyter notebook, and I get a TypeError even when I use the code from the solution file. I have seaborn, Python, and pandas installed through Anaconda3, and I have checked that they are up to date. The error occurs at step 4 and at every subsequent step. The code is the following:

from matplotlib import pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.read_csv('kiva_data.csv')
print(df.head(25))

# Creates the figure, note: you're only using this syntax so that you can modify the y-axis ticks later

f, ax = plt.subplots(figsize=(15, 10))
sns.barplot(data = df, x="country", y="loan_amount")
plt.show()

This is the error output:

TypeError                                 Traceback (most recent call last)
<ipython-input-14-063bdb217cac> in <module>
      1 # Creates the figure, note: you're only using this syntax so that you can modify the y-axis ticks later
      2 f, ax = plt.subplots(figsize=(15, 10))
----> 3 sns.barplot(data = df, x="country", y="loan_amount")
      4 plt.show()

~\anaconda3\lib\site-packages\seaborn\categorical.py in barplot(x, y, hue, data, order, hue_order, estimator, ci, n_boot, units, seed, orient, color, palette, saturation, errcolor, errwidth, capsize, dodge, ax, **kwargs)
   3148                           estimator, ci, n_boot, units, seed,
   3149                           orient, color, palette, saturation,
-> 3150                           errcolor, errwidth, capsize, dodge)
   3151 
   3152     if ax is None:

~\anaconda3\lib\site-packages\seaborn\categorical.py in __init__(self, x, y, hue, data, order, hue_order, estimator, ci, n_boot, units, seed, orient, color, palette, saturation, errcolor, errwidth, capsize, dodge)
   1615                                  order, hue_order, units)
   1616         self.establish_colors(color, palette, saturation)
-> 1617         self.estimate_statistic(estimator, ci, n_boot, seed)
   1618 
   1619         self.dodge = dodge

~\anaconda3\lib\site-packages\seaborn\categorical.py in estimate_statistic(self, estimator, ci, n_boot, seed)
   1517                                           n_boot=n_boot,
   1518                                           units=unit_data,
-> 1519                                           seed=seed)
   1520                         confint.append(utils.ci(boots, ci))
   1521 

~\anaconda3\lib\site-packages\seaborn\algorithms.py in bootstrap(*args, **kwargs)
     83     for i in range(int(n_boot)):
     84         resampler = integers(0, n, n)
---> 85         sample = [a.take(resampler, axis=0) for a in args]
     86         boot_dist.append(f(*sample, **func_kwargs))
     87     return np.array(boot_dist)

~\anaconda3\lib\site-packages\seaborn\algorithms.py in <listcomp>(.0)
     83     for i in range(int(n_boot)):
     84         resampler = integers(0, n, n)
---> 85         sample = [a.take(resampler, axis=0) for a in args]
     86         boot_dist.append(f(*sample, **func_kwargs))
     87     return np.array(boot_dist)

TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'

Hey there @tera5822344034, try restarting your kernel. I’ve run into the same problem, so I simply went ahead with the “Restart & Clear” option in the Kernel dropdown menu. Let me know if that works.

Unfortunately, this hasn’t worked. The error is the same.

I tried updating all of the conda packages and that worked. Thanks for the help.
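Since updating the packages made the error go away, a version mismatch between the installed libraries seems a likely cause, though the thread does not confirm which package was responsible. For anyone comparing environments, here is a minimal sketch for printing the installed versions from inside the notebook. The helper name installed_versions is made up for this example, and importlib.metadata needs Python 3.8 or newer.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("numpy", "pandas", "seaborn", "matplotlib")):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None.
    (Hypothetical helper, written for illustration only.)
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print one line per package so the versions can be pasted into a forum post
for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Posting this output along with the traceback makes it easier for others to tell whether they are on the same versions.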

Hi,

I have a question related to this exercise’s question “Which country has the least disparity in loan amounts awarded by gender?”

Trying to compute this, I somehow get stuck with the following code:

df2 = df.groupby(['gender', 'country'])['loan_amount'].mean()

Which produces

gender  country    
female  Cambodia       562.903226
        El Salvador    567.454380
        Kenya          345.028323
        Pakistan       427.987875
        Philippines    327.199599
male    Cambodia       747.000000
        El Salvador    605.599711
        Kenya          497.681966
        Pakistan       642.129630
        Philippines    377.699663

How can I now get the difference between the resulting values and store them in a variable called ‘gender_difference’?

The resulting dataframe should look like:

country gender_difference
Cambodia -182
El Salvador -39

and so on, ideally sorted.

I somehow can’t find the answer here. I tried to transform the dataframe as described in this Stack Overflow question, but I couldn’t get it running.
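One possible approach, sketched here rather than taken from the solution file: unstack the gender level so each country has a female and a male column, then subtract. To keep the example self-contained, df2 is rebuilt by hand from the means posted above; in the notebook it would come straight from the groupby call. The intermediate name by_country is an assumption for this example.

```python
import pandas as pd

# Group means copied from the output posted above. In the notebook, df2 is
# df.groupby(['gender', 'country'])['loan_amount'].mean()
index = pd.MultiIndex.from_product(
    [
        ["female", "male"],
        ["Cambodia", "El Salvador", "Kenya", "Pakistan", "Philippines"],
    ],
    names=["gender", "country"],
)
df2 = pd.Series(
    [562.903226, 567.454380, 345.028323, 427.987875, 327.199599,
     747.000000, 605.599711, 497.681966, 642.129630, 377.699663],
    index=index,
    name="loan_amount",
)

# Move the gender level into columns: one row per country,
# one column each for 'female' and 'male'
by_country = df2.unstack("gender")

# Female mean minus male mean, so a negative value means men received more
gender_difference = (
    (by_country["female"] - by_country["male"])
    .rename("gender_difference")
    .reset_index()
)

# Order rows by the size of the gap (absolute value), smallest first
order = gender_difference["gender_difference"].abs().sort_values().index
gender_difference = gender_difference.loc[order].reset_index(drop=True)

print(gender_difference)
```

Sorting by absolute value puts the country with the least disparity in the first row. Sorting on the raw difference instead would order the countries from most negative to least negative.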

@code8952164074 thanks, it works for me.