Python - Unable to add column

Please help me out with this!

I’m using the U.S. Medical Insurance dataset in a Jupyter Notebook. I couldn’t add a column that averages out 2 columns (please see the last line of code here).

Can someone point out where I went wrong? :woozy_face:
All comments are appreciated!

Thank you

@raysonng, I see what you were trying to do here, but you ran into the exact reason why using indexing in Pandas (square bracket access) is generally preferred over attribute (dot) access.

You are getting this error because you are trying to divide by df_totalcharges.count.
What you think you are doing is dividing by the df_totalcharges column named count.
However, Pandas interprets this as you trying to divide by the un-invoked pandas.DataFrame.count method. Hence the error saying you can’t divide a float by a method.

So, to get what you want, you should use this line:

df_totalcharges['avg_charges'] = df_totalcharges['charges'] / df_totalcharges['count']

Moving forward, remember that dot access in Pandas is a shortcut — but shortcuts aren’t always the best choice for every situation.
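To see the difference concretely, here is a minimal sketch. The small DataFrame is a made-up stand-in for df_totalcharges (the real data isn't shown in this thread), so treat the values as assumptions:

```python
import pandas as pd

# Hypothetical stand-in for df_totalcharges; the values are invented.
df_totalcharges = pd.DataFrame(
    {"charges": [100.0, 300.0], "count": [2, 3]},
    index=["northeast", "southwest"],
)

# Dot access finds the DataFrame.count method before the 'count' column.
dot_is_method = callable(df_totalcharges.count)

# Dividing by the un-invoked method fails with a TypeError.
try:
    df_totalcharges["charges"] / df_totalcharges.count
    division_failed = False
except TypeError:
    division_failed = True

# Bracket access always refers to the column, so this works.
df_totalcharges["avg_charges"] = (
    df_totalcharges["charges"] / df_totalcharges["count"]
)
print(df_totalcharges)
```

The same clash can happen with any column whose name matches a DataFrame attribute or method (sum, mean, index, and so on), which is why bracket access is the safer habit.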


Works like magic. Thank you!

I’m still new to Python and have been struggling with when to use brackets, square brackets, and dot access.

If I may ask a follow-up question, I tried to plot a bar chart:

I figured Pandas is not recognizing my “region” column, so I tried to rename the column, but it’s not working. I could use the “region” column from another DataFrame, but I would like to understand what mistake I made that resulted in this error.

Thank you in advance.

Pandas doesn’t recognize a region column in df_totalcharges, because no such column exists.

When you created df_totalcharges using this code…

totalcharges = data.groupby('region').charges.sum()
df_totalcharges = pd.DataFrame(totalcharges)

… you made the region column your index. By default, groupby() uses the grouping key as the index of its result, and pd.DataFrame() keeps that index. So you no longer have a region column; the region values are now your index.
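You can check this yourself by inspecting the columns and the index. This is a rough sketch using a tiny invented sample in place of the real insurance data:

```python
import pandas as pd

# Hypothetical sample standing in for the insurance dataset.
data = pd.DataFrame(
    {
        "region": ["northeast", "southwest", "northeast"],
        "charges": [100.0, 200.0, 50.0],
    }
)

totalcharges = data.groupby("region").charges.sum()
df_totalcharges = pd.DataFrame(totalcharges)

# Only 'charges' is a column; 'region' has become the index name.
columns = list(df_totalcharges.columns)
index_name = df_totalcharges.index.name
has_region_column = "region" in df_totalcharges.columns

print(columns, index_name, has_region_column)
```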

So, there are two ways you can go about making your bar chart: you can use the DataFrame’s index as your x values, or you can go back and use groupby() in a way that doesn’t make your index the region values. The pandas documentation for groupby() describes how to do that.
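To make the two options concrete, here is a minimal sketch (not taken from the documentation). The sample data is invented, and the variable names are only for illustration:

```python
import pandas as pd

# Hypothetical sample standing in for the insurance dataset.
data = pd.DataFrame(
    {
        "region": ["northeast", "southwest", "northeast"],
        "charges": [100.0, 200.0, 50.0],
    }
)

# Option 1: keep region as the index and use the index as the x values.
by_index = data.groupby("region")["charges"].sum()
x_from_index = list(by_index.index)
heights_from_index = list(by_index.values)

# Option 2: as_index=False keeps region as an ordinary column.
by_column = data.groupby("region", as_index=False)["charges"].sum()
x_from_column = list(by_column["region"])
heights_from_column = list(by_column["charges"])

print(x_from_index, heights_from_index)
print(x_from_column, heights_from_column)
```

Either pair of x values and heights can then be handed to a plotting call, for example matplotlib's plt.bar(x, heights).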


Hi @el_cocodrilo, I managed to solve the task by adding as_index=False when I use groupby.

Thank you for your enlightenment! Appreciate it!
