After filtering the data, I did two pretty simple tasks: averaging costs for smokers vs. non-smokers, and grouping average costs by age group. Please give examples of other tasks that I could do with this data.
Well, what do you think are some other good things to compare or look at in your analysis?
- What columns/variables are in the data?
- Rather than look at avg. or mean cost, it might be better to look at median b/c there are outliers in the data (if you plot the charges, you will see that, or if you use `.min()` and `.max()` on the `charges` col.)
- Is this the version of the project using the `csv` library or `pandas`?
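To make the median point concrete, here is a rough sketch using only the standard library. The rows are made-up sample values, not the real data, and the `smoker` column name is an assumption (only `charges` is mentioned above), so adjust the names to match your file.

```python
import csv
import io
from statistics import mean, median

# Made-up rows for illustration only; the real file's columns and values will differ.
SAMPLE_CSV = """age,smoker,charges
19,no,1000
27,no,2000
35,no,3000
44,yes,4000
61,yes,50000
"""


def summarize(values):
    """Return min, max, mean and median for a list of numbers."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
charges = [float(row["charges"]) for row in rows]

# One large value (50000) pulls the mean far above the median.
overall = summarize(charges)

# Group charges by the assumed "smoker" column, then take the median per group.
by_smoker = {}
for row in rows:
    by_smoker.setdefault(row["smoker"], []).append(float(row["charges"]))
median_by_smoker = {group: median(vals) for group, vals in by_smoker.items()}

print(overall)
print(median_by_smoker)
```

If you are on the `pandas` version instead, the same idea would be something like `df["charges"].median()` and `df.groupby("smoker")["charges"].median()`, again assuming those column names.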