Hey people!
I need your feedback on my project. I used the pandas library here too.
It took me maybe 7-8 hours in total, working on it 2-3 hours a day. It made my head hurt at first, but in the end it was very satisfying.
Here it is
Congrats on finishing the project.
A few things: if you’re importing pandas, you might as well use it. You can create a dataframe with:
import pandas as pd  # skip if you already have this import
df = pd.read_csv("filename.csv")
Some other useful methods:
df.info()
# to get the data types of the cols.
df.head()
# first 5 rows of df
df.shape
# (number of rows, number of cols) as a tuple
df.columns
# col. names
df.isnull().sum()
# number of nulls in each col
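If it helps to see a couple of these on something small, here is a rough sketch on a made-up table. The column names mirror the insurance data, but the values (including the missing age) are invented for illustration and are not from your file:

```python
import pandas as pd

# Made-up rows for illustration only; not the real insurance data.
df = pd.DataFrame({
    "age": [19, 33, 45, None],
    "sex": ["female", "male", "female", "male"],
    "charges": [1500.0, 4200.5, 8100.0, 3900.25],
})

rows, cols = df.shape            # shape is (rows, columns)
null_counts = df.isnull().sum()  # missing values per column

print(rows, cols)
print(null_counts)
```

Checking the null counts early is handy because missing values change what the stats methods report.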
Descriptive stats:
df.describe()
# count, mean, std, min, quartiles, max for the numeric cols; pass include='all' to also get unique/top/freq for the text cols
df['column_name'].describe()
df['sex'].value_counts()
# will give you the number of rows for each distinct value (here female/male)
insurance[['age', 'sex', 'charges']].groupby('sex').mean()
# the mean (or use .median()) of age and charges, grouped by female/male; insurance here is the dataframe (df above)
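To show what those two calls give back, here is a small sketch. The four rows are invented so the numbers are easy to check by hand; only the column names are taken from the insurance data:

```python
import pandas as pd

# Invented values, only to show the shape of the results.
insurance = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "sex": ["female", "male", "female", "male"],
    "charges": [1000.0, 2000.0, 3000.0, 4000.0],
})

# Number of rows for each distinct value in the column.
counts = insurance["sex"].value_counts()

# Mean of the numeric columns, one row per group.
means = insurance[["age", "sex", "charges"]].groupby("sex").mean()

print(counts)
print(means)
```

The groupby result is itself a dataframe indexed by sex, so a single number can be pulled out with something like means.loc["female", "charges"].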
To break out by region, filter with a boolean mask and .loc (.iloc is meant for integer positions):
southwest = insurance.loc[insurance['region'] == 'southwest']
etc…etc.
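If you want every region without writing one line per region, one option is groupby. This is just a sketch on invented rows; the region and charges column names come from the insurance data, while by_region is a name made up for the example:

```python
import pandas as pd

# Invented rows for illustration only.
insurance = pd.DataFrame({
    "region": ["southwest", "northeast", "southwest", "southeast"],
    "charges": [1000.0, 2000.0, 3000.0, 4000.0],
})

# One sub-dataframe per region, keyed by the region name.
by_region = {name: group for name, group in insurance.groupby("region")}

southwest = by_region["southwest"]
print(southwest["charges"].mean())
```

Each value in the dictionary is a regular dataframe, so all the methods above work on it.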
See the Pandas documentation if you ever get stuck.
Thank you, I will definitely use pandas again! Thanks again for the tips.