I'm currently working on the machine learning capstone project and I have a big question: how does one define the features and labels of a dataset? For example, logistic regression requires features and labels, but in a given dataset, what is a feature and what is a label?
I have the same question: when working with a dataset (like the Titanic one, from one of the projects), how do we choose which features to use in order to get better results?
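Features are also known as predictors or independent variables. They are used as inputs to make a prediction. Labels are the true values of the outputs you are trying to predict. In the Titanic dataset, for example, the label is whether a passenger survived, and the features are columns such as age, sex, and ticket class.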
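To make the split concrete, here is a minimal sketch in plain Python. The rows and column names (`age`, `fare`, `is_female`, `survived`) are invented for illustration and only loosely modeled on the Titanic data; the point is just that you pick one column as the label and treat the rest as features.

```python
# Hypothetical rows, values invented for illustration (not the real dataset).
rows = [
    {"age": 22, "fare": 7.25, "is_female": 0, "survived": 0},
    {"age": 38, "fare": 71.28, "is_female": 1, "survived": 1},
    {"age": 26, "fare": 7.93, "is_female": 1, "survived": 1},
    {"age": 35, "fare": 8.05, "is_female": 0, "survived": 0},
]

# The label is the column you want to predict; everything else is a candidate feature.
label_column = "survived"
feature_columns = [name for name in rows[0] if name != label_column]

# X holds one list of feature values per row; y holds the matching labels.
X = [[row[name] for name in feature_columns] for row in rows]
y = [row[label_column] for row in rows]

print(feature_columns)
print(X[0], y[0])
```

Which column counts as the label depends on the question you are asking, so the same dataset could be split differently for a different prediction task.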
You can try plotting the different features against the labels to see if there are any patterns. You can also simply try adding different features to the model and seeing whether the accuracy improves.
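As a rough sketch of the "try different features and compare accuracy" idea, the snippet below fits a small logistic regression by gradient descent on a few candidate feature sets. Everything in it is an assumption for illustration: the data is invented, the helper names (`train`, `accuracy`, `evaluate`) are hypothetical, and it scores on the training rows only to stay short. In a real project you would more likely use a library model and measure accuracy on held-out data.

```python
import math

# Hypothetical, invented rows: (age scaled to 0..1, is_female, survived).
data = [
    (0.22, 0, 0), (0.38, 1, 1), (0.26, 1, 1), (0.35, 0, 0),
    (0.54, 0, 0), (0.27, 1, 1), (0.14, 1, 1), (0.20, 0, 0),
]
FEATURE_INDEX = {"age": 0, "is_female": 1}
LABEL_INDEX = 2


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, lr=0.5, epochs=500):
    """Fit logistic regression weights with plain batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, y):
    """Fraction of rows where the thresholded prediction matches the label."""
    correct = 0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        correct += int((1 if p >= 0.5 else 0) == yi)
    return correct / len(y)


def evaluate(feature_names):
    """Train on the chosen feature columns and return training accuracy."""
    cols = [FEATURE_INDEX[name] for name in feature_names]
    X = [[row[c] for c in cols] for row in data]
    y = [row[LABEL_INDEX] for row in data]
    w, b = train(X, y)
    return accuracy(w, b, X, y)  # training accuracy only; use held-out data in practice


candidate_sets = (["age"], ["is_female"], ["age", "is_female"])
results = {tuple(names): evaluate(names) for names in candidate_sets}
for names, acc in results.items():
    print(names, round(acc, 2))
```

If a feature set does not improve accuracy on data the model has not seen, that is a hint the extra features are not adding useful signal.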