I just completed my final portfolio project for the Data Science Path in Machine Learning and would appreciate any feedback.
I found this project very challenging, yet rich in learning experiences. I could use most of the knowledge and skills acquired through the course in a single project, which was great.
Based on some initial ideas, I decided to explore one regression problem and one classification problem, which led me to use a good variety of models and analyses: Simple and Multiple Regressors, PCA, Decision Trees, Random Forests and SVMs.
I got very interesting results. Although the scores were not great, the learning was excellent, and the results show me how diverse people really are in their behavior and habits.
Great effort and well done on trying different algorithms.
I like how you tried to address multiple problems from the dataset.
Just one minor point: it doesn’t look like you tested whether the models are overfitting. This might be one of the reasons they didn’t perform better than they did.
Otherwise, well done!
Thanks for your feedback on my work!
You are right and thanks for pointing that out! Although I did check for overfitting in the regression models, I forgot to do it for the classification ones (maybe I assumed they had some sort of embedded regularization).
I have now computed train and test scores for both the Random Forest Classifier and the Support Vector Classifier.
The Random Forest did show a slight difference between train and test scores (0.72 and 0.69 respectively), which indicates some overfitting, but I don't think this difference is large enough to justify hyperparameter tuning for regularization within the scope of this work.
With SVC the scores were very similar, indicating no overfitting.
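For anyone who wants to run the same kind of check, here is a minimal sketch of a train/test score comparison. It is not the project's actual code: the helper names `train_test_gap` and `compare_models`, the synthetic data from `make_classification`, and the default model settings are all assumptions made for illustration.

```python
from typing import Any, NamedTuple


class FitReport(NamedTuple):
    train_score: float
    test_score: float
    gap: float  # train minus test; a large positive gap hints at overfitting


def train_test_gap(model: Any, X_train, y_train, X_test, y_test) -> FitReport:
    """Fit `model` on the training split and score it on both splits.

    Works with any estimator exposing fit(X, y) and score(X, y),
    which scikit-learn classifiers and regressors do.
    """
    model.fit(X_train, y_train)
    train = model.score(X_train, y_train)
    test = model.score(X_test, y_test)
    return FitReport(train, test, train - test)


def compare_models() -> None:
    """Hypothetical demo on synthetic data (not the project's dataset)."""
    # scikit-learn is imported here so the helper above stays usable without it.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0
    )

    models = [
        ("Random Forest", RandomForestClassifier(random_state=0)),
        ("SVC", SVC()),
    ]
    for name, model in models:
        report = train_test_gap(model, X_train, y_train, X_test, y_test)
        print(
            f"{name}: train={report.train_score:.2f} "
            f"test={report.test_score:.2f} gap={report.gap:.2f}"
        )
```

Calling `compare_models()` prints one line per model. How large a gap counts as a problem is a judgment call that depends on the dataset and the goal, so the sketch only reports the numbers and leaves the threshold to the reader.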
Still, you raised a very important point that I had completely forgotten.
Thanks again for your review!