I have a question about the last steps of the PCA project in the “Data Science: Machine Learning Specialist” path.
In steps 13 and 14, the exercise asks us to use the 2 PCA components as input to a support vector classifier, find the score, and compare it with the score obtained using 2 random features from the original standardized data matrix in the same classifier. It suggests to “Notice the large difference in scores”, but the large difference is not in the PCA components' favor at all:
Score for model with 2 PCA features: 0.8649036163772503
Score for model with 2 randomly selected features: 0.9993627529074398
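For context, the comparison in those two steps amounts to roughly the following. This is only a sketch: the data is a synthetic stand-in from `make_classification` (not the project's dataset), and `SVC` with default settings, the split parameters, and all variable names are my assumptions, so the scores it prints will not match the ones above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the project's data (assumption, not the real dataset).
X, y = make_classification(
    n_samples=1000, n_features=16, n_informative=5, random_state=0
)

# Standardize the features, as the project does before PCA.
X_std = StandardScaler().fit_transform(X)


def svc_score(features, labels):
    """Fit a support vector classifier on a train split and score it on the test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.33, random_state=42
    )
    clf = SVC()  # default settings; the project may use a different classifier setup
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)


# Model 1: the first 2 principal components of the standardized matrix.
X_pca = PCA(n_components=2).fit_transform(X_std)
score_pca = svc_score(X_pca, y)

# Model 2: 2 randomly chosen columns of the standardized matrix.
rng = np.random.default_rng(0)
cols = rng.choice(X_std.shape[1], size=2, replace=False)
score_random = svc_score(X_std[:, cols], y)

print(f"Score for model with 2 PCA features: {score_pca}")
print(f"Score for model with 2 randomly selected features: {score_random}")
```

Note that PCA never sees the labels `y` here: it picks the directions of largest variance, which are not necessarily the directions that separate the classes.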
What are these scores supposed to mean? That in this case it is better to use a support vector classifier with any two random features of the original matrix, without doing PCA?
I can’t understand this output in a project that aims to show the advantages of using PCA.
Am I missing something?
Thanks for your help.