Project date-a-scientist (OKCupid)

Hi, I would really appreciate your suggestions on how to improve my project. Thanks in advance.

Hello! Nice job on completing this project. I'll get right into my thoughts on your work :slight_smile:

First, I would recommend spending more time explaining your goals, the data, and the findings in the introduction, body, and conclusion. For example: What specifically are your goals (which ML models will you use, and what labels do you hope to predict)? What does the data show (how large is the dataframe, how many null or duplicate values are there, and what did you discover from exploratory data analysis)? What is the scope of the project (which visualizations did you create, what do they represent, and why did you choose those forms of visualization)?
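
In case it helps, here is a minimal sketch of how the size, null counts, and duplicate count could be gathered for the write-up. The `summarize` helper and the tiny `sample` dataframe are made up for illustration, and the column names are assumptions rather than anything taken from your notebook.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Collect the basic facts an introduction might report about a dataset."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        # Number of missing values in each column.
        "nulls_per_column": df.isnull().sum().to_dict(),
        # Rows that exactly repeat an earlier row.
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Hypothetical stand-in data; the real project would load its own file instead.
sample = pd.DataFrame({
    "age": [25, 31, 31, None],
    "height": [68.0, 70.0, 70.0, 64.0],
    "drinks": ["socially", "often", "often", None],
})

summary = summarize(sample)
print(summary)
```

A sentence or two built from numbers like these, placed in the introduction, would answer most of the questions above.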

I thought you were very thorough during the EDA, especially with the visualizations. On your height histogram, consider limiting the x-axis to heights between 50 and 90 inches so we get a closer look at the spread. On the other EDA graphs, I would recommend sorting the bars in descending order so it is immediately clear which category is the most or least common.
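
Both plotting suggestions could look roughly like the sketch below. It assumes matplotlib and pandas, and the `profiles` dataframe with its `height` and `drinks` columns is invented sample data, so the names would need adjusting to match your own.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data, including two implausible heights (3 and 95 inches).
profiles = pd.DataFrame({
    "height": [3.0, 62.0, 65.0, 68.0, 70.0, 72.0, 75.0, 95.0],
    "drinks": ["socially", "often", "socially", "rarely",
               "socially", "often", "socially", "rarely"],
})

fig, (ax_hist, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# range=(50, 90) bins only heights from 50 to 90 inches; values outside are ignored.
bin_counts, bin_edges, _ = ax_hist.hist(profiles["height"], bins=20, range=(50, 90))
ax_hist.set_xlabel("height (inches)")
ax_hist.set_ylabel("profiles")

# value_counts() returns categories sorted by count, largest first.
counts = profiles["drinks"].value_counts()
ax_bar.bar(counts.index, counts.values)
ax_bar.set_ylabel("profiles")

fig.tight_layout()
```

Passing `range` to `hist` keeps the outliers in the data but out of the picture, which is usually enough for a closer look at the main spread.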

Nice job with the various ML models. I was eager to see how the decision tree played out for you! I used the Naive Bayes classifier, but was curious how a decision tree would perform. Could you tell me how you decided to limit max_depth to 30? I will need to come back to this project in the future and give the decision tree model a shot on this data.
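
I don't know how you arrived at 30, so this is not a description of your method. It is just one common way to pick a depth: cross-validate a handful of candidate values and keep the best-scoring one. The sketch uses scikit-learn's `GridSearchCV` on synthetic data from `make_classification`, and the candidate depths are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; the real project would use its own features and labels.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Arbitrary candidate depths, chosen only for illustration.
candidate_depths = [2, 5, 10, 20, 30]

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": candidate_depths},
    cv=5,                # 5-fold cross-validation
    scoring="accuracy",
)
search.fit(X, y)

best_depth = search.best_params_["max_depth"]
print("best max_depth:", best_depth)
print("mean CV accuracy:", search.best_score_)
```

Comparing the mean scores in `search.cv_results_` across depths would also show whether the deeper trees are actually helping or just overfitting.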

Great job! I enjoyed viewing your project :slight_smile:

Thanks so much for your feedback. It would be a pleasure to receive your feedback on my work from time to time. Really nice analysis.