ML/AI Engineer Portfolio Project Feedback

Hello everyone. This portfolio project was challenging and took me exactly 2 weeks. I focused on implementing a full machine learning workflow on this dataset, tried to use everything Codecademy taught me, and wrote up a detailed analysis. I would be very happy if you could review it and share your criticism and suggestions. Thanks in advance. Portfolio Project

:grinning: Happy coding!!!

@lisalisaj Hey, sorry to bother you, but I was wondering if you could leave feedback on my project. Your comments are very important to me.

Congrats on finishing this project.

Where did the data come from? By that, I mean the original source? (I didn’t see one on Kaggle). Was it survey data of college-age students in India (since it seems all the cities are located there)?

  • I think it should be noted somewhere (starting with the title) that the data comes from (presumably) randomly sampled student populations in different states & territories of India, so this predictive model can only be applied to that particular population and cannot be extrapolated to student populations as a whole (in different countries).

  • The notebook is clear: it is laid out well and interspersed with comments, so people can easily follow your thought process & analysis.

  • The section where you look at a possible correlation between thoughts & depression: this bit is confusing: “The results show that individuals with s. thoughts have significantly different (and probably higher) depression scores. A p-value of 0.0 indicates that this relationship is highly statistically significant. In other words, there is clearly no causal relationship between the two conditions.”
    If the result is statistically significant, then there is a correlation there. The test alone just can’t tell you whether one condition causes the other.

  • Minor: When you reference dietary habits as “healthy” v. “unhealthy”, what do those terms mean? They are subjective, unless those were the options in the original questionnaire(?). Admittedly, the potential links between diet & depression are interesting (to me).

  • I like that you included this part in your conclusion:
    “One of the most important lessons that can be drawn from this analysis is that machine learning models are only tools and it is essential to use these tools in the right context. While the project demonstrated how data science approaches can be effective in understanding a complex issue such as depression, it was also a reminder that such models need to be combined with human expertise. While the accuracy of the model is impressive, it should be noted that further testing and refinement is required to translate this success into real-world applications.”
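
On the p-value point above, here is a minimal sketch of how an association between two yes/no variables can be tested. It is not the test used in the notebook: the choice of a chi-square test of independence, the `chi_square_2x2` helper, and the counts are all made up for illustration. It only uses the Python standard library.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Hypothetical helper for illustration (no continuity correction).
    Returns the chi-square statistic and its p-value (1 degree of freedom).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if the variables were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # For 1 degree of freedom, P(X >= chi2) equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts, NOT taken from the dataset.
# Rows: reported thoughts (yes, no); columns: depression label (yes, no)
counts = [[80, 20], [30, 70]]
chi2, p_value = chi_square_2x2(counts)

# Scientific notation shows the actual magnitude instead of a rounded 0.0
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

A p-value that prints as 0.0 is usually just a very small number that has been rounded or has underflowed, so printing it in scientific notation is more informative. A small p-value is evidence that the two variables are associated in this sample; it does not show that one causes the other, and it does not show that there is no relationship.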

Good work.

(And thank you for asking for my opinion. :slight_smile: )

First of all, thank you for the detailed review :slightly_smiling_face:

  • The description of this dataset does not say exactly where it was collected, but I think it was probably collected from students and people of various professions in India.

  • Yes, I should have mentioned that this model can only be applied to students from India. Thank you very much for pointing this out.

  • I was very careful to add comments, especially under the code cells, because the code can sometimes be really confusing.

  • Yes, you are right about that part: “In other words, there is clearly no causal relationship between the two conditions.”
    I probably wrote this by mistake and will correct it immediately. It should read as follows: “In other words, there is clearly a causal relationship between the two conditions.”

df["Dietary Habits"].value_counts().to_frame()

Dietary Habits    count
Unhealthy         10282
Moderate           9890
Healthy            7625
Others               12

  • I used those labels because they are the original categories in the dataset, but do you think using such subjective terms is wrong? (I am asking to learn.)

  • I’m glad you liked that section of the conclusion.

I will immediately make the changes you mentioned and correct the mistakes I made.
Thank you again. Have a good day :smile:

It is subjective, but that’s how it was in the data, and that’s not something you can control. Maybe “unhealthy” means a lot of fried food, or highly processed foods, or something like that? (I’m not sure.)
If you had designed the questionnaire, then you could maybe break it down into different types of diets: vegetarian, vegan, omnivore, pescatarian, etc. That would be a more precise definition of dietary habits.

You’re welcome! :slight_smile:

In the meantime, I would like to ask you something else. I have finished the ML/AI Engineer career path, and I wonder which course, skill path, or career path would be right for me now. I want to be a machine learning engineer.

Did you take a Python course? Or a statistics course? If not, I suggest studying those two areas as well.

Thanks, I’ll definitely look into these two areas, especially statistics.
