About the Portfolio Project: Data Scientist Final category

The Data Scientist Final Project

Welcome to the subcategory for the Data Scientist Final Project. This portfolio project can be found in the following courses or paths:

  • Data Scientist Career Path

How to Get Feedback on your Project

Congratulations on finishing your portfolio project! Now you’ll want to follow these steps to get feedback on it.

  1. Post a link to your Git repository.
  2. Give us a few sentences about your experience. Was this fun? Difficult? How long did it take?
  3. Check back in—if someone has replied to your post, come see what they have to say.

How to Give Feedback on Another Learner’s Project

Reviewing someone else’s code isn’t just a nice thing to do; it’s also a great opportunity to sharpen your skills by viewing a different perspective.

  1. Refer to the article in your Career Path on How to Review Someone Else’s Code
  2. Click through topics in this subcategory to view other submissions of this project.
  3. Reply to a thread with feedback or encouragement, or let them know if they did something in a way you hadn’t thought of before!

https://github.com/cletusDnorton/Codecademy_Pima_Indian_final

This is my data science final project. I used data from UCI Machine Learning on Kaggle to try to predict diabetes in Pima Indians. I found this challenging, but fun. Please let me know if you have any comments.

Thanks,
Cletus

Here’s my Data Science Final Project!

I used data from brasil.io, which builds databases on Brazil’s national issues. I analyzed how deputies spend money and request reimbursement.
It was super fun doing this project. The CSV file is huge, and it was my first time dealing with this much data. It ended up being quite time-consuming and not very predictive: I used two models and both performed quite poorly (53% accuracy vs. 43% accuracy), but the RandomForestClassifier performed better than a DNN! I don’t know why. I’ll keep investigating, but I thought that sharing it publicly now would be better. Hope my project is stimulating to everyone!

Please check my Data Scientist Final Portfolio Project:

It took 3 days to complete. Working on this project was fun but sometimes challenging. I am happy with the results because, in my opinion, I have built a model that gives pretty accurate predictions.

Please feel free to give me any criticism or comments.

My project on Fantasy Premier League was fun and challenging to finish.
My objective in this project was to get stats about the players and choose the squad for the next game week.
Kindly provide your valuable feedback.

Here is the GitHub repository for my project, on Taiwanese credit card data.

This is from the ReadMe:

My First Independent Project Using Data Downloaded from Kaggle

This is my first independent project. It is probably a bit on the long side, but I wanted to practice doing different things with Python. A summary slideshow can be found here.

This project uses credit card data for 30,000 anonymous credit card clients in Taiwan. It presents information on the customers (education, sex, marital status, age) and their behavior for a six-month period from April 2005 to September 2005. The information was downloaded in CSV format from Kaggle, and according to the information provided it was uploaded from the UC Irvine Machine Learning Repository.

The project has two parts. The first involves training and evaluating different machine learning models. The second runs statistical tests and a few bootstrapping and permutation simulations on the data.
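For readers who have not met bootstrapping or permutation simulations before, here is a minimal sketch of both using only the Python standard library. It is not taken from the repository: the function names `permutation_test` and `bootstrap_ci` and the two small groups of numbers are made up for illustration, and the project itself may well use different statistics or libraries.

```python
import random


def mean_of(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference and an estimated p-value.
    """
    rng = random.Random(seed)
    observed = mean_of(group_a) - mean_of(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the labels by shuffling the pooled values, then re-split.
        rng.shuffle(pooled)
        diff = mean_of(pooled[:n_a]) - mean_of(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # The add-one correction keeps the estimate strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


def bootstrap_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean_of(rng.choices(sample, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical values, invented for this example only.
group_a = [12.1, 15.3, 9.8, 14.2, 13.7, 16.0, 11.5, 12.9]
group_b = [10.4, 9.9, 11.2, 8.7, 12.0, 10.1, 9.5, 11.8]

observed_diff, p_example = permutation_test(group_a, group_b)
ci_low, ci_high = bootstrap_ci(group_a)
print(f"observed difference in means: {observed_diff:.2f}, p-value: {p_example:.4f}")
print(f"95% bootstrap CI for the mean of group_a: ({ci_low:.2f}, {ci_high:.2f})")
```

Fixing the seed makes the simulation repeatable, which helps when reviewers try to reproduce the numbers in a notebook.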

I’d really appreciate getting feedback and criticism from others.

Best,
Derek Baker

Hello all,

Below is a link to my GitHub repository for my final project, an analysis of NFL Draft data from 1990 to 2022. Any feedback is welcome and greatly appreciated. Blessings to all.

Hello guys!

I hope you’re doing well. I downloaded a cancer dataset from Kaggle and did a project on predicting the type of cancer. It took me a week to do the project and it was an interesting experience. I got the best result from the AdaBoost classifier. Since the data was slightly imbalanced, I used the SMOTE technique to balance the dataset, but it didn’t show a marked improvement. I also used PCA to see whether performance would improve. I summarized the results of all my models in tables and compared them using four metrics (Accuracy, Recall, Precision and F1-score).
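As a rough illustration of the four metrics named above, the sketch below computes them from true and predicted labels in plain Python. It assumes a binary problem with `1` as the positive class; the helper name `classification_metrics` and the ten labels are invented for the example and do not come from the project, which may use a library implementation and more than two classes.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical, slightly imbalanced labels: 3 positives and 7 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On imbalanced data, accuracy can look healthy while recall on the minority class lags, which is one reason to compare all four numbers side by side.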

I would be so grateful and happy if you gave me your feedback.
I uploaded my code to GitHub:

My code