Hello, I’m looking for feedback on this portfolio project. I created some functions inside the class and used some that I had built previously. My idea was to create a class that allows for further statistical research, as I don’t feel ready to interpret data analysis yet. I hope to come back to this project in the future and make some interesting changes that I have in mind now but can’t quite execute with what I know so far.
It’s good that you’re thinking ahead. The project appears at multiple points in the course, so you will be returning to it with new knowledge. Refresh my memory: at this point in the course, you haven’t been introduced to Pandas or NumPy (or SciPy?) yet, right? But you have been introduced to Matplotlib? I ask b/c the SciPy library has a stats module which you can import for statistical analysis, and Pandas has summary statistics built in. Just something to keep in mind. Good idea on creating that class for further analysis though!
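For when you get to those libraries, here is a minimal sketch of what the built-in summaries look like. The `charges` values are made up for illustration and are not from your dataset; the sketch assumes Pandas and SciPy are installed.

```python
import pandas as pd
from scipy import stats

# Hypothetical charges, made up for illustration (not from the project's data).
charges = [100.0, 200.0, 300.0, 400.0, 500.0]

# Pandas: describe() reports count, mean, std, min, quartiles, and max.
pandas_summary = pd.Series(charges).describe()
print(pandas_summary)

# SciPy: stats.describe() reports nobs, minmax, mean, variance, skewness, kurtosis.
scipy_summary = stats.describe(charges)
print(scipy_summary.mean, scipy_summary.variance)
```

Both calls use the sample (n - 1) form for standard deviation and variance by default, which is worth checking against whatever your own class computes.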
Some thoughts:
Remember, this is just a random sample of data and you haven’t done any statistical testing on it, so saying things like ‘Can you afford to make these decisions ?! Let’s find out !’ is a little too early in the process. Just something to keep in mind when writing the intro to your notebook.
One thing to keep in mind is who the audience is that you are presenting this info to. If they’re not technically savvy or don’t have a stats background, those concepts might go a little over their heads and need more of an explanation (as to why you want to know the variance or the coefficient of variation, etc.). That’s (a longwinded way of) me saying: know how to present technical information to a non-technical audience. You can do that with a little bit of an explanation…
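As a rough illustration of pairing a statistic with a plain-language explanation, here is a minimal sketch using only Python’s standard library. The `charges` values and the `coefficient_of_variation` helper are hypothetical, made up for this example rather than taken from your class.

```python
import statistics

# Hypothetical insurance charges, made up for illustration.
charges = [1200.0, 1500.0, 1800.0, 2100.0, 2400.0]


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined when the mean is 0")
    return statistics.stdev(values) / mean


mean_charge = statistics.mean(charges)
variance = statistics.variance(charges)  # sample variance (divides by n - 1)
cv = coefficient_of_variation(charges)

# Translate the number into a sentence a non-technical reader can follow.
summary = (
    f"On average, charges are about {mean_charge:.0f}. "
    f"The spread around that average is roughly {cv:.0%} of the average itself."
)
print(summary)
```

The point is the last few lines: the coefficient of variation expresses spread relative to the mean, so it can be stated as a percentage that readers without a stats background can picture.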
It’s a good idea to add a little text (in cells or # comments) before the code cells, b/c you’re telling a story about the data and want ppl who aren’t familiar with it to be able to easily follow along with your thought process.
I might disagree with the assertion that having children doesn’t raise one’s insurance rates. The data is out there that says it has gone up since at least 1999 (by 131%!) and, “The average premium for family coverage has increased 20% over the last five years and 43% over the last ten years.”
See: https://www.kff.org/health-costs/report/2022-employer-health-benefits-survey/
There are also the costs of childbirth: women who have children pay more than women who do not.
I think it’s also always good practice to do a cursory review of data and publications out there in addition to doing analysis on whatever data set one is working on.
Good work on completing the project! Happy coding!
Thank you very much for taking the time to give me this great feedback. I have yet to learn Pandas and NumPy through Codecademy’s course, but I have been introduced to both before. I did one version of this project using Pandas, but after reading what the project was requesting, I thought I was supposed to practice creating classes, writing functions and handling files, so I spent some time creating this class. As for how I presented the data, I’m definitely not proud of how it came out. I haven’t learned how to make sense of and portray data correctly yet, and it’s one of my main problems. I will keep learning and fixing the project as I go. It will be good to show myself where I was when I started and how much I have improved, which helps keep me motivated. Thank you again for all the points you have made; I will use them as directions for future updates.
I think you did a great job. And I think you’re right; at this point you’re supposed to do the project by creating classes, which you did very well! You’re absolutely right: keep going back to the data and improving the analysis as you learn more. Once you learn Seaborn (or Plotly) there’s lots to be done with visualization as well.