I was just reviewing some older stuff (or rather, checking out the new Data Science paths, which I think came out really cool!) and stumbled upon the Page Visits Funnel project. I realized that the numbers don’t add up for me, checked out Matt’s solution and saw that his numbers don’t add up either.
Matt starts out with 2052 people visiting and 400 people using the cart in Task 4/5. By Task 6 the number of people using the cart grows to 602, and by the end of the exercise the total number of visitors reaches a mind-boggling 2594. But none of that seems to bother Matt in any way, as he keeps on explaining and calculating confidently?!
Obviously there are duplicate user_ids in the data sets which, if this is on purpose, you should definitely address and explain. For my own set of data, the checkout.csv has more rows than the cart.csv, which is nonsensical to begin with.
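To illustrate what I mean, here is a minimal sketch with made-up toy data (not the project's CSV files; the frames and their values are hypothetical, only the `user_id` column name comes from the project). It shows how a left merge on `user_id` ends up with more rows than there are visitors as soon as a `user_id` repeats in the right-hand table. I can't say for sure this is what happens in the sample solution, but it would be consistent with the totals growing from task to task.

```python
import pandas as pd

# Made-up toy data for illustration only, NOT the project's files.
visits = pd.DataFrame({"user_id": ["a", "b", "c"]})
cart = pd.DataFrame({
    "user_id": ["a", "a", "b"],              # user "a" appears twice
    "cart_time": ["10:00", "10:05", "11:00"],
})

# A left merge keeps every visitor, but emits one row per matching cart row.
merged = visits.merge(cart, on="user_id", how="left")

n_visits = len(visits)                                  # rows before the merge
n_merged = len(merged)                                  # rows after the merge
n_dupes = int(cart["user_id"].duplicated().sum())       # repeated ids in cart

print(n_visits, n_merged, n_dupes)
```

If the row count after the merge is larger than the row count of the left table, every percentage computed from `len(merged)` is calculated against an inflated base.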
Now I see that the issue with the numbers not adding up has been raised in the forums before (example from 2019, from 2020, and 2021), but this was a really long time ago (more than one and a half years?) and nothing seems to have changed since then. Also no one seems to have pointed out that the sample solution is wrong in itself.
I am not asking for a workaround here (nunique() is not the answer) but to explain to me, please, how the example solution is not staggeringly wrong and, if it really is (and again, maybe I am wrong here, as I am just learning), why such things are not being corrected? The issue has been raised long ago.
Hi there! Thank you for your reply. Yes, I actually did the quality report thing. I guess I have to elaborate a little further though: 1) the video in itself is wrong! I am not saying that the numbers just differ from my numbers, I am saying the numbers in the solution don’t match up, and it is funny to watch an experienced developer calculating something that sums up to nonsense and getting it presented as a sample solution. This is not anything I would normally expect from Codecademy. 2) If checkout has more rows than cart, it does not mean that people abandoned their items, it actually means that some people proceeded to the checkout without using their cart in the first place. Maybe this is something you can actually do at Amazon lately (Buy it now!), but I am quite sure it was not what the people behind Cool T-Shirts Inc. intended when writing this project.
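For anyone who wants to check point 2 on their own copy of the data, here is a rough sketch of one way to do it. The two small frames are invented stand-ins for cart.csv and checkout.csv (I am assuming both have a `user_id` column); the idea is to look for users who show up in checkout but never in cart, using the `indicator` option of `merge`.

```python
import pandas as pd

# Invented stand-ins for cart.csv and checkout.csv, for illustration only.
cart = pd.DataFrame({"user_id": ["a", "b"]})
checkout = pd.DataFrame({"user_id": ["a", "b", "c", "c"]})

# De-duplicate cart first so the merge itself cannot add rows,
# then flag each checkout row as "both" or "left_only".
check = checkout.merge(
    cart.drop_duplicates("user_id"),
    on="user_id",
    how="left",
    indicator=True,
)

# Checkout rows whose user never appears in cart.
no_cart = check.loc[check["_merge"] == "left_only", "user_id"]
n_checkout_without_cart = no_cart.nunique()

print(len(cart), len(checkout), n_checkout_without_cart)
```

If that last number is zero on the real files, the extra checkout rows would have to come from repeated `user_id`s alone; if it is not, some users reach checkout without a cart entry at all.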
The project is circa 2017 (that’s when I originally did it. Now, my code won’t run b/c I get a module error b/c the code behind it is so old/outdated). I’m thinking there have been inconsistent updates to it along the way.
Who knows. Just proves that there are people behind the scenes who are human and therefore, fallible. We are all human, right?
Regardless, there’s probably a lengthy bug report queue and depending on a number of things during each phase of development (agile methodologies) things are prioritized…so it would seem that this isn’t a high priority project(?). Or, maybe no one from CC reads the forums(?).
Hmm… I certainly don’t blame anyone for making mistakes, please don’t get me wrong on this one. I’m just wondering what conclusions to draw from the fact that such a mistake can go unnoticed for several years. Also I would assume that the newly revamped and extended pro(!) path can’t be all that low on CC’s priority list…
Again, maybe it is me who got something wrong and I am just being dumb, actually this is why I turned to the forum in the first place. I hoped someone could either confirm that both the project as well as the walk-through are flawed (and have been flawed since at least 2018, as that’s when the video was posted to YouTube) or point out to me where I am mistaken.
I don’t know what’s on their project maps or what their priorities are.
I can say from my own personal experience working at a tech corp (and working on bugs) that bug queues can get lengthy and get ignored (unless it’s a code red/huge/priority 1 issue) until someone has the time to investigate and work on them and then roll out the corrections to production.
Maybe the more people report an issue, the more attention it will get and the sooner it will be corrected.
Hey, I have a question and I can’t find the answer - how can it be that purchase_time in purchase has 252 non-null entries and the checkout_time in checkout has only 226 non-null entries?
What I don’t understand is how more people bought a t-shirt than went to checkout?
Also, they seem to have changed the files in the zip file lately (jupyter) and the video seems to be gone altogether.