Page Visits Funnel

In the project walkthrough video, the merged dataframe between visits and cart has 2052 rows, but when I print the merged dataframe it has 2000 rows. What’s happening?

https://www.codecademy.com/paths/data-science/tracks/dscp-data-manipulation-with-pandas/modules/dscp-multiple-tables-in-pandas/projects/multi-tables-proj

Seems like this one has come up before.

But I used the same syntax as the guy in the video. And if 2052 and 2000 both come up, which is the correct number to use?

Can you provide your code? If you used the same code as the video it should come out to 2052 rows. It still does for me, so I don’t think the data has changed since the video was made.

import codecademylib
import pandas as pd
# -- inspecting the dataframes --
visits = pd.read_csv('visits.csv',
                     parse_dates=[1])
#print(visits) 
cart = pd.read_csv('cart.csv',
                   parse_dates=[1])
#print(cart)
checkout = pd.read_csv('checkout.csv',
                       parse_dates=[1])
#print(checkout)
purchase = pd.read_csv('purchase.csv',
                       parse_dates=[1])
#print(purchase)
# ----------------------------------------------------
def mergeCal(df,df2):
  merged = pd.merge(df,df2,how='left') #left merge on df and df2
  info_col = merged.columns[-1] # the _time column
  null_rows = merged[merged[info_col].isnull()] # null rows 
  percentage = float(len(null_rows))/len(merged)*100 # dividing number of null rows by number of rows in merged df 
  return merged,percentage,len(null_rows)

visitsCart,visitsCart_per,visitsCart_null = mergeCal(visits,cart)

Well, that is certainly not the same code as the video, but your DataFrame called visitsCart still comes out to 2052 rows for me. Perhaps you are looking at a printout for the visits df?

Ah sorry, I meant the same methods. The code differs slightly, of course, because I had to wrap them in a function.

No, I just printed it using

print(visitsCart.shape)
print(len(visitsCart))

Both came out to 2000 rows.

Well this is what I’m seeing:

Either you’re doing something different or they changed the data for newer users (or maybe with the revamp of the DS path) and let users who previously completed it keep the old data.

The only way to know for sure would be to download the csv files and compare, but I’m leaning toward it being something on your end.
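For anyone who wants to do that comparison themselves, here is a minimal sketch. It assumes the join key is a column named `user_id` (the thread never names it), and `summarize` is a hypothetical helper, not something from the project. The two small DataFrames are made-up stand-ins for two downloaded copies of cart.csv.

```python
import pandas as pd


def summarize(df: pd.DataFrame, key: str = "user_id") -> dict:
    """Return the row count, distinct keys, and rows that repeat an earlier key."""
    return {
        "rows": len(df),
        "unique_keys": int(df[key].nunique()),
        "duplicate_key_rows": int(df[key].duplicated().sum()),
    }


# Made-up stand-ins for two copies of cart.csv (not the real project data).
cart_a = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "cart_time": pd.to_datetime(["2017-01-01", "2017-01-02", "2017-01-03"]),
})
cart_b = pd.DataFrame({
    "user_id": ["u1", "u2", "u2", "u3"],
    "cart_time": pd.to_datetime(
        ["2017-01-01", "2017-01-02", "2017-01-02", "2017-01-03"]
    ),
})

print(summarize(cart_a))
print(summarize(cart_b))
```

With the real files you would replace the stand-ins with `pd.read_csv(...)` on each downloaded copy. If the row counts differ, or one copy has repeated keys, the merged DataFrames will not be the same length either.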

Can we try this? Because I didn’t change anything, even resetting the project and re-running my code. I still got 2000 rows.

Click the folder in the top left of the code section:

Then open up visits.csv and cart.csv like this:

Then click the share button:

Share the link here and I’ll take a look when I have a minute.

visits: https://gist.github.com/a698653c546afaa2de3304372cccfdc1

cart: https://gist.github.com/920d6011f108c0d23eb8ad1020d65f7f

@h1lo,

Well, it looks like we’ve solved the mystery. Your cart.csv only has 349 rows, whereas mine has 401.
Looks like at some point they updated the data. Whether this was intentional or not is anyone’s guess.

Here’s a REPL if you wanna check out the difference:
https://repl.it/@elCocodrilo/PageVisitsFunnelTest#main.py
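As a rough illustration of why a different cart.csv changes the merged row count: a left merge keeps every row of the left table, but a left row that matches several right rows is repeated once per match. The 52 extra cart rows (401 vs. 349) line up with the 52 extra merged rows (2052 vs. 2000), which is consistent with the older file containing repeated keys, though that is an inference and not something checked here. The data below is made up, and `user_id`, `visit_time`, and `cart_time` are assumed column names.

```python
import pandas as pd

# Made-up data: three visitors; user "a" appears twice in cart.
visits = pd.DataFrame({
    "user_id": ["a", "b", "c"],
    "visit_time": pd.to_datetime(["2017-01-01", "2017-01-02", "2017-01-03"]),
})
cart = pd.DataFrame({
    "user_id": ["a", "a", "c"],
    "cart_time": pd.to_datetime(["2017-01-01", "2017-01-01", "2017-01-03"]),
})

# Left merge on the shared column: "a" matches two cart rows, so it is duplicated.
merged = pd.merge(visits, cart, how="left")
print(len(visits), len(merged))

# Dropping repeated keys from cart first gives one merged row per visit.
cart_dedup = cart.drop_duplicates(subset="user_id")
merged_dedup = pd.merge(visits, cart_dedup, how="left")
print(len(merged_dedup))

# The number of visitors who never added to cart is the same either way.
print(merged["cart_time"].isnull().sum(), merged_dedup["cart_time"].isnull().sum())
```

The count of null rows does not change, but the percentage does, because the denominator `len(merged)` is larger when keys repeat. That is why 2000 and 2052 rows lead to slightly different funnel percentages.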

Thank you so much! I’m not sure which version I should use, so I’ll just stick with the new version for now.