Marketing Attribution

I have a slight misunderstanding with the Marketing Attribution CoolTshirts project.
In task number 6, where we count how many last touches on the purchase page is each campaign responsible for, the solution provided by both the hint and the walkthrough puts the required WHERE clause to the first subquery counting last touches, code looking like this:

with last_touch as (
select user_id, max(timestamp) as ‘ts’
from page_visits
where page_name = ‘4 - purchase’
group by 1
),
lt_utm as (
select lt.user_id, lt.ts, pv.utm_campaign, pv.utm_source
from last_touch lt
join page_visits pv
on lt.user_id = pv.user_id
and lt.ts = pv.timestamp
)
select utm_campaign, utm_source, count(*)
from lt_utm
group by 1, 2
order by 3 desc
;

My solution was a little bit more cumbersome and apparently has a flaw, which I can’t figure out. I add the page_name column to the join subquery and the neccessary WHERE clause to the end of the query, code looks like this:

with last_touch as (
select user_id, max(timestamp) as ‘ts’
from page_visits
group by 1
),
lt_utm as (
select lt.user_id, lt.ts, pv.utm_campaign, pv.utm_source, pv.page_name
from last_touch lt
join page_visits pv
on lt.user_id = pv.user_id
and lt.ts = pv.timestamp
)
select utm_campaign, utm_source, count(*)
from lt_utm
where page_name = ‘4 - purchase’
group by 1, 2
order by 3 desc
;

Result is nearly the same, difference is only one user in each of the top three campaigns missing in my solution. I understand that the provided solution is the correct one, but I don’t understand why.

Can anyone explain, where does my solution lose these three users?
Thanks.

2 Likes

I believe that since there are users that after purchasing an item viewed the website again thier max(timestamp) is no longer the purchase page_name. And since the where statement is processed before the group by statement that sale is still counted, since they filter on only looking at page name = ‘4 - purchase’

2 Likes

yep that is the case:

WITH buyers AS
(SELECT user_id as buyer, page_name as buyerpage
FROM page_visits
WHERE page_name = ‘4 - purchase’),

last_activity AS
(SELECT user_id, max(timestamp), page_name as last_activity
FROM page_visits
GROUP BY 1),

aggregation AS
(SELECT *
FROM last_activity
LEFT JOIN buyers
ON
buyers.buyer = last_activity.user_id)

SELECT user_id, last_activity, buyerpage
FROM aggregation
WHERE last_activity != ‘4 - purchase’ AND buyerpage = ‘4 - purchase’;

1 Like