Marketing Attribution: WHERE Clause

LESSON: https://www.codecademy.com/paths/data-science/tracks/analyze-data-sql-analyze-real-data/modules/analyze-data-sql-marketing-attribution/projects/analyze-data-sql-attribution-queries

Why do I get different answers depending on where I put the WHERE clause in this exercise? The solution says to put it at the top in the first temporary table but I first thought to put it at the bottom and get a slightly different answer. Would love to know why. Thanks!

WHERE at the end of the query:

WITH last_touch AS (
  SELECT user_id, 
         MAX(timestamp) AS 'last_touch_at'
  FROM page_visits
  GROUP BY user_id
),
lt_attr as (
  SELECt lt.user_id,
         lt.last_touch_at,
         pv.utm_campaign,
         pv.utm_source,
         pv.page_name
  FROM last_touch AS 'lt'
  JOIN page_visits AS 'pv' 
  ON lt.user_id = pv.user_id
  AND lt.last_touch_at = pv.timestamp
)
SELECT lt_attr.utm_campaign,
       lt_attr.utm_source, 
       COUNT(DISTINCT lt_attr.user_id)
FROM lt_attr
WHERE page_name = '4 - purchase'
GROUP BY 1, 2
ORDER BY 3 DESC;

WHERE at the top of the query:

WITH last_touch AS (
  SELECT user_id, 
         MAX(timestamp) AS 'last_touch_at'
  FROM page_visits
  WHERE page_name = '4 - purchase'
  GROUP BY user_id
),
lt_attr as (
  SELECt lt.user_id,
         lt.last_touch_at,
         pv.utm_campaign,
         pv.utm_source,
         pv.page_name
  FROM last_touch AS 'lt'
  JOIN page_visits AS 'pv' 
  ON lt.user_id = pv.user_id
  AND lt.last_touch_at = pv.timestamp
)
SELECT lt_attr.utm_campaign,
       lt_attr.utm_source, 
       COUNT(*)
FROM lt_attr
GROUP BY 1, 2
ORDER BY 3 DESC;

@kathrou77,

Welcome to the forums!

I actually wrote out a very detailed answer to this exact question a few months ago, which you can find here:

The post goes over a shorter answer as well as a deep-dive into the “why.” I think I covered everything in that post, but if you have any further questions, feel free to reply and ask them!

Happy coding!

1 Like