Questions about my Marketing Attribution Project - CoolTShirts

Hello fellow learners!

I’m on the Data Science career path, and just completed the SQL queries for the Marketing Attribution project. My responses are written under the related code and results. I have so many questions… would love some feedback!

I took a long time to get through some of these questions, especially #3. I don’t understand why the second “with” statement is needed instead of adding columns to the first statement. I reviewed the lesson to no avail, and ended up doing it a different way. Also, once the table is grouped by utm_campaign, don’t the user_id and first_touch_at columns no longer apply? If this table were presented to anyone other than the coder, wouldn’t it confuse them that only four user ids are being linked with aggregated data?

Regarding question #4, I left out the second “with” statement due to the same confusion as with question #3. Also, I’m struggling with understanding the difference between “last touches” and visits to the purchase page. I know that not every user’s last touch will be at the purchase page, but won’t every visit to the purchase page be a last touch for a distinct user? My results show a 3-user discrepancy in the COUNT column, and I don’t know exactly why. Although for the purpose of this project, it doesn’t change the conclusions drawn.

Thank you for any imput!

SELECT COUNT(DISTINCT utm_campaign) AS 'Number of Campaigns'
FROM page_visits;

SELECT COUNT(DISTINCT utm_source) AS 'Number of Sources'
FROM page_visits;

SELECT DISTINCT utm_campaign,
  utm_source
FROM page_visits
ORDER BY utm_source;

Number of Campaigns
8

Number of Sources
6

utm_campaign utm_source
ten-crazy-cool-tshirts-facts buzzfeed
weekly-newsletter email
retargetting-campaign email
retargetting-ad facebook
paid-search google
cool-tshirts-search google
interview-with-cool-tshirts-founder medium
getting-to-know-cool-tshirts nytimes

– 1. There are 8 campaigns and 6 sources for CoolTShirts’ marketing.

SELECT DISTINCT page_name
FROM page_visits;

page_name
1 - landing_page
2 - shopping_cart
3 - checkout
4 - purchase

– 2. Customers will go through 4 pages at CoolTShirts: (1) landing page, (2) shopping cart, (3) checkout, and (4) purchase.

WITH first_touch AS (
    SELECT user_id,
        MIN(timestamp) as first_touch_at
    FROM page_visits
    GROUP BY user_id)
SELECT ft.user_id,
    ft.first_touch_at,
    pv.utm_source,
    pv.utm_campaign,
    COUNT(utm_campaign)
FROM first_touch ft
JOIN page_visits pv
    ON ft.user_id = pv.user_id
    AND ft.first_touch_at = pv.timestamp
GROUP BY utm_campaign
ORDER BY 5 DESC;
user_id first_touch_at utm_source utm_campaign COUNT(utm_campaign)
99990 2018-01-13 23:30:09 medium interview-with-cool-tshirts-founder 622
99933 2018-01-25 00:04:39 nytimes getting-to-know-cool-tshirts 612
99765 2018-01-04 05:59:46 buzzfeed ten-crazy-cool-tshirts-facts 576
99684 2018-01-13 13:20:49 google cool-tshirts-search 169

– 3. The most first touches came from the interview-with-cool-tshirts-founder campaign, at 622 first touches. This was followed closely by the getting-to-know-cool-tshirts campaign and the ten-crazy-cool-tshirts-facts campaign. It appears that longer-form first exposures garnered more actionable interest, as twice as many first touches came from longer articles than from campaigns that required less time or attention.

WITH last_touch AS (
    SELECT user_id,
        MAX(timestamp) as last_touch_at
    FROM page_visits
    GROUP BY user_id)
SELECT lt.user_id,
    lt.last_touch_at,
    pv.utm_source,
    pv.utm_campaign,
    pv.page_name,
    COUNT(utm_campaign)
FROM last_touch lt
JOIN page_visits pv
    ON lt.user_id = pv.user_id
    AND lt.last_touch_at = pv.timestamp
GROUP BY 4, 5
ORDER BY 6 DESC;
user_id last_touch_at utm_source utm_campaign page_name COUNT(utm_campaign)
99843 2018-01-23 05:23:08 email weekly-newsletter 3 - checkout 333
99928 2018-01-24 05:26:09 facebook retargetting-ad 3 - checkout 331
99990 2018-01-16 11:35:09 email retargetting-campaign 3 - checkout 192
99589 2018-01-15 04:55:43 nytimes getting-to-know-cool-tshirts 2 - shopping_cart 153
99769 2018-01-25 21:06:56 medium interview-with-cool-tshirts-founder 2 - shopping_cart 132
98840 2018-01-10 04:58:48 google paid-search 3 - checkout 126
99765 2018-01-04 05:59:47 buzzfeed ten-crazy-cool-tshirts-facts 2 - shopping_cart 126
99933 2018-01-26 06:18:39 email weekly-newsletter 4 - purchase 114
99897 2018-01-06 09:41:19 facebook retargetting-ad 4 - purchase 112
99285 2018-01-24 09:00:58 email retargetting-campaign 4 - purchase 53
94567 2018-01-19 16:37:58 google paid-search 4 - purchase 52
99344 2018-01-18 21:36:32 google cool-tshirts-search 2 - shopping_cart 46
95496 2018-01-27 09:42:08 nytimes getting-to-know-cool-tshirts 1 - landing_page 38
98408 2018-01-03 10:01:14 nytimes getting-to-know-cool-tshirts 3 - checkout 32
98361 2018-01-22 20:26:39 buzzfeed ten-crazy-cool-tshirts-facts 1 - landing_page 32
99838 2018-01-02 07:40:34 medium interview-with-cool-tshirts-founder 3 - checkout 24
99520 2018-01-03 03:16:37 buzzfeed ten-crazy-cool-tshirts-facts 3 - checkout 23
94168 2018-01-22 22:07:47 medium interview-with-cool-tshirts-founder 1 - landing_page 21
92172 2018-01-16 15:15:29 nytimes getting-to-know-cool-tshirts 4 - purchase 9
98651 2018-01-15 04:17:36 buzzfeed ten-crazy-cool-tshirts-facts 4 - purchase 9
96308 2018-01-15 13:04:56 google cool-tshirts-search 1 - landing_page 7
83547 2018-01-10 18:20:21 medium interview-with-cool-tshirts-founder 4 - purchase 7
61012 2018-01-16 06:49:07 google cool-tshirts-search 3 - checkout 5
95650 2018-01-18 00:25:00 google cool-tshirts-search 4 - purchase 2

– 4. When observing last touches, we see that there are twice as many campaigns engaging users by the end of their journey. The campaigns leading to the most last touches are weekly newsletter and retargetting ad.

SELECT COUNT (DISTINCT user_id)
FROM page_visits
WHERE page_name = '4 - purchase';

SELECT utm_source,
    utm_campaign,
    page_name,
    COUNT(DISTINCT user_id)
FROM page_visits
WHERE page_name = '4 - purchase'
GROUP BY utm_campaign
ORDER BY 4 DESC;

COUNT (DISTINCT user_id)
361

utm_source utm_campaign page_name COUNT(DISTINCT user_id)
email weekly-newsletter 4 - purchase 115
facebook retargetting-ad 4 - purchase 113
email retargetting-campaign 4 - purchase 54
google paid-search 4 - purchase 52
nytimes getting-to-know-cool-tshirts 4 - purchase 9
buzzfeed ten-crazy-cool-tshirts-facts 4 - purchase 9
medium interview-with-cool-tshirts-founder 4 - purchase 7
google cool-tshirts-search 4 - purchase 2

– 5. Of all the visitors to the CoolTShirts website, 361 make a purchase. This final step can be attributed mostly to the weekly newsletter campaign and retargetting ad campaign.

WITH last_touch AS (
    SELECT user_id,
        MAX(timestamp) as last_touch_at
    FROM page_visits
    GROUP BY user_id)
SELECT lt.user_id,
    lt.last_touch_at,
    pv.utm_source,
    pv.utm_campaign,
    pv.page_name,
    COUNT(lt.last_touch_at)
FROM last_touch lt
JOIN page_visits pv
    ON lt.user_id = pv.user_id
    AND lt.last_touch_at = pv.timestamp
WHERE pv.page_name = '4 - purchase'
GROUP BY 4
ORDER BY 6 DESC;
user_id last_touch_at utm_source utm_campaign page_name COUNT(lt.last_touch_at)
99933 2018-01-26 06:18:39 email weekly-newsletter 4 - purchase 114
99897 2018-01-06 09:41:19 facebook retargetting-ad 4 - purchase 112
99285 2018-01-24 09:00:58 email retargetting-campaign 4 - purchase 53
94567 2018-01-19 16:37:58 google paid-search 4 - purchase 52
92172 2018-01-16 15:15:29 nytimes getting-to-know-cool-tshirts 4 - purchase 9
98651 2018-01-15 04:17:36 buzzfeed ten-crazy-cool-tshirts-facts 4 - purchase 9
83547 2018-01-10 18:20:21 medium interview-with-cool-tshirts-founder 4 - purchase 7
95650 2018-01-18 00:25:00 google cool-tshirts-search 4 - purchase 2

– 6. Similar to the previous inquiry, the weekly newsletter campaign and retargeting ad campaign are responsible for significantly more last touches on the purchase page than all other campaigns combined.

SELECT utm_campaign,
  page_name,
  COUNT (DISTINCT user_id)
FROM page_visits
GROUP BY 2, 1
ORDER BY 2, 3 DESC;
utm_campaign page_name COUNT (DISTINCT user_id)
interview-with-cool-tshirts-founder 1 - landing_page 625
getting-to-know-cool-tshirts 1 - landing_page 616
ten-crazy-cool-tshirts-facts 1 - landing_page 582
cool-tshirts-search 1 - landing_page 171
getting-to-know-cool-tshirts 2 - shopping_cart 680
ten-crazy-cool-tshirts-facts 2 - shopping_cart 566
interview-with-cool-tshirts-founder 2 - shopping_cart 515
cool-tshirts-search 2 - shopping_cart 133
weekly-newsletter 3 - checkout 450
retargetting-ad 3 - checkout 445
retargetting-campaign 3 - checkout 246
paid-search 3 - checkout 179
getting-to-know-cool-tshirts 3 - checkout 41
ten-crazy-cool-tshirts-facts 3 - checkout 32
interview-with-cool-tshirts-founder 3 - checkout 31
cool-tshirts-search 3 - checkout 7
weekly-newsletter 4 - purchase 115
retargetting-ad 4 - purchase 113
retargetting-campaign 4 - purchase 54
paid-search 4 - purchase 52
getting-to-know-cool-tshirts 4 - purchase 9
ten-crazy-cool-tshirts-facts 4 - purchase 9
interview-with-cool-tshirts-founder 4 - purchase 7

– 7. Looking at which campaigns were responsible for more touches at different points in the user’s journey, we can see which campaigns are more worthwhile to re-invest in. The top three campaigns for drawing users to the landing page and shopping cart: (1) interview with cooltshirts founder, (2) getting to know cooltshirts, & (3) ten crazy cooltshirts facts. And the top two campaigns for drawing users to checkout and to actually purchase: (4) weekly newsletter & (5) retargetting ad.

– Further analysis with usage funnels would be interesting to determine how users moved from one page to another across various campaigns.