Hello fellow learners!
I’m on the Data Science career path, and just completed the SQL queries for the Marketing Attribution project. My responses are written under the related code and results. I have so many questions… would love some feedback!
I took a long time to get through some of these questions, especially #3. I don’t understand why the second “with” statement is needed instead of adding columns to the first statement. I reviewed the lesson to no avail, and ended up doing it a different way. Also, once the table is grouped by utm_campaign, don’t the user_id and first_touch_at columns no longer apply? If this table were presented to anyone other than the coder, wouldn’t it confuse them that only four user ids are being linked with aggregated data?
Regarding question #4, I left out the second “with” statement due to the same confusion as with question #3. Also, I’m struggling with understanding the difference between “last touches” and visits to the purchase page. I know that not every user’s last touch will be at the purchase page, but won’t every visit to the purchase page be a last touch for a distinct user? My results show a 3-user discrepancy in the COUNT column, and I don’t know exactly why. Although for the purpose of this project, it doesn’t change the conclusions drawn.
Thank you for any imput!
SELECT COUNT(DISTINCT utm_campaign) AS 'Number of Campaigns'
FROM page_visits;
SELECT COUNT(DISTINCT utm_source) AS 'Number of Sources'
FROM page_visits;
SELECT DISTINCT utm_campaign,
utm_source
FROM page_visits
ORDER BY utm_source;
Number of Campaigns
8
Number of Sources
6
utm_campaign | utm_source |
---|---|
ten-crazy-cool-tshirts-facts | buzzfeed |
weekly-newsletter | |
retargetting-campaign | |
retargetting-ad | |
paid-search | |
cool-tshirts-search | |
interview-with-cool-tshirts-founder | medium |
getting-to-know-cool-tshirts | nytimes |
– 1. There are 8 campaigns and 6 sources for CoolTShirts’ marketing.
SELECT DISTINCT page_name
FROM page_visits;
page_name
1 - landing_page
2 - shopping_cart
3 - checkout
4 - purchase
– 2. Customers will go through 4 pages at CoolTShirts: (1) landing page, (2) shopping cart, (3) checkout, and (4) purchase.
WITH first_touch AS (
SELECT user_id,
MIN(timestamp) as first_touch_at
FROM page_visits
GROUP BY user_id)
SELECT ft.user_id,
ft.first_touch_at,
pv.utm_source,
pv.utm_campaign,
COUNT(utm_campaign)
FROM first_touch ft
JOIN page_visits pv
ON ft.user_id = pv.user_id
AND ft.first_touch_at = pv.timestamp
GROUP BY utm_campaign
ORDER BY 5 DESC;
user_id | first_touch_at | utm_source | utm_campaign | COUNT(utm_campaign) |
---|---|---|---|---|
99990 | 2018-01-13 23:30:09 | medium | interview-with-cool-tshirts-founder | 622 |
99933 | 2018-01-25 00:04:39 | nytimes | getting-to-know-cool-tshirts | 612 |
99765 | 2018-01-04 05:59:46 | buzzfeed | ten-crazy-cool-tshirts-facts | 576 |
99684 | 2018-01-13 13:20:49 | cool-tshirts-search | 169 |
– 3. The most first touches came from the interview-with-cool-tshirts-founder campaign, at 622 first touches. This was followed closely by the getting-to-know-cool-tshirts campaign and the ten-crazy-cool-tshirts-facts campaign. It appears that longer-form first exposures garnered more actionable interest, as twice as many first touches came from longer articles than from campaigns that required less time or attention.
WITH last_touch AS (
SELECT user_id,
MAX(timestamp) as last_touch_at
FROM page_visits
GROUP BY user_id)
SELECT lt.user_id,
lt.last_touch_at,
pv.utm_source,
pv.utm_campaign,
pv.page_name,
COUNT(utm_campaign)
FROM last_touch lt
JOIN page_visits pv
ON lt.user_id = pv.user_id
AND lt.last_touch_at = pv.timestamp
GROUP BY 4, 5
ORDER BY 6 DESC;
user_id | last_touch_at | utm_source | utm_campaign | page_name | COUNT(utm_campaign) |
---|---|---|---|---|---|
99843 | 2018-01-23 05:23:08 | weekly-newsletter | 3 - checkout | 333 | |
99928 | 2018-01-24 05:26:09 | retargetting-ad | 3 - checkout | 331 | |
99990 | 2018-01-16 11:35:09 | retargetting-campaign | 3 - checkout | 192 | |
99589 | 2018-01-15 04:55:43 | nytimes | getting-to-know-cool-tshirts | 2 - shopping_cart | 153 |
99769 | 2018-01-25 21:06:56 | medium | interview-with-cool-tshirts-founder | 2 - shopping_cart | 132 |
98840 | 2018-01-10 04:58:48 | paid-search | 3 - checkout | 126 | |
99765 | 2018-01-04 05:59:47 | buzzfeed | ten-crazy-cool-tshirts-facts | 2 - shopping_cart | 126 |
99933 | 2018-01-26 06:18:39 | weekly-newsletter | 4 - purchase | 114 | |
99897 | 2018-01-06 09:41:19 | retargetting-ad | 4 - purchase | 112 | |
99285 | 2018-01-24 09:00:58 | retargetting-campaign | 4 - purchase | 53 | |
94567 | 2018-01-19 16:37:58 | paid-search | 4 - purchase | 52 | |
99344 | 2018-01-18 21:36:32 | cool-tshirts-search | 2 - shopping_cart | 46 | |
95496 | 2018-01-27 09:42:08 | nytimes | getting-to-know-cool-tshirts | 1 - landing_page | 38 |
98408 | 2018-01-03 10:01:14 | nytimes | getting-to-know-cool-tshirts | 3 - checkout | 32 |
98361 | 2018-01-22 20:26:39 | buzzfeed | ten-crazy-cool-tshirts-facts | 1 - landing_page | 32 |
99838 | 2018-01-02 07:40:34 | medium | interview-with-cool-tshirts-founder | 3 - checkout | 24 |
99520 | 2018-01-03 03:16:37 | buzzfeed | ten-crazy-cool-tshirts-facts | 3 - checkout | 23 |
94168 | 2018-01-22 22:07:47 | medium | interview-with-cool-tshirts-founder | 1 - landing_page | 21 |
92172 | 2018-01-16 15:15:29 | nytimes | getting-to-know-cool-tshirts | 4 - purchase | 9 |
98651 | 2018-01-15 04:17:36 | buzzfeed | ten-crazy-cool-tshirts-facts | 4 - purchase | 9 |
96308 | 2018-01-15 13:04:56 | cool-tshirts-search | 1 - landing_page | 7 | |
83547 | 2018-01-10 18:20:21 | medium | interview-with-cool-tshirts-founder | 4 - purchase | 7 |
61012 | 2018-01-16 06:49:07 | cool-tshirts-search | 3 - checkout | 5 | |
95650 | 2018-01-18 00:25:00 | cool-tshirts-search | 4 - purchase | 2 |
– 4. When observing last touches, we see that there are twice as many campaigns engaging users by the end of their journey. The campaigns leading to the most last touches are weekly newsletter and retargetting ad.
SELECT COUNT (DISTINCT user_id)
FROM page_visits
WHERE page_name = '4 - purchase';
SELECT utm_source,
utm_campaign,
page_name,
COUNT(DISTINCT user_id)
FROM page_visits
WHERE page_name = '4 - purchase'
GROUP BY utm_campaign
ORDER BY 4 DESC;
COUNT (DISTINCT user_id)
361
utm_source | utm_campaign | page_name | COUNT(DISTINCT user_id) |
---|---|---|---|
weekly-newsletter | 4 - purchase | 115 | |
retargetting-ad | 4 - purchase | 113 | |
retargetting-campaign | 4 - purchase | 54 | |
paid-search | 4 - purchase | 52 | |
nytimes | getting-to-know-cool-tshirts | 4 - purchase | 9 |
buzzfeed | ten-crazy-cool-tshirts-facts | 4 - purchase | 9 |
medium | interview-with-cool-tshirts-founder | 4 - purchase | 7 |
cool-tshirts-search | 4 - purchase | 2 |
– 5. Of all the visitors to the CoolTShirts website, 361 make a purchase. This final step can be attributed mostly to the weekly newsletter campaign and retargetting ad campaign.
WITH last_touch AS (
SELECT user_id,
MAX(timestamp) as last_touch_at
FROM page_visits
GROUP BY user_id)
SELECT lt.user_id,
lt.last_touch_at,
pv.utm_source,
pv.utm_campaign,
pv.page_name,
COUNT(lt.last_touch_at)
FROM last_touch lt
JOIN page_visits pv
ON lt.user_id = pv.user_id
AND lt.last_touch_at = pv.timestamp
WHERE pv.page_name = '4 - purchase'
GROUP BY 4
ORDER BY 6 DESC;
user_id | last_touch_at | utm_source | utm_campaign | page_name | COUNT(lt.last_touch_at) |
---|---|---|---|---|---|
99933 | 2018-01-26 06:18:39 | weekly-newsletter | 4 - purchase | 114 | |
99897 | 2018-01-06 09:41:19 | retargetting-ad | 4 - purchase | 112 | |
99285 | 2018-01-24 09:00:58 | retargetting-campaign | 4 - purchase | 53 | |
94567 | 2018-01-19 16:37:58 | paid-search | 4 - purchase | 52 | |
92172 | 2018-01-16 15:15:29 | nytimes | getting-to-know-cool-tshirts | 4 - purchase | 9 |
98651 | 2018-01-15 04:17:36 | buzzfeed | ten-crazy-cool-tshirts-facts | 4 - purchase | 9 |
83547 | 2018-01-10 18:20:21 | medium | interview-with-cool-tshirts-founder | 4 - purchase | 7 |
95650 | 2018-01-18 00:25:00 | cool-tshirts-search | 4 - purchase | 2 |
– 6. Similar to the previous inquiry, the weekly newsletter campaign and retargeting ad campaign are responsible for significantly more last touches on the purchase page than all other campaigns combined.
SELECT utm_campaign,
page_name,
COUNT (DISTINCT user_id)
FROM page_visits
GROUP BY 2, 1
ORDER BY 2, 3 DESC;
utm_campaign | page_name | COUNT (DISTINCT user_id) |
---|---|---|
interview-with-cool-tshirts-founder | 1 - landing_page | 625 |
getting-to-know-cool-tshirts | 1 - landing_page | 616 |
ten-crazy-cool-tshirts-facts | 1 - landing_page | 582 |
cool-tshirts-search | 1 - landing_page | 171 |
getting-to-know-cool-tshirts | 2 - shopping_cart | 680 |
ten-crazy-cool-tshirts-facts | 2 - shopping_cart | 566 |
interview-with-cool-tshirts-founder | 2 - shopping_cart | 515 |
cool-tshirts-search | 2 - shopping_cart | 133 |
weekly-newsletter | 3 - checkout | 450 |
retargetting-ad | 3 - checkout | 445 |
retargetting-campaign | 3 - checkout | 246 |
paid-search | 3 - checkout | 179 |
getting-to-know-cool-tshirts | 3 - checkout | 41 |
ten-crazy-cool-tshirts-facts | 3 - checkout | 32 |
interview-with-cool-tshirts-founder | 3 - checkout | 31 |
cool-tshirts-search | 3 - checkout | 7 |
weekly-newsletter | 4 - purchase | 115 |
retargetting-ad | 4 - purchase | 113 |
retargetting-campaign | 4 - purchase | 54 |
paid-search | 4 - purchase | 52 |
getting-to-know-cool-tshirts | 4 - purchase | 9 |
ten-crazy-cool-tshirts-facts | 4 - purchase | 9 |
interview-with-cool-tshirts-founder | 4 - purchase | 7 |
– 7. Looking at which campaigns were responsible for more touches at different points in the user’s journey, we can see which campaigns are more worthwhile to re-invest in. The top three campaigns for drawing users to the landing page and shopping cart: (1) interview with cooltshirts founder, (2) getting to know cooltshirts, & (3) ten crazy cooltshirts facts. And the top two campaigns for drawing users to checkout and to actually purchase: (4) weekly newsletter & (5) retargetting ad.
– Further analysis with usage funnels would be interesting to determine how users moved from one page to another across various campaigns.