This community-built FAQ covers the “The Attribution Query III” exercise from the lesson “First- and Last-Touch Attribution”.
Paths and Courses
This exercise can be found in the following Codecademy content:
FAQs on the exercise The Attribution Query III
There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply () below.
If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.
Join the Discussion. Help a fellow learner on their journey.
Ask or answer a question about this exercise by clicking reply () below!
Agree with a comment or answer? Like () to up-vote the contribution!
Can someone please explain why we need to create a new table then join it to the old one instead of adding utm_source to the initial query that is grouped by user_id and includes max(timestamp) ?
I think I get it now, though. Don’t think they actually give the same result, because first you need to link the first touches to user ID and later to campaign. You can’t do both at the same time so need to create a first_touch table. Try it in the project and I think you’ll understand.
the first query you gave does not work as intended. You need the create a temporary first_touch table if you want to get the utm_source along with the fist time stamp for each user.
I have the same question. For the sake of clarity, the solution according to the exercise should be
WITH last_touch AS (
SELECT user_id,
MAX(timestamp) AS 'last_touch_at'
FROM page_visits
GROUP BY user_id)
SELECT lt.user_id,
lt.last_touch_at,
pv.utm_source
FROM last_touch AS 'lt'
JOIN page_visits AS 'pv'
ON lt.user_id = pv.user_id
AND lt.last_touch_at = pv.timestamp;
But this other query gives the same result, without creating a temporary table to JOIN.
SELECT user_id,
MAX(timestamp) AS 'last_touch_at',
utm_store
FROM page_visits
GROUP BY user_id;
I compared the results given by the two queries and they were identical. So what is the difference?
My guess, but I am not sure about this, is that with a user that has multiple entries in the table with different values for utm_source, the short query might not return the utm_source that corresponds to the correct last_touch_at.
select user_id,timestamp,utm_source from page_visits
group by user_id
having timestamp = max(timestamp);
will this query work against
WITH last_touch AS (
SELECT user_id,
MAX(timestamp) AS 'last_touch_at'
FROM page_visits
GROUP BY user_id)
SELECT lt.user_id,
lt.last_touch_at,
pv.utm_source
FROM last_touch AS 'lt'
JOIN page_visits AS 'pv'
ON lt.user_id = pv.user_id
AND lt.last_touch_at = pv.timestamp
Hi , can anyone explain to me how SQL will read the below code ?As in what sequence? I know seems like a basic question at this stage . Thanks!!
WITH first_touch AS (
SELECT user_id,
MIN(timestamp) AS ‘first_touch_at’
FROM page_visits
GROUP BY user_id)
SELECT ft.user_id,
ft.first_touch_at,
pv.utm_source
FROM first_touch AS ‘ft’
JOIN page_visits AS ‘pv’
ON ft.user_id = pv.user_id
AND ft.first_touch_at = pv.timestamp;
SELECT user_id ,MAX(timestamp),utm_source FROM page_visits
WHERE user_id = 10069;
WHY NEED THIS LONG SQL BELOW?
WITH last_touch AS (
SELECT user_id,
MAX(timestamp) as last_touch_at
FROM page_visits
GROUP BY user_id)
SELECT lt.user_id,
lt.last_touch_at,
pv.utm_source
FROM last_touch lt
JOIN page_visits pv
ON lt.user_id = pv.user_id
AND lt.last_touch_at = pv.timestamp
WHERE lt.user_id = 10069;
SELECT user_id ,MAX(timestamp),utm_source FROM page_visits
WHERE user_id = 10069;
WITH last_touch AS (
SELECT user_id,
MAX(timestamp) as last_touch_at
FROM page_visits
GROUP BY user_id)
SELECT lt.user_id,
lt.last_touch_at,
pv.utm_source
FROM last_touch lt
JOIN page_visits pv
ON lt.user_id = pv.user_id
AND lt.last_touch_at = pv.timestamp
WHERE lt.user_id = 10069;
I did a query to obtain both attributions in the same query. My issue here is that i will not get the utm_source for the last_touch_at, it will only show the first_touch’s utm_source.
How could i edit my query so it will show the different utm_source’s that X user_id registered.
"
WITH attribution AS (
SELECT user_id, MIN(timestamp) AS ‘first_touch_at’, MAX(timestamp) as ‘last_touch_at’
FROM page_visits
GROUP BY user_id)
SELECT a.user_id,
a.first_touch_at
a.last_touch_at,
pv.utm_source
FROM attribution AS ‘a’
JOIN page_visits AS ‘pv’
ON a.user_id = pv.user_id
AND a.first_touch_at = pv.timestamp
AND a.last_touch_at = pv.timestamp;
I would imagine because since there are two user_ids, the system needs you to specify which one you are referring to, even though they are referring to the same records.
I have the same question as most of the people here. What’s the reason behind writing a long code with joins when the same question can be answered by MAX(timestamp) and group by user_id? I couldn’t find any answers here or did I miss it out somewhere?
Question regarding the data itself. The below query is not correct because some users have multiple landing page entries. What causes this? Ive looked at the data and some of these users have two landing page entries in a row (like user_id = 59264)
Hi, I also have the same question as you.
And I don’t think they will have any change even building temp table, because,t in last_touch, only one latest record will be extracted in each user due to the ‘groupby user_id’.
I am not sure… but it will be great if anyone can tell us the purpose of this method.