FAQ: First- and Last-Touch Attribution - Review

This community-built FAQ covers the “Review” exercise from the lesson “First- and Last-Touch Attribution”.

Paths and Courses
This exercise can be found in the following Codecademy content:

FAQs on the exercise Review

There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply (reply) below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.

Join the Discussion. Help a fellow learner on their journey.

Ask or answer a question about this exercise by clicking reply (reply) below!

Agree with a comment or answer? Like (like) to up-vote the contribution!

Need broader help or resources? Head here.

Looking for motivation to keep learning? Join our wider discussions.

Learn more about how to use this guide.

Found a bug? Report it!

Have a question about your account or billing? Reach out to our customer support team!

None of the above? Find out where to ask other questions here!

Hi all,

This exercise/lesson is relatively straightforward so I decided to take it a step further in the section review/practice.

Instead of having one table with last touch and UTM, I wanted to create a table that had first and last touch with corresponding UTMs for each in one combined table.

I managed to get to a point of having first and last touch time stamps with the corresponding first touch utm, but can’t seem to rejoin the tables again properly to include the last touch utm:

image

Below is my code:

I tried joining with the AND condition like in the example but had no luck. Does anyone have an idea of how this would be accomplished? I feel like it’s probably something simple that I’m overlooking.

Thanks

The key to achieve the task you have mentioned is to treat last touch and first touch attribute completely separately. in the final phase they should be joined.

In your code there might be conflict of utm_source that where it should added.

Please use below mentioned code. This will solve your query .

copy past below code in editor , you can see the pattern

WITH first_touch AS (
SELECT user_id,
MIN(timestamp) AS ‘first_touch_at’
FROM page_visits
GROUP BY user_id),

first_touch_source as (
SELECT first_touch.user_id,
first_touch.first_touch_at,
page_visits.utm_source
FROM first_touch
JOIN page_visits
ON first_touch.user_id = page_visits.user_id
AND first_touch.first_touch_at = page_visits.timestamp
),

last_touch as (
select user_id , max(timestamp) as ‘last_touch_at’
from page_visits
group by user_id),

last_touch_source as (
select last_touch.user_id,
last_touch.last_touch_at,
page_visits.utm_source
from last_touch
join page_visits
on last_touch.user_id = page_visits.user_id
and last_touch.last_touch_at = page_visits.timestamp
)

Select * from first_touch_source
left join last_touch_source
on first_touch_source.user_id = last_touch_source.user_id
limit 10;

Hi all,
What is the point of using JOIN in this exercise, if you could add the utm_source to the selection in the first place? As in:
SELECT user_id,
MIN(timestamp) AS ‘first_touch_at’,
utm_source
FROM page_visits
GROUP BY user_id;

Thanks

2 Likes

I would like to know someones thoughts on this as well. It seems like more work to build a temporary table, when you can just add another row/column selection.

Hey there!

Just like you, I asked myself the same question. I think its because by creating a temporary table you are going to be able to do other queries to the temporary table which you won’t be able to do if you only create a query to get straight results.

But I agree, we were able to query our results in much fewer lines!

Please if anyone knows a better answer to this, let me know!

This seemed right, after I spent time looking at your solution. I copied this into the code editor, but it doesn’t work. I hunted through your code hoping I would find a typo or misplaced comma, but I didn’t find anything. I only changed some commands written in lower case to upper case, just to help me read it better the way I’ve gotten used to reading it.
But it seems like it should work. If it doesn’t, I’m worried that I don’t understand something. Any ideas?
Thank you.

Hello guys, after I read the comments I realized that you were right. They is no point, or maybe the is, but I just can’t see it, to create two more tables only to get utm_source from the page_visits table, when we can just add one more column.
I used two temporary tables and it worked, this is the code:

WITH first_touch AS (
SELECT user_id,
MIN(timestamp) AS ‘first_touch_at’,
utm_source
FROM page_visits
GROUP BY user_id),

last_touch as (
SELECT user_id ,
MAX(timestamp) as ‘last_touch_at’,
utm_source
FROM page_visits
GROUP BY user_id)

SELECT *
FROM first_touch
LEFT JOIN last_touch
ON first_touch.user_id = last_touch.user_id
LIMIT 10;

Hi there,

Not sure if you came right but i figured out that it was because of the speech marks they used were wrong here:

MIN(timestamp) AS ‘first_touch_at’

and obviously here:

select user_id, max(timestamp) as ‘last_touch_at’

Some weird formatting issue i think.

Cheers!

I did the same thing but then wanted to get rid of the duplicated user_id in the output table. I solved it (see code below) but in the process was asking my self how I can adress certain columns that have the “same” name? Let’s say I created a temporary tabel (‘final’) that contains the output with the duplicated user_id using the following code:

WITH first_touch AS (
SELECT user_id, MIN(timestamp) AS first_touched, utm_source
FROM page_visits
GROUP BY 1
),
last_touch AS (
SELECT user_id, MAX(timestamp) AS last_touched, utm_source
FROM page_visits
GROUP BY 1
),
final AS(
SELECT *
FROM first_touch AS f
LEFT JOIN last_touch AS l
ON f.user_id =l.user_id
)
SELECT *
FROM final
LIMIT 10;

which leads to this output table:
grafik grafik

Assumend I wanted to just change the last piece of code aka the SELECT statement to only display user_id, first_touched, utm_source, last_touched, utm_source. How could I do that WITHOUT changing some other part of the code (e.g. rename the column earlier)? How do I adress the second utm_source column? It won’t work with utm_source:1 as the column is labeled in the output table? Why so? What happens when two tables are joined and both have a column with the same name - is there an easy workaround the rename duplicate columns (in performing a JOIN)?


In case anyone is interested, my final code does look like this:
WITH first_touch AS (
SELECT user_id, MIN(timestamp) AS first_touched, utm_source
FROM page_visits
GROUP BY 1
),
last_touch AS (
SELECT user_id, MAX(timestamp) AS last_touched, utm_source
FROM page_visits
GROUP BY 1
)
SELECT f.user_id, f.first_touched, f.utm_source, l.last_touched, l.utm_source
FROM first_touch AS f
LEFT JOIN last_touch AS l
ON f.user_id =l.user_id
LIMIT 10;

Wondering the same, but if this was the case we still could create 2 temporary tables like this without joining tables:

WITH first_touch AS (
   SELECT user_id,
    MIN(timestamp) AS 'first_touch_at',
    utm_source
   FROM page_visits
   GROUP BY user_id)
SELECT *
FROM first_touch;

 WITH last_touch AS (
   SELECT user_id,
    MAX(timestamp) AS 'last_touch_at',
    utm_source
   FROM page_visits
   GROUP BY user_id)
SELECT *
FROM last_touch;

Can any moderator bring some clarity about this? Are we missing something?

here you go:

WITH first_touch AS (
SELECT user_id, MIN(timestamp) AS “first_touch_at”, utm_source AS “first_touch_utm_source”
FROM page_visits
GROUP BY 1
),
last_touch AS (
SELECT user_id, MAX(timestamp) AS “last_touch_at”, utm_source AS “last_touch_utm_source”
FROM page_visits
GROUP BY 1
)
SELECT ft.user_id, ft.first_touch_at, ft.first_touch_utm_source, lt.last_touch_at, lt.last_touch_utm_source
FROM first_touch AS “ft”
JOIN last_touch AS “lt”
ON ft.user_id = lt.user_id;

In step 6 of the Attribution queries project : Attribution query project
If I put the WHERE CLAUSE FOR page_name in the last_touch temporary query, I got the correct answer , but if I put it in the SELECT statement below last_touch it gives 1 less for the values of 4 of the campaigns why the rest are correct. Can’t figure it out.

here is my code:

–STEP 6
WITH last_touch AS (
SELECT user_id, MAX(timestamp) AS “last_touch_at”, utm_campaign , page_name
FROM page_visits
WHERE page_name = “4 - purchase”
GROUP BY user_id
)
SELECT utm_campaign, COUNT(*) AS “campaign_purchases”
FROM last_touch
GROUP BY 1
ORDER BY 2 DESC;

This returns the correct answer for the number of purchases per campaign:

But when i move the WHERE clause down to the SELECT STATEMENT, it gives a different answer which I can’t account for:

Blockquote

1 Like

I have exactly the same question.

1 Like

Hi all. I am curious why we need to take so many steps - can we just do this instead?

SELECT user_id,
MAX(timestamp) AS ‘last_touch_at’,
utm_source,
utm_campaign
FROM page_visits
GROUP BY user_id ;

Wouldn’t this method still return us the last touch source and attribution based on the Max timestamp - and hence avoid the need to write long code using “With” and “Join”?