Last touch attribution (by timestamp or by attribution stage)

I have a query relating to last-touch attribution (somewhat more general than SQL) but it is in relation to task 6 of the Cool TShirts project- https://www.codecademy.com/paths/analyze-data-with-sql/tracks/analyze-data-sql-analyze-real-data/modules/analyze-data-sql-marketing-attribution/projects/analyze-data-sql-attribution-queries

Task 6: How many last touches on the purchase page is each campaign responsible for?

Am I correct in thinking that it’s typical to define last touch as the most recent interaction which led to the end of the attribution spectrum; in this particular case, the purchase page. This would be different to literally the most recent user interaction (e.g. by timestamp) of users who have ever reached the last step of attribution (since it ignores additional visits that didn’t result in the final step-purchase).

If anyone can confirm the standard it would be appreciated, I’ve been struggling to find a proper definition (assuming there is one).

Yes, that is my understanding. In regards to the project–was it email, google, NYT, Medium blog post that brought them back to the site to make a purchase.

I like this sports analogy of last-touch attribution:
https://medium.com/@GeoFli/what-is-last-click-attribution-e48cbe6d7f02

2 Likes

TL;DR: Yes.


That’s my understanding as well. And because of that, it’s super important to get the placement right for the WHERE clause in that project.

You’ve probably already seen it, but I’ve written up a super-long, deep-dive exploration into this project in the past.

Query 2 (last_touch2) in my example works correctly because the WHERE clause filters to the records where the page name is “4 - purchase” before grouping by the user ID and performing the MAX() aggregate on the timestamp.

Query 1 (last_touch1) leaves out three customers because it filters after the grouping and aggregating has already occurred, and those three users have a max timestamp that is not on the purchase page.

While 3 missing data points won’t really change the outcome of an analysis much, in the real world there would be many, many more people who come back to the website, but don’t purchase on their return. All of these people would be excluded from your results if you filtered by purchase page in the wrong place. This could, in turn, result in you suggesting (incorrectly) that a certain source or campaign was more effective than the others.

Of course, I’m not sure which would be worse: a) leaving out the customers whose last touch was not a purchase; or b) attributing their purchase to the source that led to their most recent (non-purchase) interaction with the website.

But I digress…sorry for the overly verbose answer! :sweat_smile:

2 Likes

Nope that’s kind of what I was looking for, thank you.

I did indeed get caught out with the the requirement of an additional WHERE statement (which is what led me to wonder exactly how last_touch is defined); I was starting at the numbers and wondering why they didn’t add up. It is mentioned in the hint but I don’t think it’s covered in the lesson; a cautionary tale for guessing the layout of your dataset I suppose. I’ve learnt a great deal more by coming unstuck than I might’ve done otherwise, sliver linings eh :grinning:.

1 Like

The WHERE is part of Q4, no? So, it comes before and you also use it in Q6, right?
That’s what I had waaaaay back when I did the project for the SQL Intensive when I took the class 3 years ago.

I feel like it’d be answer B.
But, that’s just like…my opinion, man.
:joy:

1 Like

What I meant was that I don’t believe it was covered in the lessons you run through before this project-
https://www.codecademy.com/paths/analyze-data-with-sql/tracks/analyze-data-sql-analyze-real-data/modules/analyze-data-sql-marketing-attribution/lessons/first-last-attribution/exercises/attr-query-iii

The WITH statement covered in that lesson only works if every user who reaches the purchase page has a final timestamp on that page. If they revisit the site and wind up on a different page after the purchase then that query doesn’t work entirely as expected.

@el_cocodrilo’s link covers that same issue though I think it’s only obvious in the project itself but perhaps it should be covered in the lessons since it’s easy to miss (or if you’re daft like me get confused by the definition instead :stuck_out_tongue:).

2 Likes

Ah, yep. Gotcha.
I don’t recall my entire thought process when I did the project. I actually did it for extra credit while the Warby Parker project was my main project that I submitted. Yea, I did all the projects b/c I’m that nerd. :joy: :nerd_face:

2 Likes

Not necessarily. The way it’s taught in the lesson, they show a “last-touch attribution query” that doesn’t filter it down to a particular page. Looks like @tgrtim just posted the link :joy:

In that exercise, you check for the individual by placing the WHERE outside of the CTE, so when you get to this project, I think it would be fair to assume it would work the same (spoiler alert: it doesn’t work the same).

I think the problem is twofold: 1) up until the project, it is assumed the last touch will come before or at the first purchase; and 2) people don’t tend to realize there’s an order of execution for SQL queries.

Combine that with the fact that the order of execution can vary slightly depending on the optimization algorithm of the RDBMS, and you have a big confusing mess for the few learners that actually catch the discrepancy here.

1 Like