RPA Customer Segmentation Project Question

Hi,
In the comment section of the project there is a challenge to find users birthday when they sign up.

https://www.codecademy.com/paths/analyze-data-with-sql/tracks/analyze-data-sql-query-data/modules/analyze-data-sql-practice-queries/projects/user-segmentation-prj

My first thought was to subtract sign up date and their birthday. However, this is incorrect for some of the days.
The first person was born on 1982-04-17 and created their account on 2017-01-26, making him 34. However, the subtraction makes him 35. How else can I query this taking into account that the column data type is text.

SELECT birthday,created_at,(created_at-birthday)
AS Age FROM users;
1 Like

Kudos on noticing the difference between the way ages are returned with the simple query and the way human age is actually used. There’s a bit of discussion here with some potentially useful info- Adding An Age Column with Values Based on Birthdays in Another Column

You’ll have to check the docs for the full details but I believe CAST(expression AS INT) truncates towards zero. Unfortunately I don’t think using the - operator would allow you to make use of this. You’ll likely need to make use of one of the datetime functions strftime, julianday or similar to perform the calculation and then you can truncate the result.

2 Likes

Thank you responding. I don’t see how I could use strftime, but using julianday did in fact work.
Upon subtracting created_at and birthday as juliandays, we’re left with the age as number of days. When we divide it by 365 we receive a decimal. However, the number is accurate. From there I casted it as an INT to floor the value.
Thank you for the help!
My final query looked like this in case people come across this post in the future.

SELECT birthday,created_at,CAST((julianday(created_at) -julianday(birthday)) / 365 as int) AS age FROM users;
1 Like

I think it’s much the same, something like strftime('%s', created_at) would work but instead of 365 you’d need to consider the number of seconds in a year or strftime('%Y.%m%d', created_at) would probably work (a very odd way to do it) but I think it’d be valid without an additional division though both still rely on the CAST to floor it.

The only one I wonder about is that situation would be the effect of leap years and that’s more because I’m curious and I’ve never tried it before.