This community-built FAQ covers the “Calculating Churn Review” exercise from the lesson “Calculating Churn”.
Paths and Courses
This exercise can be found in the following Codecademy content:
FAQs on the exercise Calculating Churn Review
There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply () below.
If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.
Join the Discussion. Help a fellow learner on their journey.
Ask or answer a question about this exercise by clicking reply () below!
Agree with a comment or answer? Like () to up-vote the contribution!
Hi,
I was wondering how to create an is_active status by department(segment). I was able to make the is_active, is canceled columns but cant figure out how to do it by department without making the whole thing crash.
I noticed at the end of this lesson that I got slightly different churn rates for each month than what the final solution has. The reason turned out to be that we defined is_active differently.
The solution suggests
case
when subscription_start < first_day
and (subscription_end > first_day
or subscription_end is null)
then 1
else 0
end as is_active
while I used
case
when subscription_start < first_day
and (subscription_end >= first_day
or subscription_end is null)
then 1
else 0
end as is_active
Should we not include subscription_end = first_day to count as active at the start of the month? Considering is_cancelled is defined by
case
when subscription_end between first_day and last_day
then 1
else 0
end as is_cancelled
Would this not mean that a person cancelling their subscription on the first day of the month will be registered as a cancellation in the month (is_cancelled = 1), but not active (is_active = 0)? That makes very little logical sense to me., but perhaps I am missing something?
It seems that SQL only requires you use the WITH statement once if you’re creating a series of ‘temporary tables’ separated by commas - maybe it’s for convenience or simplicity sake? It would be slightly cumbersome to type 'WITH table1 AS (…), WITH table2 AS (…), WITH table3 AS (…), etc. rather than just going WITH table1 AS (…), table 2 AS (…)
Just have a small question about the churn rate formula itself.
In the beginning of the lecture, the churn rate is basically the number of subscribers that cancelled divided by the total number of subscriber in a period.
But, towards the end of the project, the monthly churn rate is calculated by ‘cancel divided by active’. As per my understanding, active subscribers do not represent the total subscribers in that period. Instead, it represents some portion of the total subscribers.
I’m not sure I understand your question, but I think active subscribers do in fact represent the total subscribers in that period. That period being one month. Total subscribers in any of the tables would represent any subscriber active for any of the months covered by the original subscriptions table.
I don’t believe you could start and end a subscription on the same day. So there should be no records where a person canceled on the first day but was inactive. They will have a start date before that day, maybe the first day of the previous month. In that case, they will be inactive at the start of the previous month, but active for the month of cancelation. This query is specifically designed for one month minimum subscriptions. I’m guessing that to allow for less than one month subscriptions, monthly churn rate would have to be defined slightly differently.
Can anyone tell me why this doesn’t return an answer?
WITH months AS
(SELECT ‘2017-01-01’ AS first_day,
‘2017-01-31’ AS last_day
UNION
SELECT ‘2017-02-01’ AS first_day,
‘2017-02-28’ AS last_day
UNION
SELECT ‘2017-03-01’ AS first_day,
‘2017-03-31’ AS last_day
),
cross_join AS
(SELECT *
FROM subscriptions
CROSS JOIN months
),
status AS
(SELECT id,
first_day AS month,
CASE
WHEN
(subscription_start < first_day)
AND (subscription_end >= first_day
OR subscription_end IS NULL)
THEN 1
ELSE 0
END AS is_active,
CASE
WHEN
(subscription_end BETWEEN first_day AND last_day)
THEN 1
ELSE 0
END AS is_canceled
FROM cross_join
)
SELECT *
FROM status
LIMIT 50
WOW! Thanks. So when I cut and pasted the code from Github, somehow the format of the quotations changed. It’s obvious now, because the Codecademy workspace shows the dates in red (as integers or dates I suppose) as opposed to yellow where it’s treated more like a string. If I was trying to copy code just into sqlite or something, I don’t know how I’d catch that. Valuable lesson. Thanks again.
I came here to write the same thing - I think your implementation is correct. In the final answer you can actually find examples in the table status where is_active=0 and is_cancelled = 1, and this doesn’t make sense since the churn rate is the number of active users in that month who cancel in that month, divided by the number of active users in that month.
Conversely, if you could have is_active = 0 and is_cancelled = 1, then it would be possible to have a churn_rate above 1. For instance, with 2 users that cancel but only one that is active you’d have churn_rate = 2 / 1 = 200%, which is nonsensical.