Sharing my completed code from the User Churn project in the Data Science path for SQL. I had a hard time finding help for this project and wanted to make this available as it is a working code for anyone who gets stuck, but I also wanted to verify my calculations on churn as I am not 100% confident in the results.
If you have completed this and know the correct CASE for determining Active vs Canceled any assistance would be welcome.
Thank you for any help, and I hope this may assist others that find themselves stuck as I did repeatedly on this exercise.
WITH months AS
(SELECT
'2017-01-01' as first_day,
'2017-01-31' as last_day
UNION
SELECT
'2017-02-01' as first_day,
'2017-02-29' as last_day
UNION
SELECT
'2017-03-01' as first_day,
'2017-03-31' as last_day),
cross_join AS
(SELECT subscriptions.*, months.*
FROM subscriptions
CROSS JOIN months),
status AS
(SELECT id,
first_day AS 'month',
CASE
WHEN (subscription_start < first_day)
AND (subscription_end > first_day
OR subscription_end IS NULL)
AND (segment = '87') THEN 1
ELSE 0
END as is_active_87,
CASE
WHEN (subscription_start < first_day)
AND
(subscription_end between first_day AND
last_day)
AND (segment = '87') THEN 1
ELSE 0
END as is_canceled_87,
CASE
WHEN (subscription_start < first_day)
AND (subscription_end > first_day
OR subscription_end IS NULL)
AND (segment = '30') THEN 1
ELSE 0
END as is_active_30,
CASE
WHEN (subscription_start < first_day)
AND
(subscription_end between first_day AND
last_day)
AND (segment = '30') THEN 1
ELSE 0
END as is_canceled_30
FROM cross_join),
status_aggregate AS
(SELECT SUM(is_active_87) AS sum_active_87,
SUM(is_canceled_87) AS sum_canceled_87,
SUM(is_active_30) AS sum_active_30,
SUM(is_canceled_30) AS sum_canceled_30
FROM status)
SELECT (1.0 * sum_canceled_87) / (1.0 * sum_active_87) AS 'Seg_87_Churn',
(1.0 * sum_canceled_30) / (1.0 * sum_active_30) AS 'Seg_30_Churn'
FROM status_aggregate;