FAQ: Why Data Science? - Exploring Data with SQL - Continued

This community-built FAQ covers the “Exploring Data with SQL - Continued” exercise from the lesson “Why Data Science?”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Data Science

FAQs on the exercise Exploring Data with SQL - Continued

Join the Discussion. Help a fellow learner on their journey.

Ask or answer a question about this exercise by clicking reply (reply) below!

Agree with a comment or answer? Like (like) to up-vote the contribution!

Need broader help or resources? Head here.

Looking for motivation to keep learning? Join our wider discussions.

Learn more about how to use this guide.

Found a bug? Report it!

Have a question about your account or billing? Reach out to our customer support team!

None of the above? Find out where to ask other questions here!

how much time it takes to load “pro users” query in SQL third exercise? It’s taking hours for me and I don’t know if this is normal.

1 Like

Is there something I have to do to keep the lesson? It only says I have to click “run” buttom.

1 Like

Why not use the variables in the calculations rather than extract the values again?

No, it took me like three seconds to run the data. It must be your internet problem.

There is nothing in the instructions to tell me what to run in the code editor, nor is there anything there. The subsequent exercises aren’t working either. I’ve logged out and back in. I’ve restarted the program, but nothing is working. Does anyone know what I’m doing wrong? I feel like I’m getting nothing out of this, and wasting a ton of time. Please help!

Thanks :slight_smile:

1 Like

Try closing your current browser, and start a new Chrome window to launch Codecademy for your lesson. I’ve seen this before, restarting the same lesson in a new Chrome browser fixed it.

1 Like

I switched to Chrome, and that solved the error. Thanks!

I was using Chrome, and it was working perfect until now. I cannot see the result from the query :frowning:
I did everything I could, and still can’t see it.

1 Like

Why are they not explaining what the code does?? :frowning: will they do it later?

9 Likes

This looks wrong. If this wants the churn rate in march I think the end of the query should be

(cancel_date IS NULL) OR
(cancel_date >= '2017-03-01')

);

NOT cancel_date > ‘2017-03-01’. This would exclude where cancel_date = ‘2017-03-01’ which is valid. Am I right?

1 Like

what is strftime in this code and what does %m denotes?

Can anyone explain the syntax of this example where Catherine analyzes churn rate for March 2017?

1 Like

Please can someone explain what the code means?

1 Like

Can someone copy the query here? the editor is empty for me… I cannot see anything - thank you

Someone please correct me if I’m wrong:
“strftime” is the method of returning a certain portion of a date value, and %m is specifying which portion of the date value ( %m = month ) to return.
So in this example
– WHERE strftime("%m", cancel_date) = ‘03’ – would mean in a value like 2017-03-01 we’re telling the query to look at the month portion of the date string… and if it is = to 03 THEN take the user ID from that row and count it –

2 Likes

HERE IS THE QUERY AS IT APPEARS TO ME IN MY SESSION:

SELECT COUNT(DISTINCT user_id) AS ‘enrollments’,
COUNT(CASE
WHEN strftime("%m", cancel_date) = ‘03’
THEN user_id
END) AS ‘march_cancellations’,
ROUND(100.0 * COUNT(CASE
WHEN strftime("%m", cancel_date) = ‘03’
THEN user_id
END) / COUNT(DISTINCT user_id)) AS ‘churn_rate’
FROM pro_users
WHERE signup_date < ‘2017-04-01’
AND (
(cancel_date IS NULL) OR
(cancel_date > ‘2017-03-01’)
);

To the best of my understanding (I will change the formatting a little to help me explain):

In the “SELECT” portion we are selecting the following points of data
-The count of Enrollments
-The count of March Cancellations
-Churn Rate

SELECT
COUNT(DISTINCT user_id) AS ‘enrollments’, <— here is where we are selecting enrollments. We are doing this by counting distinct(unique) user_ids.

COUNT(CASE
WHEN strftime("%m", cancel_date) = ‘03’
THEN user_id
END) AS ‘march_cancellations’, <— here is where we are selecting march cancellations. we are doing this by looking at the cancel date, and wherever the month (%m) of the date is =03 we are returning the user_id, and then counting up those IDs

ROUND(100.0 * COUNT(CASE
WHEN strftime("%m", cancel_date) = ‘03’
THEN user_id
END) / COUNT(DISTINCT user_id)) AS ‘churn_rate’ <— here we are selecting the churn rate. We do this by (same method as "March Cancellations) multiplying the count of March cancellations by 100, and then dividing that number by the total number of count of user ids(or “enrollments”) which gives us the churn rate.

FROM pro_users <—here we specify we are pulling the data we selected for above from the “pro_users” table

WHERE <—here we show the specific criteria for a record to meet in order for us to return it.
signup_date < ‘2017-04-01’ <— we specify that signup date of record must be “less than” 2017-01-01"
AND ( <— AND
(cancel_date IS NULL) OR <— it the cancel date of record must be EITHER null (empty/blank) OR
(cancel_date > ‘2017-03-01’) “greater than” 2017-03-01"
);

Hope this is clear enough, and is a help.

3 Likes

Thank you, but the query you shared is returning no results for me…

The reason your query didn’t work was because somehow my laptop read the quotations as apostrophes… when changing ’ with ’ it works