There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply () below.
If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.
Join the Discussion. Help a fellow learner on their journey.
Ask or answer a question about this exercise by clicking reply () below!
You can also find further discussion and get answers to your questions over in Language Help.
Agree with a comment or answer? Like () to up-vote the contribution!
PARTITION BY it’s like GROUP BY, but do calculation within that ‘group’ specified in PARTITION BY clause and it doesn’t reduce amount of rows in resulting table.
We get SUM() OVER (PARTITION BY within that ‘group’ - users ORDER BY month - for all available month in DESC or ASC order, sum for first month, sum for first and second months ect within ‘group’.).
If we remove ORDER BY month, we’ll get total SUM within that group in PARTITION BY for every record in table for particular user.
Adding to the frustration above, I can’t see how the query is actually answering the question.
I can see from the query results that arianagrande and cristiano are top two candidates for largest increase but is it percentage wise or absolute numbers?
I found the hint adding more confusion, ‘The steadiest flow of followers would be the user with the most consistent average’. How does the results help assessing ‘consistency’?
Would greatly appreciate some advice on this from the lesson writers.
It’s Step 5 in the Windows Functions - Partition By lesson
full path: Analyze Data with SQL Skill Path > Advanced SQL > Window Function > PARTITION BY
or
link here
The instruction reads: ‘Looking at the results of your query, which user had the largest/smallest increase in new followers? Which user had the steadiest flow of followers?’
The hint reads: ‘The steadiest flow of followers would be the user with the most consistent average.’
The code provided at ‘Get Unstuck’ section to answer the above questions is below. I did not write a query myself, as I’m confused about the question overall.
The code returns a table that looks very similar to the previous steps results. It only adds a column named ‘Posts’ that only adds to the confusion as I’m struggling to understand its relevancy for this step.
SELECT
username,
month,
change_in_followers,
SUM(change_in_followers) OVER (
PARTITION BY username
ORDER BY month
) 'running_total_followers_change',
posts,
AVG(change_in_followers) OVER (
PARTITION BY username
ORDER BY month
) 'running_avg_followers_change'
FROM
social_media;
It can definitely be confusing. So, you’re breaking it down by month. The question is getting at how many followers is someone gaining each month compared to the total number of followers? And, is the number (%) consistent month to month?
(as was stated above, PARTITION BY works similar to a GROUP BY–you’re grouping by month or, dividing the total number into groups but it doesn’t reduce the number of rows in the results (months in this case). In one of the questions you use GROUP BY user_name, which only gives you one month. (there are 8 months total). This isn’t what you want. So you use PARTITION BY.
So, it’s a longwinded way of getting the average number of users each month from the total number of users. it’s basically division. The “change_in_followers” is the % change and you want to see the user with the most consistency over all the months.
I did get what PARTITION BY does, no issue there. It’s the way Step 5 was worded and the solution provided that just threw me off tracks entirely. We’re given just that text and questions I’ve pasted in my second post, and… that’s it.
Your guesses and questions are as good as any, and valid.
Not sure what the numbers in ‘change_in_followers’ represent, if they are absolute or percentage, not sure if we’re supposed to get the biggest increase percentage wise or in absolute numbers and how the ‘steadiest flow of followers’ is defined, I just cannot understand. So many questions…
Would you know how we can get someone from Codecademy to look at it and advise?
IOW: what user had the most consistent numbers of new followers each month?
‘change_in_followers’ is a % change. the number that month/total. So, you have 22 new followers, what % is that of the total number of followers? the running_total just adds each month’s new number of followers.
(Also, in my version of the lesson I don’t see a “posts” column in the hint in step 4.)
No, sorry. I don’t know how to get someone from CC to help. I am merely a volunteer here on the forums. Perhaps other learners will chime in here with a better explanation.
If you want to submit a CQR report, for clarification for the lesson, you can do that.
Just re-did the whole section.
My take now is that Step 5 is not supposed to be answered with any code. Just look at the results table and understand the numbers.
I actually added ROUND to 2 decimals to the AVG change_in_followers and it looked way more readable, and it all clicked…
At this point I take the code with the ‘posts’ columns from ‘Get Unstuck’ section as obsolete.
Thanks again for all you help!
Case closed