FAQ: Window Functions - PARTITION BY

This community-built FAQ covers the “PARTITION BY” exercise from the lesson “Window Functions”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Analyze Data with SQL

FAQs on the exercise PARTITION BY

There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply (reply) below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.

Join the Discussion. Help a fellow learner on their journey.

Ask or answer a question about this exercise by clicking reply (reply) below!
You can also find further discussion and get answers to your questions over in Language Help.

Agree with a comment or answer? Like (like) to up-vote the contribution!

Need broader help or resources? Head to Language Help and Tips and Resources. If you are wanting feedback or inspiration for a project, check out Projects.

Looking for motivation to keep learning? Join our wider discussions in Community

Learn more about how to use this guide.

Found a bug? Report it online, or post in Bug Reporting

Have a question about your account or billing? Reach out to our customer support team!

None of the above? Find out where to ask other questions here!

STEP 4,

So far so good until this step…

Why did we introduce the ‘posts’ column? (in the hint)

What does that help? Why does that add to this table? WHY would it get put back into the SELECT criteria?

… uuuuuuhhhnoyed…

5 Likes

Agreed. Can the lesson writers change this?

2 Likes

What exactly is ‘PARTITION BY’ doing in the processing of the table and how is it a subclause of OVER?

PARTITION BY it’s like GROUP BY, but do calculation within that ‘group’ specified in PARTITION BY clause and it doesn’t reduce amount of rows in resulting table.
We get SUM() OVER (PARTITION BY within that ‘group’ - users ORDER BY month - for all available month in DESC or ASC order, sum for first month, sum for first and second months ect within ‘group’.).
If we remove ORDER BY month, we’ll get total SUM within that group in PARTITION BY for every record in table for particular user.
Снимок экрана 2021-12-29 в 17.14.34

1 Like

Adding to the frustration above, I can’t see how the query is actually answering the question.

I can see from the query results that arianagrande and cristiano are top two candidates for largest increase but is it percentage wise or absolute numbers?

I found the hint adding more confusion, ‘The steadiest flow of followers would be the user with the most consistent average’. How does the results help assessing ‘consistency’?

Would greatly appreciate some advice on this from the lesson writers.

Thanks,

1 Like

What step? It might be a better idea to post your (formatted–use the “</>” button) code and what issue(s) you’re having with it.

Hi Lisa,

Thanks so much for replying,

It’s Step 5 in the Windows Functions - Partition By lesson

full path: Analyze Data with SQL Skill Path > Advanced SQL > Window Function > PARTITION BY
or
link here

The instruction reads: ‘Looking at the results of your query, which user had the largest/smallest increase in new followers? Which user had the steadiest flow of followers?’

The hint reads: ‘The steadiest flow of followers would be the user with the most consistent average.’

The code provided at ‘Get Unstuck’ section to answer the above questions is below. I did not write a query myself, as I’m confused about the question overall.

The code returns a table that looks very similar to the previous steps results. It only adds a column named ‘Posts’ that only adds to the confusion as I’m struggling to understand its relevancy for this step.

SELECT 
    username,
    month,
    change_in_followers,
    SUM(change_in_followers) OVER (
      PARTITION BY username 
      ORDER BY month
    ) 'running_total_followers_change',
    posts,
    AVG(change_in_followers) OVER (
      PARTITION BY username 
      ORDER BY month
    ) 'running_avg_followers_change'
FROM
    social_media;

It can definitely be confusing. So, you’re breaking it down by month. The question is getting at how many followers is someone gaining each month compared to the total number of followers? And, is the number (%) consistent month to month?
(as was stated above, PARTITION BY works similar to a GROUP BY–you’re grouping by month or, dividing the total number into groups but it doesn’t reduce the number of rows in the results (months in this case). In one of the questions you use GROUP BY user_name, which only gives you one month. (there are 8 months total). This isn’t what you want. So you use PARTITION BY.

So, it’s a longwinded way of getting the average number of users each month from the total number of users. it’s basically division. The “change_in_followers” is the % change and you want to see the user with the most consistency over all the months.

Thanks for coming back on this.

I did get what PARTITION BY does, no issue there. It’s the way Step 5 was worded and the solution provided that just threw me off tracks entirely. We’re given just that text and questions I’ve pasted in my second post, and… that’s it.

Your guesses and questions are as good as any, and valid.

Not sure what the numbers in ‘change_in_followers’ represent, if they are absolute or percentage, not sure if we’re supposed to get the biggest increase percentage wise or in absolute numbers and how the ‘steadiest flow of followers’ is defined, I just cannot understand. So many questions…

Would you know how we can get someone from Codecademy to look at it and advise?

IOW: what user had the most consistent numbers of new followers each month?
‘change_in_followers’ is a % change. the number that month/total. So, you have 22 new followers, what % is that of the total number of followers? the running_total just adds each month’s new number of followers.
(Also, in my version of the lesson I don’t see a “posts” column in the hint in step 4.)

No, sorry. I don’t know how to get someone from CC to help. I am merely a volunteer here on the forums. Perhaps other learners will chime in here with a better explanation.

If you want to submit a CQR report, for clarification for the lesson, you can do that.

Just re-did the whole section.
My take now is that Step 5 is not supposed to be answered with any code. Just look at the results table and understand the numbers.
I actually added ROUND to 2 decimals to the AVG change_in_followers and it looked way more readable, and it all clicked…
At this point I take the code with the ‘posts’ columns from ‘Get Unstuck’ section as obsolete.
Thanks again for all you help!
Case closed

2 Likes

In re the video being out of sync, you could always submit a course correction via here.