Twitch SQL Gaming Data #12

Hi Folks,

I was working through my own scripts in Question 12 on Twitch Gaming Data, and wanted to create a table showing the total number of streams for each gaming genre, grouped by country.

Example result: In the US there were XX streams in the MOBA genre, XX streams in the FPS genre, XX streams in the Survival genre…In AU there were XX streams in the MOBA genre, XX streams in the FPS genre, etc.

I’m trying to use window functions with PARTITION BY to get the total count of streams by genre for each country in this script below. I copied the CASE statement from Question 7 in the same exercise to get the genres for each game. My issue is that when I run this script, the ‘genre_count’ always returns a value of 1 rather than the total number of streams in the genre (for that country):

SELECT country, genre, COUNT(*) OVER
(PARTITION BY country) AS genre_count
FROM (
SELECT game,
 CASE
  WHEN game = 'Dota 2'
      THEN 'MOBA'
  WHEN game = 'League of Legends' 
      THEN 'MOBA'
  WHEN game = 'Heroes of the Storm'
      THEN 'MOBA'
    WHEN game = 'Counter-Strike: Global Offensive'
      THEN 'FPS'
    WHEN game = 'DayZ'
      THEN 'Survival'
    WHEN game = 'ARK: Survival Evolved'
      THEN 'Survival'
  ELSE 'Other'
  END AS 'genre',
  country
FROM stream)
WHERE country IS NOT NULL
GROUP BY 1;

Am I thinking about this the right way? Is COUNT(*) OVER (PARTITION BY country) the right way to set up this window function?

The problem is using both PARTITION BY and GROUP BY in your query. PARTITION BY works similarly to GROUP BY, except that it keeps every single record, whilst GROUP BY only keeps one record per grouped variable. So what’s happening is you are partitioning by the country, and then grouping that also by country, effectively cancelling out the count as after the partition all the rows per country are the same.

I would recommend removing the partition completely as it doesn’t feel like an appropriate use in this instance, and instead just doing GROUP BY. Be aware that you will also need to group by genre if you want separate rows for genres, else the counts will just be the record counts for that country without accounting for genre.

2 Likes

I removed the PARTITION BY statement and instead ended the script with

GROUP BY country, genre;

That worked like a charm, thanks!

2 Likes