Division in SQL with SUM()

Hello,

I am following this task

https://www.codecademy.com/paths/data-science/tracks/sql-intermediate/modules/dspath-sql-aggregates-projects-and-applications/projects/sql-hackernews

Number 4.

Then, we want to add these users’ score s together and divide by the total to get the percentage.

Add their scores together and divide it by the total sum. Like so:

SELECT (1.0 + 2.0 + 3.0) / 6.0;

So, is Hacker News dominated by these users?

HERE IS THE HINT: SELECT (517 + 309 + 304 + 282) / 6366.0;

I do not want to manually type the numbers for the division so I used these syntax to get Sum
of Score from the group by

WITH each_s AS ( SELECT user, SUM(score) as each_score

FROM hacker_news

GROUP BY user

HAVING each_score > 200), TOTAL AS( SELECT SUM(score) FROM hacker_news)

SELECT SUM(each_score) FROM each_s;

The final syntax gives: 1412 is correct

Now I need to take 1412 to divide for Total score (6366) BUT I do not want to manually type 6366 in

Syntax to get the total is SELECT SUM(score) FROM hacker_news;

Is there anyway to do this math automatically and got 22% as result INSTEAD OF typing 517 + 309 + 304 + 282 and 6366?

my code: https://gist.github.com/4afce3b27f48aad5fd3a92ea02e8dc96

Hi @davidvietnam, welcome to the forums!

One way to accomplish what you’re trying to do would be to use subqueries. Subqueries are just regular queries that are nested inside a specific clause of your main query.

Here, we would want to avoid using this phrase:

TOTAL AS( SELECT SUM(score) FROM hacker_news)

…as this creates a Common Table Expression (or CTE), which is a type of temporary table. Because we only want the value and not a table with the value, we should not use the WITH...AS syntax to retrieve the total. Instead, what we can do is use a subquery inside of the in the SELECT clause:

WITH each_s AS (
  SELECT user, SUM(score) as each_score
  FROM hacker_news
  GROUP BY user
  HAVING each_score > 200
)
SELECT SUM(each_score) / (SELECT SUM(score) * 1.0 FROM hacker_news) FROM each_s;

If you are interested in learning more about subqueries, Codecademy has a short course on them here:
https://www.codecademy.com/learn/sql-table-transformation

This course probably needs to be re-worked to be more learner-friendly, but it should get you started.

P.S., in the future if you post your code here in the forums, please use this button to format it so that it’s easier for all to read:
capture

After pressing that button, you will see a prompt that looks like this appear in your editor:
capture_r

Just copy and paste your code directly from the editor into that highlighted portion, and all of the nicely-readable formatting will be preserved! Moreover, it will be in a format that can be directly copied and pasted into an IDE for testing.

Happy coding!

I had a similar idea to calculate the percent properly, rather than hard-coding the calculation.

Here’s my first attempt, which failed:

WITH top_users as (
  SELECT user, score as "top_scores"
  FROM hacker_news
  GROUP BY user
  HAVING score > 200
)
SELECT 
  -- SUM(top_scores) as "Top Scores", 
  -- SUM(hacker_news.score) as "Sum All Scores",
  -- SUM(top_scores) * 1.0 / SUM(hacker_news.score) as "Top Scores percent"
FROM hacker_news, top_users;

I’m still not sure why this doesn’t work.
The calculated figure gives the right answer - just exactly 1000 times too big. The first column is exactly 4000 larger than expected. The second is 4 times the expected value. I figured the 4000 number is because there are 4000 rows in the whole table but why it’s counting all of them, I don’t know. Because it looks like it should only pull the four scores greater than 200, just like how it worked in the earlier questions. I could fix it with divide by 1000 but that’s also cheating :frowning:

Having checked this forum topic, here’s my working solution:

SELECT 
      --  SUM(top_scores) as "Top Scores", 
      --  (SELECT SUM(score) as "Sum All Scores" FROM hacker_news),
       ROUND(SUM(top_scores) * 100.0 / (SELECT SUM(score) as "Sum All Scores" FROM hacker_news),0) as "percent"
FROM top_users;

The ROUND and * 100 are just there to make it look nice. Without them, the main division calculation yields 0.22… which is correct.