How can we get the average of unique values of a column?

Question

In SQL, how can we get the average of only the unique values of a column?

Answer

To run the AVG() function on a column such that it only averages the unique values in the column, we could use the DISTINCT clause right before the column name.

Example

/* Returns 2.02365 */
SELECT AVG(price)
FROM fake_apps;

/* Returns 4.15833.. */
SELECT AVG(DISTINCT price)
FROM fake_apps;
19 Likes

Then, how does the data know what is the distinct value in Price Column? How do they differentiate unique value in that column and label it as a distinct value and then carry forward with the average caculation?

6 Likes

Hello yiwen. I’m not an expert but I hope I can help you.
when you run the following code:

SELECT price
FROM fake_apps;

the return is all the values in the column price. If you apply the AVG on it, you’ll have the total average. Nothing new here.

If you run the following code:

SELECT DISTINCT price
FROM fake_apps;

The return is all the distinct values on the column. Also, nothing new.

and so, the following:

SELECT AVG(DISTINCT price)
FROM fake_apps;

First the distinct will return the distinct ones and then the AVG is calculated based on that return. So the average is not calculated over the rows in price, but over the rows of the new table returned by the distinct.

Hope it helps

60 Likes

Hello, can I calculate the average in price column of distinct values in another column, like name ? Thanks in advance.

4 Likes

This will give you your solution for average in price column of distinct values in Category column:

SELECT Category, AVG(price)
FROM fake_apps
GROUP BY category;

9 Likes

This can be achieved using WHERE statement right?

1 Like

can you please explain what
SELECT DISTINCT AVG(price) would return?
what is the difference between SELECT AVG(DICTINCT price) and the above one?

I believe this would return the same result as SELECT AVG(price). AVG(price) as an aggregate function returns a single quantity/value, therefore there is only one value to apply to DISTINCT so it is unique.

3 Likes

Have you tried testing it in the code editor?

@tera0271013690 is right:

SELECT DISTINCT AVG(price)
FROM fake_apps;

/* and also: */

SELECT AVG(price)
FROM fake_apps;

both yield the same result of 2.02365 since there is only one value issued anyway.

However:

SELECT AVG(DISTINCT price)
FROM fake_apps;

Yields a result of 4.15833333333333, since it takes into account only the distinct values before averaging them.

6 Likes

It’s interesting that if the row value is NULL, that row is excluded from the AVG.
So in a rating column, if an app is not rated yet, it would not be treated as a 0. But in a price situation you would need to be sure to enter a price of 0 vs leaving free apps as NULL if you intended to do analytics of this nature.

7 Likes

Very true. Great insight there. Thanks for sharing.

I’ve been trying to display a list of each unique category, with the average price for apps in that category alone (this is different from average price of all apps in the table). Can you help?

SELECT category ,avg( price)
from fake_apps
group by category;

HOPE THIS HELPS :slight_smile:

2 Likes

i think that the purpose of the question is to calculate the average value of a criteria, so i can, example calculate the avg download of apps in catgory book:

select avg(downloads)

from fake_apps

where category=‘Books’;

@tranxuanmanh I think the purpose of the question is actually to identify unique values to average rather than just grouping by a category because in your example if the category books contains multiple books that gets no downloads (downloads = 0) then the multiple values would bring down the average you are calculating because the zeroes are counted multiple times. The grouping is based on the column you are averaging (in this case downloads) and not on anything else.

Admittedly, I think the use for this is quite limited. I think an example of the type of question this is trying to answer is like ‘what is the average number of downloads in the app store if each value of download is counted only once’ rather than just an average of total downloads.

@rinshad I suppose you could use a WHERE statement that somehow specify you only want one distinct value and ignoring any duplicates, but why would you do that when you achieve the same with just a single keyword in your code?