In SQL, how can we get the average of only the unique values of a column?

Answer

To run the AVG() function on a column such that it only averages the unique values in the column, we could use the DISTINCT clause right before the column name.

Example

/* Returns 2.02365 */
SELECT AVG(price)
FROM fake_apps;
/* Returns 4.15833.. */
SELECT AVG(DISTINCT price)
FROM fake_apps;

Then, how does the data know what is the distinct value in Price Column? How do they differentiate unique value in that column and label it as a distinct value and then carry forward with the average caculation?

Hello yiwen. I’m not an expert but I hope I can help you.
when you run the following code:

SELECT price
FROM fake_apps;

the return is all the values in the column price. If you apply the AVG on it, you’ll have the total average. Nothing new here.

If you run the following code:

SELECT DISTINCT price
FROM fake_apps;

The return is all the distinct values on the column. Also, nothing new.

and so, the following:

SELECT AVG(DISTINCT price)
FROM fake_apps;

First the distinct will return the distinct ones and then the AVG is calculated based on that return. So the average is not calculated over the rows in price, but over the rows of the new table returned by the distinct.

I believe this would return the same result as SELECT AVG(price). AVG(price) as an aggregate function returns a single quantity/value, therefore there is only one value to apply to DISTINCT so it is unique.

It’s interesting that if the row value is NULL, that row is excluded from the AVG.
So in a rating column, if an app is not rated yet, it would not be treated as a 0. But in a price situation you would need to be sure to enter a price of 0 vs leaving free apps as NULL if you intended to do analytics of this nature.

I’ve been trying to display a list of each unique category, with the average price for apps in that category alone (this is different from average price of all apps in the table). Can you help?

i think that the purpose of the question is to calculate the average value of a criteria, so i can, example calculate the avg download of apps in catgory book:

@tranxuanmanh I think the purpose of the question is actually to identify unique values to average rather than just grouping by a category because in your example if the category books contains multiple books that gets no downloads (downloads = 0) then the multiple values would bring down the average you are calculating because the zeroes are counted multiple times. The grouping is based on the column you are averaging (in this case downloads) and not on anything else.

Admittedly, I think the use for this is quite limited. I think an example of the type of question this is trying to answer is like ‘what is the average number of downloads in the app store if each value of download is counted only once’ rather than just an average of total downloads.

@rinshad I suppose you could use a WHERE statement that somehow specify you only want one distinct value and ignoring any duplicates, but why would you do that when you achieve the same with just a single keyword in your code?