What happens if tables we perform UNION on have duplicate rows?

Question

What happens if the tables we perform the UNION operator on have duplicate rows?

Answer

When you combine tables with UNION, duplicate rows will be excluded.

To explain why this is the case, recall a Venn Diagram, which shows the relations between sets. If we perform UNION on two sets of data (tables), say A and B, then the data returned in the result will essentially be
A + B - (A intersect B)

In the first part,
A + B
will add together all the rows of both tables, including duplicates.

The second part,
- (A intersect B)
will remove every duplicate, which is where A and B intersected.

If, however, you wanted to include duplicates, certain versions of SQL provides the UNION ALL operator.

22 Likes

What happen if we use union function in below mentioned tables (table 1 and 2).

Since name in table 2 and table 1 is same , however all other details are not same. So will union function delete the table 2 value?

Capture

4 Likes

Union of these two tables:

SELECT * FROM "Table1"
UNION
SELECT * FROM "Table2"

will result in:

id		Name	Address
1		A		D
2		B		E
3		C		F
5		A		G

UNION will exclude any duplicated rows. But row with id 5 is unique, no other row has name "A" and address "G".

8 Likes

Does that mean in order to remove duplicate values , union will only remove completely identical rows. Even if one entry of row is not same as some other row, it will be added on.

2 Likes

Yes :slight_smile:

But it’s important to note that the selected row is not always the same thing as the whole row. For example, let’s say that we want to get only the names. In the example above we got 4 results. This is our new query:

SELECT Name FROM "Table1"
UNION
SELECT Name FROM "Table2"

the result is:

Name
A
B
C

Only 3 results, why? Because two rows had an identical value in the column Name - A and because this was the only queried column - those rows were duplicates. It does not matter that those two rows in the database have different values in column id and Address - because these columns were not part of the selection.

23 Likes

What will happen to the primary key(like id ) of both the tables if their primary keys coincide but the values in rest of the columns are different? For example in your example if the id in Table2 was 1 instead of 5 what would have happened?

6 Likes

Primary Keys are unique by definition - they cannot be duplicate :smile: So that column would just be a normal column (like the name column) and not a primary keys column…

1 Like

Hey @factoradic can you give some insights ?