SQL doing joins; redundant ID#


When doing the lesson on inner joins,
I couldn’t get over the uncomfortable fact that the inner join diagram work is different from the actual SQL commands

The diagram in the instruction suggests that

| 1 | A |
| 2 | B |
| 3 | C |

combined with

| A | 2 |
| D | 2 |
| C | 3 |

would result in

| 1 | A | 2 |
| 3 | C | 3 |

but in the case of the problem for the inner joins
it seems to generate

| 1 | A | A | 2 |
| 3 | C | C | 3 |

where the IDs compared through ON keyword after JOIN seems to be repeated
(Question 1) how do I alter the code so that the redundant ID does not appear on the resulting table?

FROM newspaper
JOIN online
ON newspaper.id = online.id;
results in

where you could see that the ‘id’ column is repeated

(Question 2) Or if the ID was to be the last column on newspaper (the table on the leftside)
would the ID column not be repeated?

(Question 3) After overlooking the other exercises a bit, I realize that left ID is newspaper.id while right ID is online.id but is there a purpose in keeping both of them? (if one is not supposed to be removed);

The column you’re using to join the tables are different.
They might look like they’re being repeated, but they are part of different tables.
The example shown below is correct record wise:

However, data wise (especially if you join and SELECT *, ALL of the columns from both tables will be selected.
So if you have these tables:

Columns: TableA_ID, Name, DateAdded.

Columns: TableB_ID, Type, Description.

And you JOIN them like this:

    INNER JOIN TableB B ON B.TableB_ID = A.TableA_ID

The result will include all of the selected columns:

TableA_ID, Name, DateAdded, TableB_ID, Type, Description

If you don’t want to select all of the columns from both tables, you have to write it something like this:

SELECT A.TableA_ID, A.Name, A.DateAdded, B.Type, B.Description
    INNER JOIN TableB B ON B.TableB_ID = A.TableA_ID

Your last question (Question 3) has a very generic answer: It depends.

Usually there won’t be a use for the second ID column (I can’t think of a case where it is needed, I might be wrong though?).

It just takes longer to write out all of the columns you want (instead of just *), especially if you just want to write a quick query to see the result.
On the other hand, it would be beneficial to write out all of the columns you think are relevant if you want to display the results on a dashboard for example, or if it is a query that gets executed a lot over a network, it might save a tiny amount of bandwidth (and data if over the internet) to select only what you need.