Natural join
The natural join operation operates on two relations and produces a relation as a result.
It matches tuples with the same values for all common attributes, and retains only one copy of each common attribute.
Unlike the Cartesian product of two relations, natural join considers only those pairs of tuples with the same value on those attributes that appear on the schema of both relations.
It is an equi-join which occurs implicitly by comparing all the same name columns in both relations.
Join conditions
To prevent equating attributes erroneously, SQL supports the specification of the join condition.
USING
construct
The operation JOIN ... USING
requires a list of attribute names to be specified.
r_i
: RelationA_i
: Attribute
The operation is equivalent to:
It is an equi-join, and causes duplicate attributes to be removed from the resultset.
ON
construct
The ON
condition allows a general predicate over the relaions being joined.
This predicate is written like a WHERE
clause predicate.
r_i
: Relationp
: Predicate The operation is equivalent to:
It is a theta join, and it allows duplicate attributes to appear in the resultset.
Example
- All attributes about all students, along with all the courses that they have took
- Note that the above query has two occurrences for the attribute
ID
, and the relation name of origin must be specified to disambiguate the two.
Comparison of ON
and USING
In a nutshell, ON
can be used for most joins, but USING
is a handy shorthand for the situation where the column names are the same.
The following queries are equivalent.
Outer join
The outer join is an extension of the join operation that avoids loss of information.
It computes the join operation, and then adds tuples from one relation that does not match tuples in the other relation to the result of the join, using NULL
values.
There are three forms of outer join:
-
LEFT OUTER JOIN
Preserves tuples only in the relation named to the left of the
LEFT OUTER JOIN
operation. -
RIGHT OUTER JOIN
Preserves tuples only in the relation named to the right of the
RIGHT OUTER JOIN
operation. -
FULL OUTER JOIN
Preserves tuples in both relations.
In constrast, the join operations that do not preserve nonmatched tuples are called INNER JOIN
operations.
Outer join with ON
construct
For outer join, ON
and WHERE
behave differently. The reason for this is that outer join adds NULL
-padded tuples only for those tuples that do not contribute to the result of the corresponding inner join.
The ON
condition is part of the outer join specification, whilst WHERE
clause is not.
Example
Suppose that the relation student
has a tuple with name Snow
, which has no corresponding tuples in the takes
relation.
The following two queries are not equivalent.
In the latter query, every tuple satisfies the join condition TRUE
, hence the outer join actually generates the Cartesian product of the two relations. Since there are no tuples in takes
with the ID '70557'
, every time a tuple appears in the outer join with the name 'Snow'
, the values for student.ID
and takes.ID
must be different, and such tuples are eliminated by the WHERE
clause predicate. Thus the resultset is empty.