SQL Joins
SQL Joins are used to combine data from multiple tables in a database. The combinations are based on a common field beween the tables. The common field is called a join key. The join key is a field that exists in both the tables. The join key is used to match rows from one table with rows from another table. The join key is also called a join column or a join field.
For the demonstration, we will use the following tables - books and authors.
books table
id | title | author_id |
---|---|---|
1 | The Alchemist | 2 |
2 | A Game of Thrones | 5 |
3 | The Tempest | 1 |
4 | The Fault in Our Stars | 3 |
5 | A Dance with Dragons | 5 |
6 | The Anthropocene Reviewed | 3 |
authors table
author_id | name |
---|---|
1 | William Shakespeare |
2 | Paulo Coelho |
3 | John Green |
4 | J. K. Rowling |
5 | George R. R. Martin |
Types of SQL Joins
There are four types of SQL Joins:
- Inner Join
- Left Join
- Right Join
- Full Join
Inner Join
The inner join is the most common type of join. The inner join returns only the rows that have matching values in both the tables. The inner join is also called an equi-join. The inner join is the default join in SQL. If you do not specify a join type, the inner join is used. The inner join is represented by the keyword INNER JOIN and the symbol ⋈(U+22C8).
The following SQL statement returns the books and the authors who wrote them.
SELECT books.title, authors.name FROM books INNER JOIN authors ON books.author_id = authors.author_id;
The ON books.author_id = authors.author_id
clause is the join condition. The clause can be replaced with the USING
clause - USING (author_id)
. The USING
clause is used when the two tables have the same column name.
The result of the above query is:
title | name |
---|---|
The Alchemist | Paulo Coelho |
A Game of Thrones | George R. R. Martin |
The Tempest | William Shakespeare |
The Fault in Our Stars | John Green |
A Dance with Dragons | George R. R. Martin |
The Anthropocene Reviewed | John Green |
The total number of rows in the result is 6.
If you don't specify the join condition, the inner join returns the Cartesian product of the two tables. The Cartesian product is the result of the multiplication of the number of rows in the first table with the number of rows in the second table. The result of the Cartesian product is a huge table with all the possible combinations of the rows from the two tables.
In the above example, the books table has 6 rows and the authors table has 5 rows. The Cartesian product of the two tables is 30 rows. The following SQL statement returns the Cartesian product of the two tables.
SELECT books.title, authors.name FROM books INNER JOIN authors;
Left Join
The left join returns all the rows from the left table and the matching rows from the right table. The left join is represented by the keyword LEFT JOIN and the symbol ⟕(U+27D5). The left join is also called a left outer join or a left equi-join.
The following SQL statement returns all the books and the authors who wrote them.
SELECT books.title, authors.name FROM books LEFT JOIN authors ON books.author_id = authors.author_id;
The result of the above query is:
title | name |
---|---|
The Alchemist | Paulo Coelho |
A Game of Thrones | George R. R. Martin |
The Tempest | William Shakespeare |
The Fault in Our Stars | John Green |
A Dance with Dragons | George R. R. Martin |
The Anthropocene Reviewed | John Green |
The total number of rows in the result is 6 (number of rows in the books table).
The left join returns all the rows from the left table - books. The left join returns the rows from the right table - authors only if the join condition is true. The left join returns NULL
for the rows from the right table if the join condition is false.
Right Join
The right join returns all the rows from the right table and the matching rows from the left table. The right join is represented by the keyword RIGHT JOIN and the symbol ⟖(U+27D6). The right join is also called a right outer join or a right equi-join.
The following SQL statement returns all the books and the authors who wrote them.
SELECT books.title, authors.name FROM books RIGHT JOIN authors ON books.author_id = authors.author_id;
The result of the above query is:
title | name |
---|---|
The Tempest | William Shakespeare |
The Alchemist | Paulo Coelho |
The Fault in Our Stars | John Green |
The Anthropocene Reviewed | John Green |
NULL | J. K. Rowling |
A Game of Thrones | George R. R. Martin |
A Dance with Dragons | George R. R. Martin |
The total number of rows in the result is 7 (number of rows in the authors table).
The right join returns all the rows from the right table - authors. The right join returns the rows from the left table - books only if the join condition is true. The right join returns NULL
for the rows from the left table if the join condition is false.
Here, the author J. K. Rowling has no books in the books table. The right join returns NULL
for the title column for the author J. K. Rowling.
Full Join
The full join returns all the rows from both the tables. The full join is represented by the keyword FULL JOIN and the symbol ⟗(U+27D7). The full join is also called a full outer join or a full equi-join.
The following SQL statement returns all the books and the authors who wrote them.
SELECT books.title, authors.name FROM books FULL JOIN authors ON books.author_id = authors.author_id;
The result of the above query is:
title | name |
---|---|
The Tempest | William Shakespeare |
The Alchemist | Paulo Coelho |
The Fault in Our Stars | John Green |
The Anthropocene Reviewed | John Green |
NULL | J. K. Rowling |
A Game of Thrones | George R. R. Martin |
A Dance with Dragons | George R. R. Martin |
The total number of rows in the result is 7 (number of rows in the books table) + 1 (number of extra rows in the authors table) = 8.
The full join returns all the rows from both the tables. The full join returns NULL
for the rows from the left table if the join condition is false. Similarly, the full join returns NULL
for the rows from the right table if the join condition is false.
Condition Claus in SQL Joins
The condition clause is used to specify the join condition. The condition clause can be specified using the ON
clause or the USING
clause.
The ON
clause is used to specify the join condition when the two tables have different column names. The USING
clause is used to specify the join condition when the two tables have the same column name.
The following SQL statement returns the books written by the author George R. R. Martin.
SELECT books.title, authors.name FROM books INNER JOIN authors ON books.author_id = authors.author_id WHERE authors.name = 'George R. R. Martin';
The result of the above query is:
title | name |
---|---|
A Game of Thrones | George R. R. Martin |
A Dance with Dragons | George R. R. Martin |
The total number of rows in the result is 2.
References
- Cartesian product. Wikipedia. From https://en.wikipedia.org/wiki/Cartesian_product
- Relational algebra. Wikipedia. From https://en.wikipedia.org/wiki/Relational_algebra
- SQL JOIN. Dofactory. From https://www.dofactory.com/sql/join
- SQL Joins. W3Schools. From https://www.w3schools.com/sql/sql_join.asp