Join Tables

Inner join (Join)

Pasted image 20250824210308.png

The inner join compares all the value from the t1. With the value from t2. If matches the condition then it will add the row to the result

postgres@localhost:dvdrental> select * from basket_a;
+---+----------+
| a | fruit_a  |
|---+----------|
| 1 | apple    |
| 2 | orange   |
| 3 | banana   |
| 4 | cucumber |
+---+----------+
SELECT 4
Time: 0.006s
postgres@localhost:dvdrental> select * from basket_b;
+---+------------+
| b | fruit_b    |
|---+------------|
| 1 | orange     |
| 2 | apple      |
| 3 | watermelon |
| 4 | pear       |
+---+------------+
SELECT 4
Time: 0.006s
postgres@localhost:dvdrental> select a, fruit_a, b, fruit_b from basket_a inner join basket_b on fruit_a = fruit_b;
+---+---------+---+---------+
| a | fruit_a | b | fruit_b |
|---+---------+---+---------|
| 1 | apple   | 2 | apple   |
| 2 | orange  | 1 | orange  |
+---+---------+---+---------+
SELECT 2

Join multiple table

postgres@localhost:dvdrental> select c.customer_id, c.first_name || ' ' || c.last_name customer_name, s.first_name || ' ' || s.last_name staff_name, p.amount, p.payment_date from customer c join payment p using(customer_id) join staff s using(staff_id) limit 5;
+-------------+---------------+--------------+--------+----------------------------+
| customer_id | customer_name | staff_name   | amount | payment_date               |
|-------------+---------------+--------------+--------+----------------------------|
| 341         | Peter Menard  | Jon Stephens | 7.99   | 2007-02-15 22:25:46.996577 |
| 341         | Peter Menard  | Mike Hillyer | 1.99   | 2007-02-16 17:23:14.996577 |
| 341         | Peter Menard  | Mike Hillyer | 7.99   | 2007-02-16 22:41:45.996577 |
| 341         | Peter Menard  | Jon Stephens | 2.99   | 2007-02-19 19:39:56.996577 |
| 341         | Peter Menard  | Jon Stephens | 7.99   | 2007-02-20 17:31:48.996577 |
+-------------+---------------+--------------+--------+----------------------------+
SELECT 5
Time: 0.009s

Note that in this case the join order maters to link the tables together:

  • We need to join payment p using(customer_id) first because customer table and payment table shares customer_id
  • And then we join staff s using(staff_id) since payment table has staff_id with staff table

Left join (left outer join)

Left join contains all the rows of table on the left (the from table). If the value does not represent on the right (the join table), it will display as null

postgres@localhost:dvdrental> select a, fruit_a, b, fruit_b from basket_a left join basket_b on fruit_a = fruit_b;
+---+----------+--------+---------+
| a | fruit_a  | b      | fruit_b |
|---+----------+--------+---------|
| 1 | apple    | 2      | apple   |
| 3 | banana   | <null> | <null>  |
| 4 | cucumber | <null> | <null>  |
| 2 | orange   | 1      | orange  |
+---+----------+--------+---------+
SELECT 4
Time: 0.007s

Right join (right outer join)

Similarly to left join but reverse. Contains all the rows of the table on the right. All the rows from the right (the join table will be retain)

postgres@localhost:dvdrental> select a, fruit_a, b, fruit_b from basket_a right join basket_b on fruit_a = fruit_b;
+--------+---------+---+------------+
| a      | fruit_a | b | fruit_b    |
|--------+---------+---+------------|
| 1      | apple   | 2 | apple      |
| 2      | orange  | 1 | orange     |
| <null> | <null>  | 4 | pear       |
| <null> | <null>  | 3 | watermelon |
+--------+---------+---+------------+
SELECT 4
Time: 0.007s

Full join (full outer join)

Contains all rows of both table and mark as null for those that cannot match

postgres@localhost:dvdrental> select a, fruit_a, b, fruit_b from basket_a full outer join basket_b on fruit_a = fruit_b;
+--------+----------+--------+------------+
| a      | fruit_a  | b      | fruit_b    |
|--------+----------+--------+------------|
| 1      | apple    | 2      | apple      |
| 3      | banana   | <null> | <null>     |
| 4      | cucumber | <null> | <null>     |
| 2      | orange   | 1      | orange     |
| <null> | <null>   | 4      | pear       |
| <null> | <null>   | 3      | watermelon |
+--------+----------+--------+------------+
SELECT 6
Time: 0.008s

Pasted image 20250824212005.png

Table alias

Table can be alias like column. The thing is table alias can be refer to later on in the query whereas column cannot

select f.title from film as f order by f.title limit 5

or we can omit the as

select f.title from film f order by f.title limit 5

When we join multiple table with the same column name. We need to refer them with alias otherwise it will throw the error like f.title in the above. With 1 table by itself it's fine to refer or not

postgres@localhost:dvdrental> select c.customer_id, c.first_name, p.amount, p.payment_date from customer c join payment p on c.customer_id = p.customer_id limit 5;
+-------------+------------+--------+----------------------------+
| customer_id | first_name | amount | payment_date               |
|-------------+------------+--------+----------------------------|
| 341         | Peter      | 7.99   | 2007-02-15 22:25:46.996577 |
| 341         | Peter      | 1.99   | 2007-02-16 17:23:14.996577 |
| 341         | Peter      | 7.99   | 2007-02-16 22:41:45.996577 |
| 341         | Peter      | 2.99   | 2007-02-19 19:39:56.996577 |
| 341         | Peter      | 7.99   | 2007-02-20 17:31:48.996577 |
+-------------+------------+--------+----------------------------+

Self join

It's also necessary to use alias if you want to do a table compare to itself (self join). For example list all the film title which has the same length

postgres@localhost:dvdrental> select f1.title, f2.title, length(f1.title) length from film f1 inner join film f2 on length(f1.title) = length(f2.title) and f
 1.title != f2.title order by length(f1.title) desc limit 5;
+-------------------------+-------------------------+--------+
| title                   | title                   | length |
|-------------------------+-------------------------+--------|
| Desperate Trainspotting | Entrapment Satisfaction | 23     |
| Desperate Trainspotting | Extraordinary Conquerer | 23     |
| Desperate Trainspotting | Trainspotting Strangers | 23     |
| Entrapment Satisfaction | Desperate Trainspotting | 23     |
| Entrapment Satisfaction | Extraordinary Conquerer | 23     |
+-------------------------+-------------------------+--------+
SELECT 5
Time: 0.009s

Using

When the column is the same in 2 table (for example, both table_a and table_b have customer_id) and we need to join using table_a.customer_id = table.customer_id, we can use using(customer_id) to make it cleaner. Functionality wise it's the same

postgres@localhost:dvdrental> select c.first_name, p.amount from customer c inner join payment p using(customer_id) limit 10;
+------------+--------+
| first_name | amount |
|------------+--------|
| Peter      | 7.99   |
| Peter      | 1.99   |
| Peter      | 7.99   |
| Peter      | 2.99   |
| Peter      | 7.99   |
| Peter      | 5.99   |
| Harold     | 5.99   |
| Harold     | 5.99   |
| Harold     | 2.99   |
| Douglas    | 4.99   |
+------------+--------+
SELECT 10
Time: 0.010s

Cross join

cross join does not have the on or any condition statement like other join. It basically trying to combine all the element between 2 tables. For example

postgres@localhost:dvdrental> select * from T1;
+-------+
| label |
|-------|
| A     |
| B     |
+-------+
SELECT 2
Time: 0.005s
postgres@localhost:dvdrental> select * from T2;
+-------+
| score |
|-------|
| 1     |
| 2     |
| 3     |
+-------+
SELECT 3
Time: 0.006s
postgres@localhost:dvdrental> select * from T1 cross join t2;
+-------+-------+
| label | score |
|-------+-------|
| A     | 1     |
| B     | 1     |
| A     | 2     |
| B     | 2     |
| A     | 3     |
| B     | 3     |
+-------+-------+
SELECT 6
Time: 0.006s

Natural join

Similar to cross join, natural join takes no condition either. It will automatically join 2 columns based on the similar column name. For example

postgres@localhost:dvdrental> select * from categories;
+-------------+---------------+
| category_id | category_name |
|-------------+---------------|
| 1           | Smartphone    |
| 2           | Laptop        |
| 3           | Tablet        |
| 4           | VR            |
+-------------+---------------+
SELECT 4
Time: 0.006s
postgres@localhost:dvdrental> select * from products;
+------------+-----------------+-------------+
| product_id | product_name    | category_id |
|------------+-----------------+-------------|
| 1          | iPhone          | 1           |
| 2          | Samsung Galaxy  | 1           |
| 3          | HP Elite        | 2           |
| 4          | Lenovo Thinkpad | 2           |
| 5          | iPad            | 3           |
| 6          | Kindle Fire     | 3           |
+------------+-----------------+-------------+
SELECT 6
Time: 0.006s
postgres@localhost:dvdrental> select * from products natural join categories;
+-------------+------------+-----------------+---------------+
| category_id | product_id | product_name    | category_name |
|-------------+------------+-----------------+---------------|
| 1           | 1          | iPhone          | Smartphone    |
| 1           | 2          | Samsung Galaxy  | Smartphone    |
| 2           | 3          | HP Elite        | Laptop        |
| 2           | 4          | Lenovo Thinkpad | Laptop        |
| 3           | 5          | iPad            | Tablet        |
| 3           | 6          | Kindle Fire     | Tablet        |
+-------------+------------+-----------------+---------------+
SELECT 6
Time: 0.006s

By default nautural join will perform a inner join if you don't specify anything. This is the same as

select * from products inner join category using(category_id)

We can do

SELECT select_list FROM table1 NATURAL [INNER, LEFT, RIGHT] JOIN table2;

[!danger]
Be careful when using natural join, sometimes if there is an extra same column (for example if there is a last_update column) of 2 different tables, natural join will try to use that column as well and might cause unexpected behavior