Subquery

Intro

SELECT select_list FROM table1 WHERE columnA operator ( SELECT columnB FROM table2 WHERE condition );

in this syntax here the subquery in parenthesis will execute first.

For example

postgres@localhost:dvdrental> select country_id from country where country = 'United States';
+------------+
| country_id |
|------------|
| 103        |
+------------+

After this, we want to execute

select city from city where country_id = 103;

We can actually merging these 2 queries

select city from city where country_id = (select country_id from country where country = 'United States');

Using in

Since the subqueries can return 1 or more rows, we can use the in operators

SELECT film_id FROM film_category INNER JOIN category USING(category_id) WHERE name = 'Action';

film_id
---------
      19
      21
      29
      38
      56
select title from film where film_id in (select film_id from film_category inner join category using(category_id) where name = 'Action');

Correlated subquery

Correlated subquery are the subqueries that used outer query information. For example:

SELECT film_id, title, length, rating
FROM film f
WHERE length > (
    SELECT AVG(length)
    FROM film
    WHERE rating = f.rating
);

In here, the subquery refers to the rating from outer query (f.rating).

In this case, SQL will execute the outer query first:

SELECT film_id, title, length, rating
FROM film f

For each of the row from this query, it will then execute the subquery

  SELECT AVG(length)
    FROM film
    WHERE rating = f.rating

And then filter out from the original rows to match the condition WHERE length >

As a result, it will caused some performance impact

postgres@localhost:dvdrental> select film_id, title, length, rating from film f where length > (select avg(length) from film where rating = f.rating);
Time: 0.502s

Any

expression operator ANY(subquery)
  • expression is the field you wanna compare
  • Operator is normally =, <, >, <=, >=, and <>
  • subquery is another subquery
postgres@localhost:dvdrental> select * from employees
+----+------------+-----------+----------+
| id | first_name | last_name | salary   |
|----+------------+-----------+----------|
| 1  | Bob        | Williams  | 45000.00 |
| 2  | Charlie    | Davis     | 55000.00 |
| 3  | David      | Jones     | 50000.00 |
| 4  | Emma       | Brown     | 48000.00 |
| 5  | Frank      | Miller    | 52000.00 |
| 6  | Grace      | Wilson    | 49000.00 |
| 7  | Harry      | Taylor    | 53000.00 |
| 8  | Ivy        | Moore     | 47000.00 |
| 9  | Jack       | Anderson  | 56000.00 |
| 10 | Kate       | Hill      | 44000.00 |
| 11 | Liam       | Clark     | 59000.00 |
| 12 | Mia        | Parker    | 42000.00 |
+----+------------+-----------+----------+
SELECT 12
Time: 0.007s
postgres@localhost:dvdrental> select * from managers
+----+------------+-----------+----------+
| id | first_name | last_name | salary   |
|----+------------+-----------+----------|
| 1  | John       | Doe       | 60000.00 |
| 2  | Jane       | Smith     | 55000.00 |
| 3  | Alice      | Johnson   | 58000.00 |
+----+------------+-----------+----------+
SELECT 3
Time: 0.005s

if we want to select an employee whose salary is the same as manager, we can do the following

postgres@localhost:dvdrental> select e.first_name, e.last_name, e.salary from employees e where e.salary = any(select salary from managers);
+------------+-----------+----------+
| first_name | last_name | salary   |
|------------+-----------+----------|
| Charlie    | Davis     | 55000.00 |
+------------+-----------+----------+
SELECT 1
Time: 0.008s
postgres@localhost:dvdrental>

ALL

expression operator ALL(subquery)

Similar to the above, we can do

postgres@localhost:dvdrental>  select e.first_name, e.last_name, e.salary from employees e where e.salary > any(select salary from managers);
+------------+-----------+----------+
| first_name | last_name | salary   |
|------------+-----------+----------|
| Bob        | Williams  | 75000.00 |
| Jack       | Anderson  | 56000.00 |
| Liam       | Clark     | 59000.00 |
+------------+-----------+----------+

Note: the data here is from a different table described above.

Exists

EXISTS (subquery)

This basically check existent of a row in subquery

SELECT
  select_list
FROM
  table1
WHERE
  EXISTS(
    SELECT
      1
    FROM
      table2
    WHERE
      condition
  );

For example

postgres@localhost:dvdrental> select exists (select 1 from payment where amount = 0);
+--------+
| exists |
|--------|
| True   |
+--------+

The select 1 from payment where amount = 0 simply print 1 for every column that has amount = 0.

select exists simply return true if there is one or more rows returned from subquery

In this example, we use exists to find the customer that has payment > 11. For this, we need to compare the p.customer_id = c.customer_id.

select first_name, last_name from customer c where exists( select amount from payment p where p.customer_id = c.customer_id and amount > 11);
+------------+-----------+
| first_name | last_name |
|------------+-----------|
| Karen      | Jackson   |
| Victoria   | Gibson    |
| Vanessa    | Sims      |
| Rosemary   | Schmidt   |
| Tanya      | Gilbert   |
| Nicholas   | Barfield  |
| Kent       | Arsenault |
| Terrance   | Roush     |
+------------+-----------+
SELECT 8
Time: 0.010s

Similarly, we can use the not exists for negate query

Exists with null

Note that exists(null) will rerturn true

postgres@localhost:dvdrental> select exists (select null);
+--------+
| exists |
|--------|
| True   |
+--------+
SELECT 1

So if we do something like this, it will return all the customers since it's always true

select first_name, last_name from customer where exists(select null);