Subqueries and Derived tables
SQL subqueries are a powerful feature that allow you to write nested queries within another query. A subquery is a query nested within another query that is used to return data that is used in the main query. Subqueries are used to answer questions about data that is stored in a database, and they are a key component of advanced SQL programming.
Basic Syntax
The basic syntax of a subquery is as follows:
SQLSELECT column_name(s) FROM table_name WHERE operator (SELECT column_name(s) FROM table_name WHERE condition);
The subquery is enclosed in parentheses and is executed first, before the main query. The result of the subquery is then used in the main query.
The intuition behind subqueries is to break down a complex query into smaller, simpler parts. Subqueries allow you to write nested queries within another query to answer more complex questions about the data stored in a database.
Subqueries are often used to calculate intermediate results that are not stored in a single table. These intermediate results can then be used in the main query to answer more complex questions about the data.
Example
For example, let's say you want to find the customers who have made the most purchases in the last year. Suppose you have a table called orders that contains information about orders placed by customers. You want to find the names of customers who have placed more than 2 orders.
Since this information is not stored in a single table, so you would need to use a subquery to calculate this intermediate result. To do this, you can use a subquery as follows:
SQLSELECT first_name, last_name
FROM customers
WHERE (SELECT COUNT(*)
FROM orders
WHERE customers.id = orders.customer_id) > 2;
This query first executes the subquery, which returns the number of orders for each customer. The main query then uses this information to return the names of customers who have placed more than 10 orders.
Derived tables
It's often desirable to use the results of a subquery and join it with other tables or subqueries. This special kind of subquery is known as a derived table. Here's an example of how to rewrite the query from above using a derived table with JOIN instead of a subquery within the WHERE clause:
SQLSELECT first_name, last_name
FROM customers
JOIN (
SELECT id, COUNT(*) as num_orders
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 2
) as orders_per_customer
ON customers.id = orders_per_customer.id;
Note how the the subquery gets named using the AS table_name syntax and then we can reference this temporary table as if it were a normal table throughout the rest of the query. In the next lesson, we'll look at another method for creating temporary tables, known as common table expressions.
Try it yourself
Using our example customers and orders tables, return a list of customers who have placed at least one order. Try solving it with and without a subquery.