HiveBrain v1.2.0
Get Started
← Back to all entries
patternModerate

Do subqueries add expressive power to SQL queries?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
sqlpowerexpressivequeriessubqueriesadd

Problem

Does SQL need subqueries?

Imagine a sufficiently generalized implementation of the structured query language for relation databases. Since the structure of the canonical SQL SELECT statement is actually pretty important for this to make sense, I don't appeal directly to relational algebra, but you could frame this in those terms by making appropriate restrictions on the form of expressions.

An SQL SELECT query generally consists of a projection (the SELECT part) some number of JOIN operations (the JOIN part), some number of SELECTION operations (in SQL, the WHERE clauses), and then set-wise operations (UNION, EXCEPT, INTERSECT, etc.), followed by another SQL SELECT query.

Tables being joined can be the computed results of expressions; in other words, we can have a statement such as:

SELECT t1.name, t2.address
  FROM table1 AS t1 
  JOIN (SELECT id, address 
          FROM table2 AS t3 
         WHERE t3.id = t1.id) AS t2
 WHERE t1.salary > 50,000;


We will refer to the use of a computed table as part of an SQL query as a subquery. In the example above, the second (indented) SELECT is a subquery.

Can all SQL queries be written in such a way as to not use subqueries? The example above can:

SELECT t1.name, t2.address
  FROM table1 AS t1 
  JOIN table2 AS t2
    ON t1.id = t2.id
 WHERE t1.salary > 50,000;


This example is somewhat spurious, or trivial, but one can imagine instances where considerably more effort might be required to recover an equivalent expression. In other words, is it the case that for every SQL query $q$ with subqueries, there exists a query $q'$ without subqueries such that $q$ and $q'$ are guaranteed to produce the same results for the same underlying tables? Let us limit SQL queries to the following form:

SELECT ,
      ...,
      
 FROM 
 JOIN 
  ...
 JOIN 
WHERE 
  AND 
  ...
  AND 

UNION
 -or-
EXCEPT
 -or-

SELECT ...


And so on. I think left and right outer joins don't add much, but if

Solution

To translate your statement into the relational algebra, I think it asks:


Can we rewrite $\sigma_A(A)\bowtie \sigma_B(B)$ as $\sigma_A(\sigma_B(A\bowtie B))$?

(Where $\sigma$ is select and $\bowtie$ is join.)

The answer is "Yes," and it's a standard query optimization. To be honest, I'm not sure how to prove this in a non-question-begging manner - it's just a property of selection and join. You can argue inductively to add however many layers of nested queries you want.

Additionally, you might ask:


Can every sequence of joins be written as $A\bowtie B\bowtie C\bowtie\dots$ [As opposed to, say, $(A\bowtie B)\bowtie(C\bowtie D)$]?

Again the answer is yes, because join is associative. Similar statements can be made about projection as well.

One notable type of "subquery" which I think cannot be "flattened out" is with. One way to see this is to note that if you have a with statement then you can have a recursive function, which cannot be written without using subqueries.

So to sum up: in the specific case you mentioned, no, SQL does not need subqueries, and you can prove it inductively. In general though, there are features which require subqueries.

Context

StackExchange Computer Science Q#129, answer score: 16

Revisions (0)

No revisions yet.