HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Do any SQL dialects permit the logical sequence of SELECT clauses?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thesqlanydialectspermitsequenceselectlogicalclauses

Problem

SELECT in ISO/IEC-standard SQL prescribes the following syntactical order for the sub-clauses:
SELECT
projection-expressions
FROM
sources
WHERE
predicate-expression
GROUP BY
key-expression
HAVING
predicate-expression
ORDER BY
ordering-expressions


While the logical execution order is this:
FROM
sources
WHERE
predicate-expression
GROUP BY
value-expression
HAVING
value-expression
SELECT
projection-expressions
ORDER BY
ordering-expressions


To novice users of SQL it becomes surprising that a projection defined in the SELECT clause is not available in the WHERE or GROUP BY clauses even though it's declared first - given that computer programs generally follow the top-down execution order.

It's also surprising that SQL authors are required to repeat their expressions in SELECT, WHERE, and GROUP BY clauses redundantly or use a subquery which does not lend itself to a succinct query. At least when the user is familiar with the actual clause execution order they know why they need to repeat themselves, but that doesn't stop it being frustrating.

This problem, and other related issues, is documented in this article I found: https://blog.jooq.org/a-beginners-guide-to-the-true-order-of-sql-operations/ and it's no surprise that a QA on StackOverflow has almost 30,000 views: https://stackoverflow.com/questions/3241352/using-an-alias-column-in-the-where-clause-in-postgresql

It made me wonder if any implementations of SQL allow for this more "logical" ordering of clauses. I note that Linq in .NET does actually follow this order, though I wouldn't describe it as being a true SQL implementation, but indeed, in Linq the equivalent would be:
source // FROM equivalent
.Where( predicate-expression )
.GroupBy( key-expression )
.Where( predicate-expression ) // HAVING equivalent
.Select( projection-expression )
.OrderBy( ordering-expression )


(I also like how Linq lets you add a Select() proj

Solution

implementations of SQL

No, SQL is standardized. There aren't vastly different implementations of a standard. That kind of defeats the purpose. There was/is always a fight over the relational-calculus (which SQL pretends to be) vs relational-algebra approach. It an old debate,

Having selected a model for the data, our next step was to
select a query language. Two families of higher level data sublanguages
for a relational data base are based on, alternatively, the relational
algebra (derived from the algebra of sets) or the relational calculus
[Codd, 197lb , 1971c and Date , 1975] (derived from the predicate calculus). Codd [1971c] and Date [1975] have compared these two and found the relational calculus to be superior, particularly for use as a target language for a - natural language system. The main reason for their choice of the relational calculus as a target language was that the calculus is non-procedural; i.e., a query in the relational calculus conveys little information about how to proceed in searching the data base. The relational algebra is more procedural which makes the automatic construction of a query by a natural language system somewhat more difficult. - Implementation of a Query Language based on the Relational Calculus 1970

And, again from Codd himself,

Early in the development of the relational model (1969-1972), I invented
two languages for dealing with relations: one algebraic in nature, and one
based on first-order predicate logic [Codd 1971a]. I then proved that the
two languages had equal expressive power [Codd 1971d], but indicated that
the logic-based language would be more optimizable (assuming that flow
tracing was not attempted) and easier to use as an interface to inferential software on top of the DBMS. - Codd 1990, The Relational Model For Database Management Version 2

There are certainly a ton of implementations to both approaches. That said, SQL as an implementation is pretty fixed on the calculus approach.
LINQ problems

These problems never really went away with the algebraic aproach, take for instance from this blog,.

context.Cars
  .OrderBy(x => x.Id)
  .Skip(50000)
  .Take(1000)
  .ToList();


Requires hand-optimization as,

context.Cars
    .Where(x => context.Cars
        .OrderBy(y => y.Id)
        .Select(y => y.Id)
        .Skip(50000)
        .Take(1000)
        .Contains(x.Id)
    )
    .ToList();


If you have multiple different relations with different performance characteristics (indexes, most of which are implemented as indirect and abstract) and statistics, it's just hard to write procedural set logic that works efficiently.

Code Snippets

context.Cars
  .OrderBy(x => x.Id)
  .Skip(50000)
  .Take(1000)
  .ToList();
context.Cars
    .Where(x => context.Cars
        .OrderBy(y => y.Id)
        .Select(y => y.Id)
        .Skip(50000)
        .Take(1000)
        .Contains(x.Id)
    )
    .ToList();

Context

StackExchange Database Administrators Q#196292, answer score: 9

Revisions (0)

No revisions yet.