HiveBrain v1.2.0
Get Started
← Back to all entries
patternModerate

Are there any fully declarative implementations of SQL

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
implementationsdeclarativearesqlanyfullythere

Problem

I was recently bitten by a bug ( in my code ) where these two queries had drastically different runtimes:

select * from smalltable st
    inner join bigtable bt on st.btid = bt.btid

select * from bigtable bt
    inner join smalltable st on bt.btid = st.btid


As well as the individual queries finding different execution plans for different strings in the where filters.

Are there any optimizing, fully declarative implementations of the SQL standard, such that the two above queries would have equal execution plans?

Solution

Any database with a cost-based optimizer (which includes basically any enterprise-class database) would generally treat those two queries as identical. But no cost-based optimizer is or can be perfect-- they will all occasionally pick a poor plan.

If you're using a cost-based optimizer and the optimizer has reasonably accurate information about the tables and indexes (this information is generally gathered asynchronously and generally gathered in the evening/ morning when other activity is relatively light), it should be relatively clear (for some values of smalltable and bigtable and assuming proper indexes) that the most efficient plan is to do a full table scan of smalltable and then do index lookups on bigtable for those rows using the join condition. If the database thinks that smalltable has 10% of the rows in bigtable rather than 0.1% of the rows in bigtable, on the other hand, there are a lot of different plans that have to be considered. As the number of tables, the number of predicates, and the number of access paths the database supports grows, the potential number of query plans grows exponentially. But the time the optimizer has to find the best plan does not grow-- you're generally going to be upset if the database takes a couple minutes to figure out the best plan before it even starts to execute the query. So cost-based optimizers will never be perfect-- they don't have the time to fully explore every path and they rely on statistics that may be stale in order to figure out which plans are optimal. Most of the time, however, they're quite good at figuring out the best way to join two tables when a query has no predicates and a simple join condition.

Context

StackExchange Database Administrators Q#14709, answer score: 11

Revisions (0)

No revisions yet.