How does the MySQL Query Optimizer react to a SELECT COUNT statement?
Problem
Let's say we have a table called "customers" with these columns: "id", "name", "country", "created_at".
There's an index on ("country", "created_at").
If
SELECT COUNT(*)
FROM customers
WHERE ID > 10000
AND country = "US"
ORDER BY country, created_at
is executed, does the Query Optimizer select the same path as if no COUNT were requested, as in
SELECT * FROM ... //same conditions
?
Rephrasing, does the MySQL Query Optimizer take sort orders into account when resolving SELECT COUNT?
Solution
The two queries have a very big difference:
-- query 1
SELECT COUNT(*)
FROM customers
WHERE ID > 10000
AND country = 'US' ;
-- query 2
SELECT *
FROM customers
WHERE ID > 10000
AND country = 'US' ;
While the second query returns all rows that match the WHERE conditions, the first one has an aggregate function (COUNT()) in the SELECT list, so it performs an aggregation: it collapses the rows that match the conditions into one row and returns a single number, the count of matching rows.
So, for the first query, there is no sensible reason to have an ORDER BY. The result is one row only. Moreover, it should produce an error, because the rows that have been collapsed into one may have different values in the country and created_at columns. So which value should be used for the ordering (say, in a case where you had a GROUP BY and the result set had more than one row)?
You can test at SQL-Fiddle that SQL-Server, when you add ORDER BY country, created_at, produces the error:
Column "customers.country" is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause.
An error is produced in Postgres, too.
But even in MySQL, which may allow such non-standard syntax, if you add ORDER BY to the first query, the optimizer is smart enough not to take it into account for the execution plan. There is nothing to order. One row will be returned anyway. You can check that by viewing the execution plans with EXPLAIN. Simple test at SQL-Fiddle: Mysql-test
Oracle (version 11g2) seems to allow such nonsense too. You can see the execution plan here: Oracle-test. Not sure how the plan should be interpreted, but it seems that Oracle at least knows that it's one row only, so the "sorting" operation is not costly.
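As a rough way to repeat the EXPLAIN check described above on your own server, here is a minimal sketch. The column types and the index name idx_country_created are assumptions made up for illustration; only the table name, the column names, and the indexed columns come from the question.

```sql
-- Hypothetical schema: column types and the index name are assumptions.
CREATE TABLE customers (
  id         INT PRIMARY KEY,
  name       VARCHAR(100),
  country    CHAR(2),
  created_at DATETIME
);
CREATE INDEX idx_country_created ON customers (country, created_at);

-- Plan for the plain count.
EXPLAIN
SELECT COUNT(*)
FROM customers
WHERE id > 10000
  AND country = 'US';

-- Plan for the same count with the redundant ORDER BY added.
EXPLAIN
SELECT COUNT(*)
FROM customers
WHERE id > 10000
  AND country = 'US'
ORDER BY country, created_at;
```

If both plans report the same key and neither shows "Using filesort" in the Extra column, the ORDER BY had no effect on the plan. The optimizer may choose differently on an empty or very small table than on production data, and other MySQL versions or stricter sql_mode settings may treat the second statement differently, so it is worth running this against your own setup.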
Code Snippets
-- query 1
SELECT COUNT(*)
FROM customers
WHERE ID > 10000
AND country = 'US' ;
-- query 2
SELECT *
FROM customers
WHERE ID > 10000
AND country = 'US' ;
Context
StackExchange Database Administrators Q#19224, answer score: 3