patternsqlMinor
Repeating the same function in a query
Viewed 0 times
samethequeryrepeatingfunction
Problem
In the below query, there are repeated calculations such as the three calls to
It seems like it would be faster after the first calculation to get the next one by the alias name,
This query runs on the MySQL Sakila sample database.
SUM(p.amount). Does MySQL re-calculate for each function call or is there some kind of memoization optimization under the hood? If not, how can this kind of query be optimized for maximum performance?It seems like it would be faster after the first calculation to get the next one by the alias name,
total_payemnts, but that just throws an error.SELECT LEFT(c.last_name, 1) AS 'last_names',
SUM(p.amount) AS 'total_payments',
COUNT(p.rental_id) AS 'num_rentals',
SUM(p.amount) / COUNT(p.rental_id) AS 'avg_pay'
FROM customer c
JOIN payment p ON p.customer_id = c.customer_id
GROUP BY LEFT(c.last_name, 1)
ORDER BY SUM(p.amount) DESC;This query runs on the MySQL Sakila sample database.
Solution
tl;dr: yes, there is sort of a memorization.
Anything I could test suggested that your
In source code file
https://github.com/mysql/mysql-server/blob/71f48ab393bce80a59e5a2e498cd1f46f6b43f9a/sql/item_sum.cc
My reading of this is that among the many things that gets done when you write a query, the parser has to break apart your statement. We're looking only at the
I don't claim to be a MySQL internals expert, so I could be wrong on this, but seeing this code comment was enough to convince me that the design isn't naive: MySQL doesn't simply read your query from left to right and execute everything it encounters.
Side note, to optimizing the performance of this query, don't look to the
Anything I could test suggested that your
SUM() function is NOT calculated twice, so your original query was already optimal. That said, this is deep internal behaviour, so not something you'd likely find in the MySQL docs. The the way to know for sure is to actually look under the hood in the source code for MySQL to see how it actually works. Fortunately the code is pretty well commented, so you don't need to be fluent in C. In source code file
sql/item_sum.cc, there's a comment that says this:Don't do anything if
1) this is an unresolved item (This may happen if an
expression occurs twice in the same query. In that case, the
whole item tree for the second occurence is replaced by the
item tree for the first occurence, without calling fix_fields()
on the second tree. Therefore there's nothing to clean up.)https://github.com/mysql/mysql-server/blob/71f48ab393bce80a59e5a2e498cd1f46f6b43f9a/sql/item_sum.cc
My reading of this is that among the many things that gets done when you write a query, the parser has to break apart your statement. We're looking only at the
SELECT part of the query, though there's just as complex stuff that happens to break down any JOINs, GROUPs, ORDERs. One of the thing that is done to the SELECT fields is that the expressions that make them up are loaded into into an "Item tree", which is a memory structure. The server then has to "resolve" those items, which includes things like checking whether the object exists and whether you have privileges to access it. If the same expression is seen twice, per the code comment, the "whole item tree for the second occurrence is replaced by the item tree for the first occurrence".I don't claim to be a MySQL internals expert, so I could be wrong on this, but seeing this code comment was enough to convince me that the design isn't naive: MySQL doesn't simply read your query from left to right and execute everything it encounters.
Side note, to optimizing the performance of this query, don't look to the
SUM(), like @Rick said, you won't find the slowness there. Look instead to potential indexes.Code Snippets
Don't do anything if
1) this is an unresolved item (This may happen if an
expression occurs twice in the same query. In that case, the
whole item tree for the second occurence is replaced by the
item tree for the first occurence, without calling fix_fields()
on the second tree. Therefore there's nothing to clean up.)Context
StackExchange Database Administrators Q#155059, answer score: 3
Revisions (0)
No revisions yet.