patternsqlMinor
MyISAM Performance: Join Decomposition
Viewed 0 times
joinperformancemyisamdecomposition
Problem
in High Performance MySQL on page 159 they talk about breaking up complex queries into simple ones:
Converting
To
And sort of doing the actual join yourself in your application.
My Question is whether this is stil such a good idea when the final query has a where-clause with a few thousand IDs it needs to match (the actual table itself has about 500k entries).
What I mean is, will there be a big penalty for having a query like
instead of the join-statement above? Would it help to move this logic to Stored Procedures inside the Database (while considering how poorly stored procedures are implemented in MySQL)?
Converting
SELECT * FROM tag
JOIN tag_post ON tag_post.tag_id=tag.id
JOIN post ON tag_post.post_id=post.id
WHERE tag.tag='mysql';To
SELECT * FROM tag WHERE tag='mysql';
SELECT * FROM tag_post WHERE tag_id=1234;
SELECT * FROM post WHERE post.id in (123,456,567,9098,8904);And sort of doing the actual join yourself in your application.
My Question is whether this is stil such a good idea when the final query has a where-clause with a few thousand IDs it needs to match (the actual table itself has about 500k entries).
What I mean is, will there be a big penalty for having a query like
SELECT * FROM post WHERE post.id in (123,456,567, ... ... ,9098,8904);instead of the join-statement above? Would it help to move this logic to Stored Procedures inside the Database (while considering how poorly stored procedures are implemented in MySQL)?
Solution
breaking up complex queries into simple ones
Poppycock. Why do extra effort when MySQL is quite willing to do it for you? As for performance -- there is probably no difference except that the broken up queries require more round trips to the server.
OTOH, there are cases where you can outsmart the optimizer. But your example was not one of those.
IN (thousands-of-ids) is possible, but painful, for the server. It will sort and de-dup them, then leave them in some kind of structure for repeated binary searching. I have seen lots of such queries, but only those over, say, 50K items raised any eyebrows.
There are times when this rewrite helps:
-->
But that is to avoid hauling around extra junk that will be thrown away by the LIMIT.
Poppycock. Why do extra effort when MySQL is quite willing to do it for you? As for performance -- there is probably no difference except that the broken up queries require more round trips to the server.
OTOH, there are cases where you can outsmart the optimizer. But your example was not one of those.
IN (thousands-of-ids) is possible, but painful, for the server. It will sort and de-dup them, then leave them in some kind of structure for repeated binary searching. I have seen lots of such queries, but only those over, say, 50K items raised any eyebrows.
There are times when this rewrite helps:
SELECT ... ORDER BY ... LIMIT ...-->
SELECT b...
FROM tbl b
JOIN ( SELECT id FROM TBL WHERE ... ORDER BY ... LIMIT ... ) a
ON a.id = b.idBut that is to avoid hauling around extra junk that will be thrown away by the LIMIT.
Code Snippets
SELECT ... ORDER BY ... LIMIT ...SELECT b...
FROM tbl b
JOIN ( SELECT id FROM TBL WHERE ... ORDER BY ... LIMIT ... ) a
ON a.id = b.idContext
StackExchange Database Administrators Q#1834, answer score: 3
Revisions (0)
No revisions yet.