patternsqlModerate
MySQL subquery slows down drastically, but they work fine independently
Viewed 0 times
fineindependentlybutmysqlsubquerydownslowsworktheydrastically
Problem
Query 1:
takes 0.1s
Query 2:
takes 0.2s
Query 3:
takes 22 minutes and 90% its in the "preparing" state. Why does this take so much time.
Table is innodb with 3.2mil rows on MySQL 5.0
select distinct email from mybigtable where account_id=345takes 0.1s
Query 2:
Select count(*) as total from mybigtable where account_id=123 and email IN ()takes 0.2s
Query 3:
Select count(*) as total from mybigtable where account_id=123 and email IN (select distinct email from mybigtable where account_id=345)takes 22 minutes and 90% its in the "preparing" state. Why does this take so much time.
Table is innodb with 3.2mil rows on MySQL 5.0
Solution
In Query 3, you are basically executing a subquery for every row of mybigtable against itself.
To avoid this, you need to make two major changes:
MAJOR CHANGE #1 : Refactor the Query
Here is your original query
You could try
or maybe the count per email
MAJOR CHANGE #2 : Proper Indexing
I think you have this already since Query 1 and Query 2 run fast. Make sure you have a compound index on (account_id,email). Do
UPDATE 2012-03-07 13:26 EST
If you want to do a NOT IN(), change the
UPDATE 2012-03-07 14:13 EST
Please read these two links on doing JOINs
Here is a great YouTube Video where I learned to refactor queries and the book it was based on
To avoid this, you need to make two major changes:
MAJOR CHANGE #1 : Refactor the Query
Here is your original query
Select count(*) as total from mybigtable
where account_id=123 and email IN
(select distinct email from mybigtable where account_id=345)You could try
select count(*) EmailCount from
(
select tbl123.email from
(select email from mybigtable where account_id=123) tbl123
INNER JOIN
(select distinct email from mybigtable where account_id=345) tbl345
using (email)
) A;or maybe the count per email
select email,count(*) EmailCount from
(
select tbl123.email from
(select email from mybigtable where account_id=123) tbl123
INNER JOIN
(select distinct email from mybigtable where account_id=345) tbl345
using (email)
) A group by email;MAJOR CHANGE #2 : Proper Indexing
I think you have this already since Query 1 and Query 2 run fast. Make sure you have a compound index on (account_id,email). Do
SHOW CREATE TABLE mybigtable\G and make sure you have one. If you don't have it or if you are not sure, then create the index anyway:ALTER TABLE mybigtable ADD INDEX account_id_email_ndx (account_id,email);UPDATE 2012-03-07 13:26 EST
If you want to do a NOT IN(), change the
INNER JOIN to a LEFT JOIN and check for the right side being NULL, like this:select count(*) EmailCount from
(
select tbl123.email from
(select email from mybigtable where account_id=123) tbl123
LEFT JOIN
(select distinct email from mybigtable where account_id=345) tbl345
using (email)
WHERE tbl345.email IS NULL
) A;UPDATE 2012-03-07 14:13 EST
Please read these two links on doing JOINs
- http://en.wikipedia.org/wiki/Sql_join
- http://dev.mysql.com/doc/refman/5.5/en/join.html
Here is a great YouTube Video where I learned to refactor queries and the book it was based on
- http://youtu.be/ZVisY-fEoMw
- http://www.amazon.com/Refactoring-SQL-Applications-Stephane-Faroult/dp/B006TR0ZEO
Code Snippets
Select count(*) as total from mybigtable
where account_id=123 and email IN
(select distinct email from mybigtable where account_id=345)select count(*) EmailCount from
(
select tbl123.email from
(select email from mybigtable where account_id=123) tbl123
INNER JOIN
(select distinct email from mybigtable where account_id=345) tbl345
using (email)
) A;select email,count(*) EmailCount from
(
select tbl123.email from
(select email from mybigtable where account_id=123) tbl123
INNER JOIN
(select distinct email from mybigtable where account_id=345) tbl345
using (email)
) A group by email;ALTER TABLE mybigtable ADD INDEX account_id_email_ndx (account_id,email);select count(*) EmailCount from
(
select tbl123.email from
(select email from mybigtable where account_id=123) tbl123
LEFT JOIN
(select distinct email from mybigtable where account_id=345) tbl345
using (email)
WHERE tbl345.email IS NULL
) A;Context
StackExchange Database Administrators Q#14565, answer score: 11
Revisions (0)
No revisions yet.