patternsqlMinor
Speeding up a NOT IN subquery
Viewed 0 times
speedingsubquerynot
Problem
Good evening,
I'm currently up to try to improve the performance of some of my queries.
As far as I know, statements like "IN" or "NOT IN" are even faster with a large amount of values if a subquery on an indexed field without conditions is used.
When it comes to use conditions at
I was thinking about using
How may I solve this problem? Currently, I've no clue about that.
The original query looked sth. like this
```
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE si index PRIMARY,stream_2_item,stream_id_found stream_2_item 4 \N 663236 Using temporary; Using filesort
1 SIMPLE ui eq_ref user_id_item_id,user_2_item,user_id_read_saved_hidden,user_id_saved,user_id_hidden user_id_item_id 8 const,si.item_id 1 Using where
1 SIMPLE i eq_ref PRIMARY PRIMARY 4 si.item_id
I'm currently up to try to improve the performance of some of my queries.
As far as I know, statements like "IN" or "NOT IN" are even faster with a large amount of values if a subquery on an indexed field without conditions is used.
SELECT * FROM table1 WHERE field1 NOT IN (SELECT index_field FROM table2)When it comes to use conditions at
table2, the query becomes realy slow on a large amount of data.SELECT *
FROM table1
WHERE
field1 NOT IN (
SELECT
index_field
FROM table2
WHERE
user_id = '2'
)I was thinking about using
LEFT Join but when I need to to the filtering for the user_id, it also becomes slow.How may I solve this problem? Currently, I've no clue about that.
The original query looked sth. like this
SELECT
i.*
FROM stream_item si
LEFT JOIN user__publisher_item ui
ON ui.user_id = 2
&& ui.item_id = si.item_id
INNER JOIN item i
ON i.id = si.item_id
WHERE
si.stream_id IN (5,7,8,9,19,24,29,42,43,44)
&& ui.id IS NULL || (
ui.`read` = 0 || ui.saved = 1
)
GROUP BY
si.item_id
ORDER BY
si.`found` DESC
LIMIT 0,10```
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE si index PRIMARY,stream_2_item,stream_id_found stream_2_item 4 \N 663236 Using temporary; Using filesort
1 SIMPLE ui eq_ref user_id_item_id,user_2_item,user_id_read_saved_hidden,user_id_saved,user_id_hidden user_id_item_id 8 const,si.item_id 1 Using where
1 SIMPLE i eq_ref PRIMARY PRIMARY 4 si.item_id
Solution
In general you can replace a
In contrast, if the DB optimizer isn't smart enough to figure out what your doing and the subquery has a lot of rows, then it must buffer all the rows for each subquery before performing each
Try out something like this:
In this example the optimizer need only check if a single row exists for the matching subquery, including the additional filter on
NOT IN (...) subquery with a NOT EXISTS (...) and the latter will be faster. Logically they are the same though some database optimizers handle the NOT EXISTS better as they can stop the subquery check after the first match.In contrast, if the DB optimizer isn't smart enough to figure out what your doing and the subquery has a lot of rows, then it must buffer all the rows for each subquery before performing each
NOT IN check.Try out something like this:
SELECT *
FROM table1 a
WHERE NOT EXISTS(SELECT 1
FROM table2 b
WHERE b.user_id = 2
AND b.index_field = a.field1)In this example the optimizer need only check if a single row exists for the matching subquery, including the additional filter on
b.user_id = 2. It should use an index on the index_field and user_id if they are available.Code Snippets
SELECT *
FROM table1 a
WHERE NOT EXISTS(SELECT 1
FROM table2 b
WHERE b.user_id = 2
AND b.index_field = a.field1)Context
StackExchange Database Administrators Q#46702, answer score: 3
Revisions (0)
No revisions yet.