
MySQL DELETE statement doesn't use index although the same SELECT query does

Submitted by: @import:stackexchange-dba

Problem

I've got a table with ~30 million rows (soon to be two or three times that) that I have to update quite regularly. The table structure looks like this:

id, 
cookie_id VARCHAR(45), 
country VARCHAR(45), 
category VARCHAR(45), 
other_non_relevant_columns


Indexes look like this:

SHOW INDEX FROM data;
+-------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name               | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| data  |          0 | PRIMARY                |            1 | id          | A         |    24767570 |     NULL | NULL   |      | BTREE      |         |               |
| data  |          1 | cookie_index           |            1 | cookie_id   | A         |    14440214 |     NULL | NULL   |      | BTREE      |         |               |
| data  |          1 | country_category_index |            1 | country     | A         |         498 |     NULL | NULL   |      | BTREE      |         |               |
| data  |          1 | country_category_index |            2 | category    | A         |         997 |     NULL | NULL   | YES  | BTREE      |         |               |
+-------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)


So there's a non-unique index on cookie_id and a non-unique index on the country + category columns. Every week I need to run queries to:

  • Delete all data belonging to country='Y' AND category='X' (5 to 20 million rows)
  • Import fresh data (similar amount)

The problem is, deleting the d

Solution

I suspect that if you run "EXPLAIN SELECT *" instead of "SELECT id, cookie_id", the server will also prefer a table scan, because an execution plan with an index seek would require millions of key lookups to fetch the remaining columns. The same consideration applies to the DELETE statement, which has to locate and remove every full row. So a DELETE with a table scan should be the fastest non-partitioned solution. If you want to shorten the periods during which locks are held, you can delete in batches, as suggested in @anisakras's answer.
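
To check this on your own server, something like the following sketch may help. It first compares the plans for the narrow SELECT, the SELECT *, and the DELETE, then deletes in batches. The procedure name purge_country_category and the batch size of 10000 are assumptions made for illustration, not taken from the question or from the other answer. The sketch also assumes the mysql command-line client (DELIMITER is a client command), autocommit enabled so that each batch commits on its own, and MySQL 5.6 or later for EXPLAIN on DELETE.

```sql
-- Compare the plans the optimizer picks for each statement.
EXPLAIN SELECT id, cookie_id FROM data WHERE country = 'Y' AND category = 'X';
EXPLAIN SELECT * FROM data WHERE country = 'Y' AND category = 'X';
EXPLAIN DELETE FROM data WHERE country = 'Y' AND category = 'X';

-- Hypothetical helper: delete the matching rows in batches so that
-- each DELETE holds its locks only briefly.
DELIMITER //
CREATE PROCEDURE purge_country_category()
BEGIN
  DECLARE affected INT DEFAULT 1;
  -- Stop once a batch deletes nothing.
  WHILE affected > 0 DO
    DELETE FROM data
    WHERE country = 'Y' AND category = 'X'
    LIMIT 10000;              -- assumed batch size, tune for your workload
    SET affected = ROW_COUNT();
  END WHILE;
END //
DELIMITER ;

CALL purge_country_category();
```

Batching trades total run time for shorter lock periods, so the whole purge will likely take longer than a single DELETE. Note that the optimizer may choose a different plan for a DELETE with LIMIT, so it is worth running EXPLAIN on the batched statement as well.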

Context

StackExchange Database Administrators Q#254026, answer score: 2
