patternsqlMinor
DELETE ... IN hangs after deleting half of a large table (even for deleting 10 rows)
Viewed 0 times
afterrowsdeletingdeletelargeforevenhangshalftable
Problem
I'm having trouble with a DELETE hanging, even for a small number of rows (10) in a large MariaDB InnoDB table.
Background
I had a large table (~400,000,000 rows) that needed to have rows archived and removed. After removing about half of the rows, my archive process started to hang on the DELETE statement. I took the time to run the (painfully slow) 11-hour OPTIMIZE (ALTER for InnoDB) query to reclaim disk space and hopefully unblock the DELETE process. This worked until I deleted about half of the remaining rows (down to about 106m out of 200m rows). I figure I could keep running OPTIMIZE on my giant table to let me remove another 1/2 of the rows, but there must be a better way...
Note that the table does get a lot of INSERT/UPDATE traffic, 1000s of changes per minute.
Archive Process
To simplify the archive script, I put the whole process into a stored procedure. The steps in that procedure are:
Store these in a temporary table A
-
Filter temp table A to only include row IDs that are actually in the live DB.
Store these in a temporary table B
-
DELETE FROM live_db.test_table WHERE test_id IN (SELECT * FROM temp_table_B) ORDER BY test_id ASC LIMIT N;
Now that the table is half empty, step 4 is hanging, even for N = 10
Before the table was half-empty, it worked fine and N = 1000 took ~ 1 sec
Diagnostics
Adding an EXPLAIN in the stored procedure lets me know MySQL is planning on using the PRIMARY index for the DELETE, as I hoped it would:
Output selected from SHOW ENG
Background
I had a large table (~400,000,000 rows) that needed to have rows archived and removed. After removing about half of the rows, my archive process started to hang on the DELETE statement. I took the time to run the (painfully slow) 11-hour OPTIMIZE (ALTER for InnoDB) query to reclaim disk space and hopefully unblock the DELETE process. This worked until I deleted about half of the remaining rows (down to about 106m out of 200m rows). I figure I could keep running OPTIMIZE on my giant table to let me remove another 1/2 of the rows, but there must be a better way...
Note that the table does get a lot of INSERT/UPDATE traffic, 1000s of changes per minute.
Archive Process
To simplify the archive script, I put the whole process into a stored procedure. The steps in that procedure are:
- Flatten and copy up to N rows from the live DB to the archive DB, where the rows are older than 1 month
- Make a list of archived IDs that are probably in the live DB.
Store these in a temporary table A
-
Filter temp table A to only include row IDs that are actually in the live DB.
Store these in a temporary table B
-
DELETE FROM live_db.test_table WHERE test_id IN (SELECT * FROM temp_table_B) ORDER BY test_id ASC LIMIT N;
Now that the table is half empty, step 4 is hanging, even for N = 10
Before the table was half-empty, it worked fine and N = 1000 took ~ 1 sec
Diagnostics
Adding an EXPLAIN in the stored procedure lets me know MySQL is planning on using the PRIMARY index for the DELETE, as I hoped it would:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY test_runs index NULL PRIMARY 4 NULL 10 Using where
8 DEPENDENT SUBQUERY tests_to_drop ALL NULL NULL NULL NULL 1 Using whereOutput selected from SHOW ENG
Solution
I found out what's going on here:
Because of how I was choosing test_id values in steps 1 and 2, temp_table B would sometimes be empty. (see question description for step/table context)
Then, with an empty temp_table_B, I was running step 4:
which gave
which would hang for quite a while.
Based on this answer: https://dba.stackexchange.com/a/86103/144766 , it seems like that empty table was probably compared against each of my 106 million rows, making the do-nothing DELETE take 25 minutes.
Now that I have modified my process to detect the empty table and abort before calling DELETE, I can manually watch for the (data-specific) condition that's causing my list-of-rows-to-delete to occasionally become empty.
EDIT: Even after fixing the empty-in-list problem, my query was greatly sped up by switching from the " IN (SELECT * temp_table) " version to a JOIN version, similar to what is suggested in comments by ypercube above. This modification eliminates the 'SUBQUERY' from the execution plan and the query becomes a 'SIMPLE' query:
Because of how I was choosing test_id values in steps 1 and 2, temp_table B would sometimes be empty. (see question description for step/table context)
Then, with an empty temp_table_B, I was running step 4:
DELETE FROM live_db.test_table WHERE test_id
IN (SELECT * FROM temp_table_B) ORDER BY test_id ASC LIMIT N;which gave
DELETE FROM live_db.test_table WHERE test_id
IN ( ) ORDER BY test_id ASC LIMIT N;which would hang for quite a while.
Based on this answer: https://dba.stackexchange.com/a/86103/144766 , it seems like that empty table was probably compared against each of my 106 million rows, making the do-nothing DELETE take 25 minutes.
Now that I have modified my process to detect the empty table and abort before calling DELETE, I can manually watch for the (data-specific) condition that's causing my list-of-rows-to-delete to occasionally become empty.
EDIT: Even after fixing the empty-in-list problem, my query was greatly sped up by switching from the " IN (SELECT * temp_table) " version to a JOIN version, similar to what is suggested in comments by ypercube above. This modification eliminates the 'SUBQUERY' from the execution plan and the query becomes a 'SIMPLE' query:
'IN' version:
id select_type table type key key_len rows Extra
1 PRIMARY test_runs index PRIMARY 4 10 Using where
8 DEPENDENT SUBQUERY temp_table_B ALL NULL NULL 10 Using where
'JOIN' version:
id select_type table type key key_len ref rows Extra
1 SIMPLE temp_table_B ALL NULL NULL NULL 10 Using where
1 SIMPLE test_runs eq_ref PRIMARY 4 temp_table_B.test_id 1Code Snippets
DELETE FROM live_db.test_table WHERE test_id
IN (SELECT * FROM temp_table_B) ORDER BY test_id ASC LIMIT N;DELETE FROM live_db.test_table WHERE test_id
IN ( <empty> ) ORDER BY test_id ASC LIMIT N;'IN' version:
id select_type table type key key_len rows Extra
1 PRIMARY test_runs index PRIMARY 4 10 Using where
8 DEPENDENT SUBQUERY temp_table_B ALL NULL NULL 10 Using where
'JOIN' version:
id select_type table type key key_len ref rows Extra
1 SIMPLE temp_table_B ALL NULL NULL NULL 10 Using where
1 SIMPLE test_runs eq_ref PRIMARY 4 temp_table_B.test_id 1Context
StackExchange Database Administrators Q#197927, answer score: 3
Revisions (0)
No revisions yet.