principlesqlMinor
What is the best approach to run a DELETE query on 92K of 4 million rows?
Viewed 0 times
million92ktherowswhatdeletequeryapproachrunbest
Problem
Options that have occurred to me:
-
One big delete statement. For example:
-
Same as above, but chunk in smaller pieces, 10k at a time?
-
Create a stored procedure so all I'm sending are the the two query values.
-
One big delete statement. For example:
delete from x where (y = 'x' and x = 'y') or (y = 'a' and x = 'b')-
Same as above, but chunk in smaller pieces, 10k at a time?
-
Create a stored procedure so all I'm sending are the the two query values.
Solution
92K out of 4M is still only 2.3% of your records. I'd suggest your first option, just throw what you want deleted at the database and let the database sort it out (if you give it to the database all at once, chances are it'll do it more efficiently than if you parcel it out). The only time I'd recommend chunking it is if your application is time-critical and you're worried about slowing it down (in which case the other solution is to do it off-hours).
If you were taking 20-30% or more of the records out of the table, I'd recommend dropping indexes, doing the delete, and recreating indexes, followed by an
That said, after you've done the delete, running
If you were taking 20-30% or more of the records out of the table, I'd recommend dropping indexes, doing the delete, and recreating indexes, followed by an
OPTIMIZE TABLE, but for 2% of the records in the table the index rebuild overhead isn't worth it (it's faster just to keep the indexes up-to-date while deleting the records).That said, after you've done the delete, running
OPTIMIZE TABLE might not be a bad idea - don't do that during peak hours though.Context
StackExchange Database Administrators Q#8514, answer score: 9
Revisions (0)
No revisions yet.