patternsqlModerate
Large delete query seems to have frozen
Viewed 0 times
deletequeryseemslargefrozenhave
Problem
We ran a delete query on a database with 1.8bn rows. This delete would delete 1.2bn rows.
In hindsight, we would have broken up this query into 100m at a time but we are in a position where it has been running for 24 hours and the log file is at 2Tb which appears to be the maximum size allowed for a log file.
The database is in SIMPLE recovery mode.
Is there any saving this query? Or do we need to just restart the SQL Server and see what happens? Will the database be unusable? Is there anything we can do to kill this off as cleanly as possible?
In hindsight, we would have broken up this query into 100m at a time but we are in a position where it has been running for 24 hours and the log file is at 2Tb which appears to be the maximum size allowed for a log file.
The database is in SIMPLE recovery mode.
Is there any saving this query? Or do we need to just restart the SQL Server and see what happens? Will the database be unusable? Is there anything we can do to kill this off as cleanly as possible?
Solution
First of all, check the SQL errorlog to see if it actually hit a max size for the log. If it did, then the query has no hope of completing, it is probably already in a rollback state.
Even if it is, I always prefer to kill the spid manually (use
Since this is a delete, and not an update or insert, you may be very lucky and find that it rolls back immediately. If not, it may take as long (or longer) to roll back as it did to get to this point.
To see the rollback status, use
Unfortunately, I have found this frequently does not show anything useful, just a "0% complete". In that case, you'll have to use
Regarding the reboot, this is a grave risk. If the spid is actively rolling back (CPU and IO are changing), then restarting SQL will only take the database offline entirely until the rollback is completely finished (hours and hours). But, if the CPU and IO are not moving, then it may in fact clear it right away. Either way, it is a risk.
One final option, if things are especially dire: If you have a backup from just before the delete started (and there haven't been other updates to the db), then the fastest way to recover may be to simply drop the DB, restart SQL, and restore from backup.
If you can't drop the DB (or if you already restarted the instance and the sql errorlog is predicting a 24-hour recovery time), then shut down SQL services, delete the MDF and LDF files from disk, start up SQL, drop the (ghost) database, and restore from backup.
Obviously you'd only attempt that if this were a back-end processing database that users didn't interact with.
Even if it is, I always prefer to kill the spid manually (use
sp_who2 or sp_WhoIsActive to find the spid, then do a kill 59 or whatever). You also can't check the rollback status unless you do an explicit KILL, see this related thread.Since this is a delete, and not an update or insert, you may be very lucky and find that it rolls back immediately. If not, it may take as long (or longer) to roll back as it did to get to this point.
To see the rollback status, use
kill 59 with statusonlyUnfortunately, I have found this frequently does not show anything useful, just a "0% complete". In that case, you'll have to use
sp_who2 and watch the IO and CPU to see if it is still doing something.Regarding the reboot, this is a grave risk. If the spid is actively rolling back (CPU and IO are changing), then restarting SQL will only take the database offline entirely until the rollback is completely finished (hours and hours). But, if the CPU and IO are not moving, then it may in fact clear it right away. Either way, it is a risk.
One final option, if things are especially dire: If you have a backup from just before the delete started (and there haven't been other updates to the db), then the fastest way to recover may be to simply drop the DB, restart SQL, and restore from backup.
If you can't drop the DB (or if you already restarted the instance and the sql errorlog is predicting a 24-hour recovery time), then shut down SQL services, delete the MDF and LDF files from disk, start up SQL, drop the (ghost) database, and restore from backup.
Obviously you'd only attempt that if this were a back-end processing database that users didn't interact with.
Code Snippets
kill 59 with statusonlyContext
StackExchange Database Administrators Q#166561, answer score: 14
Revisions (0)
No revisions yet.