patternsqlMinor
What are the most effective techniques to reduce SQL Server disk space use when updating all rows in a very large table?
Viewed 0 times
rowsupdatingwhenmostdiskalltheareveryserver
Problem
Given a SQL Server table with
and an update statement which
What techniques can be employed to reduce the peak disk space consuption (including data files, log file and tempdb - if applicable) required to do this update?
For purposes of this question, the following is allowed:
- a large number of rows
- no columns with large-value data types
- multiple indexes
- more allocated space than available for the largest possible transaction log size
- a single-column primary key with clustered index (optional consideration for this question)
- an average record size of 1k (optional consideration for this question)
and an update statement which
- needs to be run against every row
- sets a value on a non-indexed column (optional consideration for this question)
What techniques can be employed to reduce the peak disk space consuption (including data files, log file and tempdb - if applicable) required to do this update?
For purposes of this question, the following is allowed:
- applying changes in batches
- run in single-user mode
- change recovery model
Solution
I've just gone through a similar process just couple of weeks ago. After several tries, with couple of the bigger tables (one of them more than 100million rows, near 80Gb) I came up with these steps to speed up things and keep transaction log small:
-
ALTER DATABASE [my_db] SET RECOVERY SIMPLE WITH NO_WAIT (check here for more info)
-
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED (check here for more info)
-
drop nonclustered indexes: this step will avoid the hassle with the need of updating them on each operation (check Brent Ozar blog for more details)
-
use batches: as suggested by mrdenny, do whatever you need to do, update, delete, inserts, in smaller chunks.
Here is a sample of batch processing for an update, 1000 rows at a time:
-
restore nonclustered indexes
-
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
-
ALTER DATABASE [my_db] SET RECOVERY FULL;
-
ALTER DATABASE [my_db] SET RECOVERY SIMPLE WITH NO_WAIT (check here for more info)
-
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED (check here for more info)
-
drop nonclustered indexes: this step will avoid the hassle with the need of updating them on each operation (check Brent Ozar blog for more details)
-
use batches: as suggested by mrdenny, do whatever you need to do, update, delete, inserts, in smaller chunks.
Here is a sample of batch processing for an update, 1000 rows at a time:
UPDATE TOP(1000) your_table
SET col1 = new_value
WHERE
WHILE @@rowcount > 0
BEGIN
UPDATE TOP(1000) your_table
SET col1 = new_value
WHERE ;
END;
GO-
restore nonclustered indexes
-
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
-
ALTER DATABASE [my_db] SET RECOVERY FULL;
Code Snippets
UPDATE TOP(1000) your_table
SET col1 = new_value
WHERE <your_condition>
WHILE @@rowcount > 0
BEGIN
UPDATE TOP(1000) your_table
SET col1 = new_value
WHERE <your_condition>;
END;
GOContext
StackExchange Database Administrators Q#60789, answer score: 6
Revisions (0)
No revisions yet.