gotchasqlMajor
Why does ALTER COLUMN to NOT NULL cause massive log file growth?
Viewed 0 times
whyfilemassivecolumnlognullalterdoesgrowthcause
Problem
I have a table with 64m rows taking 4.3 GB on disk for its data.
Each row is about 30 bytes of integer columns, plus a variable
I added a a NULLABLE column with data-type
I then UPDATED this column for every row and made sure all new inserts place a value in this column.
Once there were no NULL entries I then ran this command to make my new field mandatory:
The result was a HUGE growth in the transaction log size - from 6GB to over 36GB until it ran out of space!
Does anyone have any idea what on earth SQL Server 2008 R2 is doing for this simple command to result in such huge growth?
Each row is about 30 bytes of integer columns, plus a variable
NVARCHAR(255) column for text.I added a a NULLABLE column with data-type
Datetimeoffset(0).I then UPDATED this column for every row and made sure all new inserts place a value in this column.
Once there were no NULL entries I then ran this command to make my new field mandatory:
ALTER TABLE tblCheckResult
ALTER COLUMN [dtoDateTime] [datetimeoffset](0) NOT NULLThe result was a HUGE growth in the transaction log size - from 6GB to over 36GB until it ran out of space!
Does anyone have any idea what on earth SQL Server 2008 R2 is doing for this simple command to result in such huge growth?
Solution
When you change a column to NOT NULL, SQL Server has to touch every single page, even if there are no NULL values. Depending on your fill factor this could actually lead to a lot of page splits. Every page that is touched, of course, has to be logged, and I suspect due to the splits that two changes may have to be logged for many pages. Since it's all done in a single pass, though, the log has to account for all of the changes so that, if you hit cancel, it knows exactly what to undo.
An example. Simple table:
Now, let's look at the page details. First we need to find out what page and DB_ID we're dealing with. In my case I created a database called
The output indicated that I was interested in page 159 (the only row in
Now, let's look some select page details as we step through the OP's scenario.
Now, I don't have all the answers to this, as I am not a deep internals guy. But it's clear that - while both the update operation and the addition of the NOT NULL constraint undeniably write to the page - the latter does so in an entirely different way. It seems to actually change the structure of the record, rather than just fiddle with bits, by swapping out the nullable column for a non-nullable column. Why it has to do that, I'm not quite sure - a good question for the storage engine team, I guess. I do believe that SQL Server 2012 handles some of these scenarios a lot better, FWIW - but I have yet to do any exhaustive testing.
An example. Simple table:
DROP TABLE dbo.floob;
GO
CREATE TABLE dbo.floob
(
id INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
bar INT NULL
);
INSERT dbo.floob(bar) SELECT NULL UNION ALL SELECT 4 UNION ALL SELECT NULL;
ALTER TABLE dbo.floob ADD CONSTRAINT df DEFAULT(0) FOR barNow, let's look at the page details. First we need to find out what page and DB_ID we're dealing with. In my case I created a database called
foo, and the DB_ID happened to be 5.DBCC TRACEON(3604, -1);
DBCC IND('foo', 'dbo.floob', 1);
SELECT DB_ID();The output indicated that I was interested in page 159 (the only row in
DBCC IND output with PageType = 1).Now, let's look some select page details as we step through the OP's scenario.
DBCC PAGE(5, 1, 159, 3);UPDATE dbo.floob SET bar = 0 WHERE bar IS NULL;
DBCC PAGE(5, 1, 159, 3);ALTER TABLE dbo.floob ALTER COLUMN bar INT NOT NULL;
DBCC PAGE(5, 1, 159, 3);Now, I don't have all the answers to this, as I am not a deep internals guy. But it's clear that - while both the update operation and the addition of the NOT NULL constraint undeniably write to the page - the latter does so in an entirely different way. It seems to actually change the structure of the record, rather than just fiddle with bits, by swapping out the nullable column for a non-nullable column. Why it has to do that, I'm not quite sure - a good question for the storage engine team, I guess. I do believe that SQL Server 2012 handles some of these scenarios a lot better, FWIW - but I have yet to do any exhaustive testing.
Code Snippets
DROP TABLE dbo.floob;
GO
CREATE TABLE dbo.floob
(
id INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
bar INT NULL
);
INSERT dbo.floob(bar) SELECT NULL UNION ALL SELECT 4 UNION ALL SELECT NULL;
ALTER TABLE dbo.floob ADD CONSTRAINT df DEFAULT(0) FOR barDBCC TRACEON(3604, -1);
DBCC IND('foo', 'dbo.floob', 1);
SELECT DB_ID();DBCC PAGE(5, 1, 159, 3);UPDATE dbo.floob SET bar = 0 WHERE bar IS NULL;
DBCC PAGE(5, 1, 159, 3);ALTER TABLE dbo.floob ALTER COLUMN bar INT NOT NULL;
DBCC PAGE(5, 1, 159, 3);Context
StackExchange Database Administrators Q#29522, answer score: 49
Revisions (0)
No revisions yet.