HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Round-tripping column datatype causes size of table to grow

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
datatypecolumnsizetrippinggrowroundtablecauses

Problem

Table LogCount:

Column1 int       not null
Column2 int       not null
Column3 int       not null
[Date]  datetime  not null
[Count] float(53) not null


This table contains ~86 million rows and has the following indexes:

alter table LogCount add constraint PK_LogCount_Id primary key clustered 
( [Date], Column1, Column2, Column3 )
go
create nonclustered index IX_Column2_Date on LogCount ( Column2, [Date] )
include ( [Count] )
go


sp_spaceused gives the following:

name        rows        reserved    data        index_size  unused  
LogCount    85800181    8089216 KB  4226664 KB  3860968 KB  1584 KB


The Count column doesn't and never will store floating-point numbers, only integers, so I changed it to a smallint which (I expected) will save 6 bytes per row (float(53) = 8 bytes, smallint = 2 bytes):

drop index LogCount.IX_Column2_Date
go
alter table LogCount alter column [Count] smallint not null
go
create nonclustered index IX_Column2_Date on LogCount ( Column2, [Date] )
include ( [Count] )
go


Then I reran sp_spaceused:

name        rows        reserved    data        index_size  unused
LogCount    85800181    7670848 KB  5255528 KB  2414496 KB  824 KB


As expected, the index size has decreased drastically, but the data size has increased by a gigabyte!

I then reran the drop/alter/create statement above but using int (4 bytes) and got the following result:

name        rows        reserved    data        index_size  unused
LogCount    85800181    7848032 KB  5255528 KB  2591688 KB  816 KB


Then I tried float(1) (also 4 bytes):

name        rows        reserved    data        index_size  unused
LogCount    85800181    7848016 KB  5255528 KB  2591672 KB  816 KB


Finally I went back to the original float(53):

name        rows        reserved    data        index_size  unused
LogCount    85800181    10680584 KB 7726896 KB  2952464 KB  1224 KB


Compared to the original, the data size

Solution

When changing the data type of a column, SQL Server will choose to either:

  • Change the metadata only;



  • Change the metadata and read all the existing values to ensure they fit; or



  • Change every row physically



Even if every row must be changed, SQL Server still take steps (where possible) to prioritize speed over final size, on the basis that we want DDL changes to complete as quickly as possible. Optimizing the storage space can wait for a maintenance window.

Changing float to smallint can be accommodated within the existing space allocated for the row, but it does leave some unused space. As has been mentioned, this can be reclaimed by fully rebuilding the changed structure.

Context

StackExchange Database Administrators Q#132981, answer score: 3

Revisions (0)

No revisions yet.