snippetMinor
Adding an identity field to be the new cluster of a large table - how do I get the "right" order?
Viewed 0 times
identityfieldtheneworderrightaddinglargegethow
Problem
I have a very large table (88m rows, nearly 300GB in size) that has a poorly planned clustered index. I want to add a
The problem is that I also want this new identity field to be ordered in a certain way on the current records (based on an existing date field that is not the current cluster). This is so when we purge the oldest records, those will all be contiguous on disk. (The date field isn't unique, though, so isn't a good candidate for the cluster by itself.)
How can I get the values for this new identity field ordered the way I want?
The first idea I had was to make a new table with the identify field, and just
If I don't have enough disk space to do that, is there any way to do that "in-place"? Perhaps by changing the clustered index to the date field (wait a long time), then add the identity (and wait a long time again)? Are the values of the identity guaranteed to be ordered by the current cluster?
BIGINT identity field, and make it the new clustered index (having the desirable attributes of unique, narrow, static, and ever-increasing). The problem is that I also want this new identity field to be ordered in a certain way on the current records (based on an existing date field that is not the current cluster). This is so when we purge the oldest records, those will all be contiguous on disk. (The date field isn't unique, though, so isn't a good candidate for the cluster by itself.)
How can I get the values for this new identity field ordered the way I want?
The first idea I had was to make a new table with the identify field, and just
INSERT INTO rows from the current table ORDER BY the date field? Would that work? If I don't have enough disk space to do that, is there any way to do that "in-place"? Perhaps by changing the clustered index to the date field (wait a long time), then add the identity (and wait a long time again)? Are the values of the identity guaranteed to be ordered by the current cluster?
Solution
I just performed the following experiment on a SQL Server 2012 box:
Result:
Next:
Results:
And, one more test run:
Results:
I suppose this isn't a guarantee that the clustered index dictates the order a new IDENTITY column gets filled in - but, I'd say it strongly indicates that it's very likely.
CREATE TABLE rdtest (label varchar(100), value int);
INSERT INTO rdtest VALUES
('one', 1)
,('two', 4)
,('three', 9)
,('four', 16)
,('five', 25)
;
ALTER TABLE rdtest ADD test_id int IDENTITY(1,1);
SELECT * FROM rdtest;Result:
label value test_id
-------- ----------- -----------
one 1 1
two 4 2
three 9 3
four 16 4
five 25 5Next:
ALTER TABLE rdtest DROP COLUMN test_id;
ALTER TABLE rdtest ALTER COLUMN value int NOT NULL;
GO
ALTER TABLE rdtest ADD CONSTRAINT PK_rdtest PRIMARY KEY ([value] DESC);
ALTER TABLE rdtest ADD test_id int IDENTITY(1,1);
SELECT * FROM rdtest;Results:
label value test_id
-------- ----------- -----------
five 25 1
four 16 2
three 9 3
two 4 4
one 1 5And, one more test run:
ALTER TABLE rdtest DROP CONSTRAINT PK_rdtest;
ALTER TABLE rdtest DROP COLUMN test_id;
ALTER TABLE rdtest ALTER COLUMN label varchar(100) NOT NULL;
GO
ALTER TABLE rdtest ADD CONSTRAINT PK_rdtest PRIMARY KEY ([label] ASC);
ALTER TABLE rdtest ADD test_id int IDENTITY(1,1);
SELECT * FROM rdtest;Results:
label value test_id
-------- ----------- -----------
five 25 1
four 16 2
one 1 3
three 9 4
two 4 5I suppose this isn't a guarantee that the clustered index dictates the order a new IDENTITY column gets filled in - but, I'd say it strongly indicates that it's very likely.
Code Snippets
CREATE TABLE rdtest (label varchar(100), value int);
INSERT INTO rdtest VALUES
('one', 1)
,('two', 4)
,('three', 9)
,('four', 16)
,('five', 25)
;
ALTER TABLE rdtest ADD test_id int IDENTITY(1,1);
SELECT * FROM rdtest;label value test_id
-------- ----------- -----------
one 1 1
two 4 2
three 9 3
four 16 4
five 25 5ALTER TABLE rdtest DROP COLUMN test_id;
ALTER TABLE rdtest ALTER COLUMN value int NOT NULL;
GO
ALTER TABLE rdtest ADD CONSTRAINT PK_rdtest PRIMARY KEY ([value] DESC);
ALTER TABLE rdtest ADD test_id int IDENTITY(1,1);
SELECT * FROM rdtest;label value test_id
-------- ----------- -----------
five 25 1
four 16 2
three 9 3
two 4 4
one 1 5ALTER TABLE rdtest DROP CONSTRAINT PK_rdtest;
ALTER TABLE rdtest DROP COLUMN test_id;
ALTER TABLE rdtest ALTER COLUMN label varchar(100) NOT NULL;
GO
ALTER TABLE rdtest ADD CONSTRAINT PK_rdtest PRIMARY KEY ([label] ASC);
ALTER TABLE rdtest ADD test_id int IDENTITY(1,1);
SELECT * FROM rdtest;Context
StackExchange Database Administrators Q#172297, answer score: 2
Revisions (0)
No revisions yet.