patternsqlMinor
Recreating Clustered Index Efficiently
Viewed 0 times
recreatingclusteredefficientlyindex
Problem
In a test database, I am looking to:
My workflow is as above as well, with the addition of disabling all the non-clustered indexes prior to dropping the clustered.
Since dropping the clustered constraint index requires the table to save as a HEAP, the amount of time this process takes on our 45m row table is tremendous. The drop on the constraint has been going for 1:17:00 and seems to only be at about 31m (based on Logical Reads in Spotlight for the Session).
Is there a more efficient way to handle this workflow? Perhaps a way to drop the constraint index and rebuild as the new clustered index, rather than as a HEAP?
Thanks,
Wes
DDL Statements:
TABLE STRUCTURE
```
CREATE TABLE [dbo].hist NOT NULL,
[part] varchar NULL,
[date] [datetime] NULL,
[per_date] [datetime] NULL,
[type] varchar NULL,
[loc] varchar NULL,
[loc_begin] decimal NULL,
[begin_qoh] decimal NULL,
[qty_req] decimal NULL,
[qty_chg] decimal NULL,
[qty_short] decimal NULL,
[um] varchar NULL,
[last_date] [datetime] NULL,
[nbr] varchar NULL,
[so_job] varchar NULL,
[ship_type] varchar NULL,
[addr] varchar NULL,
[rmks] varchar NULL,
[xdr_acct] varchar NULL,
[xcr_acct] varchar NULL,
[mtl_std] decimal NULL,
[lbr_std] decimal NULL,
[bdn_std] decimal NULL,
[price] decimal NULL,
[trnbr] [int] NULL,
[gl_amt] decimal NULL,
[xdr_cc] varchar NULL,
[xcr_cc] varchar NULL,
[lot] varchar NULL,
[sub_std] decimal NULL,
[gl_date] [datetime] NUL
- Drop clustered index (they are Primary Key Clustered Constraints on a row that is super useless for us.)
- Create new Clustered index
- re-create Primary Key constraint as nonclustered index
- rebuild all other non-clustered indexes.
My workflow is as above as well, with the addition of disabling all the non-clustered indexes prior to dropping the clustered.
Since dropping the clustered constraint index requires the table to save as a HEAP, the amount of time this process takes on our 45m row table is tremendous. The drop on the constraint has been going for 1:17:00 and seems to only be at about 31m (based on Logical Reads in Spotlight for the Session).
Is there a more efficient way to handle this workflow? Perhaps a way to drop the constraint index and rebuild as the new clustered index, rather than as a HEAP?
Thanks,
Wes
DDL Statements:
TABLE STRUCTURE
```
CREATE TABLE [dbo].hist NOT NULL,
[part] varchar NULL,
[date] [datetime] NULL,
[per_date] [datetime] NULL,
[type] varchar NULL,
[loc] varchar NULL,
[loc_begin] decimal NULL,
[begin_qoh] decimal NULL,
[qty_req] decimal NULL,
[qty_chg] decimal NULL,
[qty_short] decimal NULL,
[um] varchar NULL,
[last_date] [datetime] NULL,
[nbr] varchar NULL,
[so_job] varchar NULL,
[ship_type] varchar NULL,
[addr] varchar NULL,
[rmks] varchar NULL,
[xdr_acct] varchar NULL,
[xcr_acct] varchar NULL,
[mtl_std] decimal NULL,
[lbr_std] decimal NULL,
[bdn_std] decimal NULL,
[price] decimal NULL,
[trnbr] [int] NULL,
[gl_amt] decimal NULL,
[xdr_cc] varchar NULL,
[xcr_cc] varchar NULL,
[lot] varchar NULL,
[sub_std] decimal NULL,
[gl_date] [datetime] NUL
Solution
Ideally you would do something like this:
That would skip the step of the table being converted to a heap. Unfortunately, step 1 doesn't appear to be possible in SQL Server.
When the primary key is deleted, the corresponding index is deleted.
In addition, BOL has this to say about changing a primary key with
If the index enforces a PRIMARY KEY or UNIQUE constraint and the index definition is not altered in any way, the index is dropped and re-created preserving the existing constraint. However, if the index definition is altered the statement fails. To change the definition of a PRIMARY KEY or UNIQUE constraint, drop the constraint and add a constraint with the new definition.
As far as I can tell, the best that you can do is to avoid the heap conversion by creating a copy of the table and moving all data there. Removing or adding a clustered index creates an internal copy of the data anyway so it's not like it'll require more space. Here are a few hints to speed that up:
As an aside, if you were curious to see step 2 in action (I was), here's some sample code which shows how the heap conversion step can be skipped:
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 51 ms.
SQL Server Execution Times:
CPU time = 2406 ms, elapsed time = 3484 ms.
SQL Server Execution Times:
CPU time = 2422 ms, elapsed time = 3411 ms.
- Drop the existing primary key constraint but keep the clustered index.
- Recreate the clustered index on the new columns with the
DROP_EXISTING = ONoption set.
- Create the primary key constraint on a new nonclustered index.
That would skip the step of the table being converted to a heap. Unfortunately, step 1 doesn't appear to be possible in SQL Server.
When the primary key is deleted, the corresponding index is deleted.
In addition, BOL has this to say about changing a primary key with
DROP_EXISTING = ON:If the index enforces a PRIMARY KEY or UNIQUE constraint and the index definition is not altered in any way, the index is dropped and re-created preserving the existing constraint. However, if the index definition is altered the statement fails. To change the definition of a PRIMARY KEY or UNIQUE constraint, drop the constraint and add a constraint with the new definition.
As far as I can tell, the best that you can do is to avoid the heap conversion by creating a copy of the table and moving all data there. Removing or adding a clustered index creates an internal copy of the data anyway so it's not like it'll require more space. Here are a few hints to speed that up:
- You probably shouldn't use
SELECT INTO. That will copy the data to a heap which is the step you're trying to avoid. However, both theSELECT INTOand the creation of the clustered index are eligible for parallelism in SQL Server 2014.
- Take advantage of minimal logging if your recovery model allows for it. Note that for
INSERT INTO... SELECTyou'll need aTABLOCKhint against the target table to get minimal logging.
- Create your nonclustered indexes after all data is loaded.
- When creating the nonclustered indexes use the
SORT_IN_TEMPDB = ONoption if tempdb is sized for it.
- Check for foreign keys on other tables. If you're able to disable those that might help speed up things.
As an aside, if you were curious to see step 2 in action (I was), here's some sample code which shows how the heap conversion step can be skipped:
DROP TABLE IF EXISTS dbo.X_NUMBERS_1000000;
CREATE TABLE dbo.X_NUMBERS_1000000 (ID INT NOT NULL, ID2 INT NOT NULL, FILLER VARCHAR(500));
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID);
INSERT INTO dbo.X_NUMBERS_1000000 WITH (TABLOCK)
SELECT TOP (1000000)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 500)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;
-- option 1
DROP INDEX X_NUMBERS_1000000.CI_X_NUMBERS_1000000;
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000_2_COL ON dbo.X_NUMBERS_1000000 (ID, ID2);SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 51 ms.
SQL Server Execution Times:
CPU time = 2406 ms, elapsed time = 3484 ms.
-- option 2 (after resetting the table)
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID, ID2)
WITH (DROP_EXISTING = ON);SQL Server Execution Times:
CPU time = 2422 ms, elapsed time = 3411 ms.
Code Snippets
DROP TABLE IF EXISTS dbo.X_NUMBERS_1000000;
CREATE TABLE dbo.X_NUMBERS_1000000 (ID INT NOT NULL, ID2 INT NOT NULL, FILLER VARCHAR(500));
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID);
INSERT INTO dbo.X_NUMBERS_1000000 WITH (TABLOCK)
SELECT TOP (1000000)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 500)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;
-- option 1
DROP INDEX X_NUMBERS_1000000.CI_X_NUMBERS_1000000;
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000_2_COL ON dbo.X_NUMBERS_1000000 (ID, ID2);-- option 2 (after resetting the table)
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID, ID2)
WITH (DROP_EXISTING = ON);Context
StackExchange Database Administrators Q#171961, answer score: 4
Revisions (0)
No revisions yet.