HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Recreating Clustered Index Efficiently

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
recreatingclusteredefficientlyindex

Problem

In a test database, I am looking to:

  • Drop clustered index (they are Primary Key Clustered Constraints on a row that is super useless for us.)



  • Create new Clustered index



  • re-create Primary Key constraint as nonclustered index



  • rebuild all other non-clustered indexes.



My workflow is as above as well, with the addition of disabling all the non-clustered indexes prior to dropping the clustered.

Since dropping the clustered constraint index requires the table to save as a HEAP, the amount of time this process takes on our 45m row table is tremendous. The drop on the constraint has been going for 1:17:00 and seems to only be at about 31m (based on Logical Reads in Spotlight for the Session).

Is there a more efficient way to handle this workflow? Perhaps a way to drop the constraint index and rebuild as the new clustered index, rather than as a HEAP?

Thanks,
Wes

DDL Statements:

TABLE STRUCTURE

```
CREATE TABLE [dbo].hist NOT NULL,
[part] varchar NULL,
[date] [datetime] NULL,
[per_date] [datetime] NULL,
[type] varchar NULL,
[loc] varchar NULL,
[loc_begin] decimal NULL,
[begin_qoh] decimal NULL,
[qty_req] decimal NULL,
[qty_chg] decimal NULL,
[qty_short] decimal NULL,
[um] varchar NULL,
[last_date] [datetime] NULL,
[nbr] varchar NULL,
[so_job] varchar NULL,
[ship_type] varchar NULL,
[addr] varchar NULL,
[rmks] varchar NULL,
[xdr_acct] varchar NULL,
[xcr_acct] varchar NULL,
[mtl_std] decimal NULL,
[lbr_std] decimal NULL,
[bdn_std] decimal NULL,
[price] decimal NULL,
[trnbr] [int] NULL,
[gl_amt] decimal NULL,
[xdr_cc] varchar NULL,
[xcr_cc] varchar NULL,
[lot] varchar NULL,
[sub_std] decimal NULL,
[gl_date] [datetime] NUL

Solution

Ideally you would do something like this:

  • Drop the existing primary key constraint but keep the clustered index.



  • Recreate the clustered index on the new columns with the DROP_EXISTING = ON option set.



  • Create the primary key constraint on a new nonclustered index.



That would skip the step of the table being converted to a heap. Unfortunately, step 1 doesn't appear to be possible in SQL Server.


When the primary key is deleted, the corresponding index is deleted.

In addition, BOL has this to say about changing a primary key with DROP_EXISTING = ON:


If the index enforces a PRIMARY KEY or UNIQUE constraint and the index definition is not altered in any way, the index is dropped and re-created preserving the existing constraint. However, if the index definition is altered the statement fails. To change the definition of a PRIMARY KEY or UNIQUE constraint, drop the constraint and add a constraint with the new definition.

As far as I can tell, the best that you can do is to avoid the heap conversion by creating a copy of the table and moving all data there. Removing or adding a clustered index creates an internal copy of the data anyway so it's not like it'll require more space. Here are a few hints to speed that up:

  • You probably shouldn't use SELECT INTO. That will copy the data to a heap which is the step you're trying to avoid. However, both the SELECT INTO and the creation of the clustered index are eligible for parallelism in SQL Server 2014.



  • Take advantage of minimal logging if your recovery model allows for it. Note that for INSERT INTO... SELECT you'll need a TABLOCK hint against the target table to get minimal logging.



  • Create your nonclustered indexes after all data is loaded.



  • When creating the nonclustered indexes use the SORT_IN_TEMPDB = ON option if tempdb is sized for it.



  • Check for foreign keys on other tables. If you're able to disable those that might help speed up things.



As an aside, if you were curious to see step 2 in action (I was), here's some sample code which shows how the heap conversion step can be skipped:

DROP TABLE IF EXISTS dbo.X_NUMBERS_1000000;
CREATE TABLE dbo.X_NUMBERS_1000000 (ID INT NOT NULL, ID2 INT NOT NULL, FILLER VARCHAR(500));

CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID);

INSERT INTO dbo.X_NUMBERS_1000000 WITH (TABLOCK)
SELECT TOP (1000000) 
  ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 500)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

-- option 1
DROP INDEX X_NUMBERS_1000000.CI_X_NUMBERS_1000000;
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000_2_COL ON dbo.X_NUMBERS_1000000 (ID, ID2);



SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 51 ms.


SQL Server Execution Times:
CPU time = 2406 ms, elapsed time = 3484 ms.

-- option 2 (after resetting the table)
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID, ID2) 
WITH (DROP_EXISTING = ON);



SQL Server Execution Times:
CPU time = 2422 ms, elapsed time = 3411 ms.

Code Snippets

DROP TABLE IF EXISTS dbo.X_NUMBERS_1000000;
CREATE TABLE dbo.X_NUMBERS_1000000 (ID INT NOT NULL, ID2 INT NOT NULL, FILLER VARCHAR(500));

CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID);

INSERT INTO dbo.X_NUMBERS_1000000 WITH (TABLOCK)
SELECT TOP (1000000) 
  ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 500)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;


-- option 1
DROP INDEX X_NUMBERS_1000000.CI_X_NUMBERS_1000000;
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000_2_COL ON dbo.X_NUMBERS_1000000 (ID, ID2);
-- option 2 (after resetting the table)
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID, ID2) 
WITH (DROP_EXISTING = ON);

Context

StackExchange Database Administrators Q#171961, answer score: 4

Revisions (0)

No revisions yet.