HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

SQL Server - creating a clustered index in a large heap - should I add a column?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
clusteredcolumnsqlcreatingheaplargeshouldserverindexadd

Problem

I'm going to add a clustered index to a fairly large table:

  • over 7 million records



  • 106 columns



  • no PK



  • no other indexes.



There is an Id column which could potentially be used for the index key, but it's nvarchar(18) and not unique (although there are no duplicate values there).

I'm however reluctant to use it because of its datatype (performance impact) and have been thinking of adding an int identity(1,1) column and use it as the clustered index key.

Which solution would you rather recommend?
TIA

Solution

In general, a surrogate ID (an incrementing number that doesn't have any other meaning) is the best choice unless you have a specific reason why it won't work.

One of the biggest problems with using a non-incrementing value for a primary key is that you'll have page splits on some inserts, which will cause them to be slower. How much slower depends on a lot of factors, so it's just something you'd have to measure on the system.

Google "benefits of surrogate primary key" and you'll find a ton of information. It's really a small book to cover the topic, so the full answer isn't really appropriate for this site. This question also has a few answers that provide good information on this topic.

Context

StackExchange Database Administrators Q#265893, answer score: 6

Revisions (0)

No revisions yet.