HiveBrain v1.2.0
Get Started
← Back to all entries
gotchasqlMinor

Difference between a unique clustered index and a clustered primary key?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
uniqueclusteredprimarydifferencebetweenandindexkey

Problem

I'd like to ask two supplementary/follow-on questions, further to this previous/existing question: Is there any tangible difference between a unique clustered index and a clustered primary key?

-
That question starts with, "I understand that there may be a difference in meaning or intent between the two". I'm a programmer not a DBA, and this fundamental might be unclear to me: what is the difference in meaning or intent between the two?

-
My summary of the accepted answer, i.e. its most important statement IMO, is that it says, "I don't think there's any difference". If that's so then why did Microsoft implement "clustered indexes"? Why not just say instead, "It's always clustered on the primary key, and you should define as the primary key whatever you want it to be clustered on"?

It seems to me that a primary key (already) is a unique clustered index.

Furthermore, here's a specific problem by way of example.

Let's say I have a table of Users (with a userId as its primary key), and a second table (e.g. Items) which defines items owned by each user. A user can own many items; each item is owned by one user, and has an itemId.

So the itemId could be the primary key of the Items table; and each row in the Items table (which has an itemId) also has a userId to identify its owner.

That's a good way to define a 1-N relationship, isn't it? Assume a foreign key contraint on userId, with Users being the parent table.

At run-time I usually want to retrieve all the items owned by a user, therefore the Items table should be clustered on its userId column.

[Users]
  userId
  + plus other user-specific fields

[Items]
  userId
  itemId
  + plus other item-specific fields


I think there are two ways to define this Items table:

  • itemId is primary non-clustered key, and (userId,itemId) is unique clustered index



... or:

  • (userId,itemId) is primary clustered key, and itemId is unique non-clustered index.



Which of the above

Solution

why did Microsoft implement "clustered indexes"? Why not just say instead, "It's always clustered on the primary key, and you should define as the primary key whatever you want it to be clustered on"?

To answer this question: The primary key on the table does not have to be the Clustered Index Key. A Clustered Index is the place where SQL Server stores the data of the entire table. The Index itself has what are called "key values." Key Values (1 to many) define how the index is ordered, along with statistics and other useful things.

In fact, often it's useful to have a Primary Key that is not the Clustered Index Key. For example, some applications create their own keys or GUIDs in order to relate different tables through foreign keys. Using a GUID as a Clustered Index Key is a bad idea though, since GUIDs don't always have an order, unless you create them sequentially. To solve this, you could create a Clustered Index on an identity column, and place the Primary Key on the GUID column.

Your question has many more questions. I think we need to break this down into many questions, since you're asking about a lot of index concepts.

Context

StackExchange Database Administrators Q#149471, answer score: 2

Revisions (0)

No revisions yet.