patternsqlMinor
Does this computed CHECKSUM() index design make sense?
Viewed 0 times
thiscomputeddesignmakechecksumsensedoesindex
Problem
I have come across what looks to me like a slightly odd pattern in a SQL Server 2005 database I'm taking care of, and was wondering whether it's just me, or whether it really is odd.
There are a number of tables with
Then, there are indexes which contain both of these fields, e.g.
This pattern also pops up with Guids that are not primary keys - e.g., an
Surely, the whole point of a checksum is that you create an index just on the checksum, so a
There are a number of tables with
uniqueidentifier primary keys, which also have a computed column which is the CHECKSUM of that key, e.g.[CustomerGuid] [uniqueidentifier] ROWGUIDCOL NOT NULL,
[CustomerHash] AS (CHECKSUM([CustomerGuid])) PERSISTED,Then, there are indexes which contain both of these fields, e.g.
CREATE NONCLUSTERED INDEX [IX_Customer_CustomerHashAndGuid] ON [dbo].[Customer]
(
[CustomerHash] ASC,
[CustomerGuid] ASC
)This pattern also pops up with Guids that are not primary keys - e.g., an
Order table with CustomerGuid and CustomerHash for each order, and an index on those two columns for looking up orders by customer.Surely, the whole point of a checksum is that you create an index just on the checksum, so a
SELECT will retrieve the records that match the checksum, and then compare the underlying value as a safety check? Doesn't putting the underlying value in the index waste a bunch of space for no real gain?Solution
You are right, this is pointless.
Two (of many) reasons that I see it's wrong
-
it isn't guaranteed unique (CHECKSUM gives int) whereas the GUID is (over the range of GUID). It's a small chance of duplicate but quite possible: like the "birthday problem" somewhat
-
it's still random order. The main reason IDENTITY is better then GUID for a clustered index is that IDENTITY is monotonically increasing.
I'd add a new IDENTITY column, and then start changing dependencies to use this only.
Two (of many) reasons that I see it's wrong
-
it isn't guaranteed unique (CHECKSUM gives int) whereas the GUID is (over the range of GUID). It's a small chance of duplicate but quite possible: like the "birthday problem" somewhat
-
it's still random order. The main reason IDENTITY is better then GUID for a clustered index is that IDENTITY is monotonically increasing.
CHECKSUM(someGUID) is random order tooI'd add a new IDENTITY column, and then start changing dependencies to use this only.
Context
StackExchange Database Administrators Q#3756, answer score: 7
Revisions (0)
No revisions yet.