HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Index Uniqueness Overhead

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
overheadindexuniqueness

Problem

I've been having an ongoing debate with various developers in my office on the cost of an index, and whether or not uniqueness is beneficial or costly (probably both). The crux of the issue is our competing resources.

Background

I have previously read a discussion that stated a Unique index is no additional cost to maintain, since an Insert operation implicitly checks for where it fits into the B-tree, and, if a duplicate is found in a non-unique index, appends a uniquifier to the end of the key, but otherwise inserts directly. In this sequence of events, a Unique index has no additional cost.

My coworker combats this statement by saying that Unique is enforced as a second operation after the seek to the new position in the B-tree, and thus is more costly to maintain than a non-unique index.

At worst, I have seen tables with an identity column (inherently unique) that is the clustering key of the table, but explicitly stated as non-unique. On the other side of worst is my obsession with uniqueness, and all indexes are created as unique, and when not possible to define an explicitly unique relation to an index, I append the PK of the table to the end of the index to ensure the uniqueness is guaranteed.

I'm frequently involved in code reviews for the dev team, and I need to be able to give general guidelines for them to follow. Yes, every index should be evaluated, but when you have five servers with thousands of tables each and as many as twenty indexes on a table, you need to be able to apply some simple rules to ensure a certain level of quality.

Question

Does uniqueness have an additional cost on the back-end of an Insert compared to the cost of maintaining a non-unique index? Secondly, what is wrong with appending the Primary Key of a table to the end of an index to ensure uniqueness?

Example Table Definition

```
create table #test_index
(
id int not null identity(1, 1),
dt datetime not null default(current_timestamp),
val varch

Solution

I'm frequently involved in code reviews for the dev team, and I need to be able to give general guidelines for them to follow.

The environment I'm currently involved in has 250 servers with 2500 databases. I've worked on systems with 30,000 databases. Guidelines for indexing should revolve around the naming convention, etc, not be "rules" for what columns to include in an index - every individual index should be engineered to be the correct index for that specific business rule or code touching the table.


Does uniqueness have an additional cost on the back-end of an Insert compared to the cost of maintaining a non-unique index? Secondly, what is wrong with appending the Primary Key of a table to the end of an index to ensure uniqueness?

Adding the primary key column to the end of a non-unique index to make it unique looks to me to be an anti-pattern. If business rules dictate the data should be unique, then add a unique constraint to the column; which will automatically create a unique index. If you're indexing a column for performance, why would you add a column to the index?

Even if your supposition that enforcing uniqueness doesn't add any extra overhead is correct (which it isn't for certain cases), what are you solving by needlessly complicating the index?

In the specific instance of adding the primary key to the end of your index key so that you can make the index definition include the UNIQUE modifier, it actually makes zero difference to the physical index structure on disk. This is due to the nature of the structure of B-tree indexes keys, in that they always need to be unique.

As David Browne mentioned in a comment:



Since every nonclustered index is stored as unique index, there is no extra cost in inserting into a unique index. In fact the only extra cost would in failing to declare a candidate key as a unique index, which would cause the clustered index keys to be appended to the index keys.


Take the following minimally complete and verifiable example:

USE tempdb;

DROP TABLE IF EXISTS dbo.IndexTest;
CREATE TABLE dbo.IndexTest
(
    id int NOT NULL
        CONSTRAINT IndexTest_pk
        PRIMARY KEY
        CLUSTERED
        IDENTITY(1,1)
    , rowDate datetime NOT NULL
);


I'll add two indexes that are identical except for the addition of the primary key at the tail end of the second indexes key definition:

CREATE INDEX IndexTest_rowDate_ix01
ON dbo.IndexTest(rowDate);

CREATE UNIQUE INDEX IndexTest_rowDate_ix02
ON dbo.IndexTest(rowDate, id);


Next, we'll several rows to the table:

INSERT INTO dbo.IndexTest (rowDate)
VALUES (DATEADD(SECOND, 0, GETDATE()))
     , (DATEADD(SECOND, 0, GETDATE()))
     , (DATEADD(SECOND, 0, GETDATE()))
     , (DATEADD(SECOND, 1, GETDATE()))
     , (DATEADD(SECOND, 2, GETDATE()));


As you can see above, three rows contain the same value for the rowDate column, and two rows contain unique values.

Next, we'll look at the physical page structures for each index, using the undocumented DBCC PAGE command:

DECLARE @dbid int = DB_ID();
DECLARE @fileid int;
DECLARE @pageid int;
DECLARE @indexid int;

SELECT @fileid = ddpa.allocated_page_file_id
    , @pageid = ddpa.allocated_page_page_id
FROM sys.indexes i 
CROSS APPLY sys.dm_db_database_page_allocations(DB_ID(), i.object_id, i.index_id, NULL, 'LIMITED') ddpa
WHERE i.name = N'IndexTest_rowDate_ix01'
    AND ddpa.is_allocated = 1
    AND ddpa.is_iam_page = 0;

PRINT N'*************************************** IndexTest_rowDate_ix01 *****************************************';
DBCC TRACEON(3604);
DBCC PAGE (@dbid, @fileid, @pageid, 1);
DBCC TRACEON(3604);
PRINT N'*************************************** IndexTest_rowDate_ix01 *****************************************';

SELECT @fileid = ddpa.allocated_page_file_id
    , @pageid = ddpa.allocated_page_page_id
FROM sys.indexes i 
CROSS APPLY sys.dm_db_database_page_allocations(DB_ID(), i.object_id, i.index_id, NULL, 'LIMITED') ddpa
WHERE i.name = N'IndexTest_rowDate_ix02'
    AND ddpa.is_allocated = 1
    AND ddpa.is_iam_page = 0;

PRINT N'*************************************** IndexTest_rowDate_ix02 *****************************************';
DBCC TRACEON(3604);
DBCC PAGE (@dbid, @fileid, @pageid, 1);
DBCC TRACEON(3604);
PRINT N'*************************************** IndexTest_rowDate_ix02 *****************************************';


I've looked at the output using Beyond Compare, and except for obvious differences around the allocation page IDs, etc, the two index structures are identical.

You might take the above to mean that including the primary key in every index, and defining at as unique is A Good Thing™ since that's what happens under-the-covers anyway. I wouldn't make that assumption, and would suggest only defining an index as unique if in fact the natural data in the index is unique already.

There are several excellent resources in the Interwebz about this topic, including:

  • Where Clustered

Code Snippets

USE tempdb;

DROP TABLE IF EXISTS dbo.IndexTest;
CREATE TABLE dbo.IndexTest
(
    id int NOT NULL
        CONSTRAINT IndexTest_pk
        PRIMARY KEY
        CLUSTERED
        IDENTITY(1,1)
    , rowDate datetime NOT NULL
);
CREATE INDEX IndexTest_rowDate_ix01
ON dbo.IndexTest(rowDate);

CREATE UNIQUE INDEX IndexTest_rowDate_ix02
ON dbo.IndexTest(rowDate, id);
INSERT INTO dbo.IndexTest (rowDate)
VALUES (DATEADD(SECOND, 0, GETDATE()))
     , (DATEADD(SECOND, 0, GETDATE()))
     , (DATEADD(SECOND, 0, GETDATE()))
     , (DATEADD(SECOND, 1, GETDATE()))
     , (DATEADD(SECOND, 2, GETDATE()));
DECLARE @dbid int = DB_ID();
DECLARE @fileid int;
DECLARE @pageid int;
DECLARE @indexid int;

SELECT @fileid = ddpa.allocated_page_file_id
    , @pageid = ddpa.allocated_page_page_id
FROM sys.indexes i 
CROSS APPLY sys.dm_db_database_page_allocations(DB_ID(), i.object_id, i.index_id, NULL, 'LIMITED') ddpa
WHERE i.name = N'IndexTest_rowDate_ix01'
    AND ddpa.is_allocated = 1
    AND ddpa.is_iam_page = 0;

PRINT N'*************************************** IndexTest_rowDate_ix01 *****************************************';
DBCC TRACEON(3604);
DBCC PAGE (@dbid, @fileid, @pageid, 1);
DBCC TRACEON(3604);
PRINT N'*************************************** IndexTest_rowDate_ix01 *****************************************';

SELECT @fileid = ddpa.allocated_page_file_id
    , @pageid = ddpa.allocated_page_page_id
FROM sys.indexes i 
CROSS APPLY sys.dm_db_database_page_allocations(DB_ID(), i.object_id, i.index_id, NULL, 'LIMITED') ddpa
WHERE i.name = N'IndexTest_rowDate_ix02'
    AND ddpa.is_allocated = 1
    AND ddpa.is_iam_page = 0;

PRINT N'*************************************** IndexTest_rowDate_ix02 *****************************************';
DBCC TRACEON(3604);
DBCC PAGE (@dbid, @fileid, @pageid, 1);
DBCC TRACEON(3604);
PRINT N'*************************************** IndexTest_rowDate_ix02 *****************************************';

Context

StackExchange Database Administrators Q#242908, answer score: 16

Revisions (0)

No revisions yet.