patternsqlMinor
Is there a performance penalty when there are two unique indexes on a table with all of the colums in one index are also in the other index?
Viewed 0 times
uniquethepenaltyallarewithcolumsotherindexesalso
Problem
In my database I have a table with two indexes. There is one unique index #1 on columns a,b which I need for data integrity reasons. I have another index #2 with columns c,a,b that is used for performance reasons. I noticed this index #2 is also unique.
The uniqueness of index #2 seems to me to be redundant since you can't have duplicate values in index #2 without also having duplicate values in index #1. I am tempted to change index #2 so it is no longer unique, because I imagine the database engine might perform a second check on c,a,b in index #2 to ensure uniqueness on these columns every time a row is inserted, causing a performance hit, even though there can never be duplicate values. Is this correct?
Is there any way to delete the index #1 on a,b and keep the index #2 on c,a,b but still force a unique constraint on just columns a,b without maintaining two separate indexes? That would allow me to only have one index with all three columns but still enforce my data integrity constraint on a,b. I don't need an index on a,b for performance because all my select queries include column c in the where clause. Would this be a use case for a unique constraint instead of an index? I thought that the database engine basically treats these two constructs the same (see this post: When should I use a unique constraint instead of a unique index?).
Keep in mind the indexes are not redundant, but the "uniqueness" of the indexes is redundant. Seems like it is a no-brainier to make index #2 non-unique. But does this result in any actual performance gain? Does the database check the uniqueness of both indexes even though the columns in index #1 are entirely included in index #2?
Some answers asked about example queries that are used to select data from this table. Here are the most common ones:
Th
The uniqueness of index #2 seems to me to be redundant since you can't have duplicate values in index #2 without also having duplicate values in index #1. I am tempted to change index #2 so it is no longer unique, because I imagine the database engine might perform a second check on c,a,b in index #2 to ensure uniqueness on these columns every time a row is inserted, causing a performance hit, even though there can never be duplicate values. Is this correct?
Is there any way to delete the index #1 on a,b and keep the index #2 on c,a,b but still force a unique constraint on just columns a,b without maintaining two separate indexes? That would allow me to only have one index with all three columns but still enforce my data integrity constraint on a,b. I don't need an index on a,b for performance because all my select queries include column c in the where clause. Would this be a use case for a unique constraint instead of an index? I thought that the database engine basically treats these two constructs the same (see this post: When should I use a unique constraint instead of a unique index?).
Keep in mind the indexes are not redundant, but the "uniqueness" of the indexes is redundant. Seems like it is a no-brainier to make index #2 non-unique. But does this result in any actual performance gain? Does the database check the uniqueness of both indexes even though the columns in index #1 are entirely included in index #2?
Some answers asked about example queries that are used to select data from this table. Here are the most common ones:
Select [some other columns] from table where c=1 and a=2
Select [some other columns] from table where c=1
Select [some other columns] from table where c=1 and a=2 and b=3Th
Solution
The uniqueness of index #2 seems to me to be redundant since you can't have duplicate values in index #2 without also having duplicate values in index #1.
Yes, declaring index #2 unique is redundant. Index #1 needs to exist to support the foreign key, so there's little chance of it being dropped.
I am tempted to change index #2 so it is no longer unique, because I imagine the database engine might perform a second check on c,a,b in index #2 to ensure uniqueness on these columns every time a row is inserted, causing a performance hit, even though there can never be duplicate values.
This is the wrong thing to focus on. SQL Server always needs to locate the insertion point. If it finds an existing row for the supplied key(s) and the index is marked unique, an error is raised. Otherwise, the new row is inserted at the correct point.
There is a much more important consequence of marking an index unique unnecessarily. Any update that changes one of the key columns requires special handling to avoid transient key violations during update processing. In SQL Server, this means extra Split, Sort, and Collapse operators and wide (per-index) maintenance for the affected index.
This is a lot more expensive that the issue you were concerned about.
Is there any way to delete the index #1 on a,b and keep the index #2 on c,a,b but still force a unique constraint on just columns a,b without maintaining two separate indexes?
No.
I don't need an index on a,b for performance because all my select queries include column c in the where clause.
The example queries suggest the equality predicate on column c alone is selective enough that the optimizer chooses a nonclustered index seek plus lookup plan. The optimizer is cautious about selecting this type of plan, which implies column c is very selective indeed (or estimates are poor).
Since you're retrieving columns outside the index anyway, you might well find a nonclustered non-unique index on column c alone is sufficient.
Adding a and/or b to the index key would be useful if enough queries benefit from the more precise index seek to justify the extra space and maintenance. Only you have enough information to make this judgement call.
Yes, declaring index #2 unique is redundant. Index #1 needs to exist to support the foreign key, so there's little chance of it being dropped.
I am tempted to change index #2 so it is no longer unique, because I imagine the database engine might perform a second check on c,a,b in index #2 to ensure uniqueness on these columns every time a row is inserted, causing a performance hit, even though there can never be duplicate values.
This is the wrong thing to focus on. SQL Server always needs to locate the insertion point. If it finds an existing row for the supplied key(s) and the index is marked unique, an error is raised. Otherwise, the new row is inserted at the correct point.
There is a much more important consequence of marking an index unique unnecessarily. Any update that changes one of the key columns requires special handling to avoid transient key violations during update processing. In SQL Server, this means extra Split, Sort, and Collapse operators and wide (per-index) maintenance for the affected index.
This is a lot more expensive that the issue you were concerned about.
Is there any way to delete the index #1 on a,b and keep the index #2 on c,a,b but still force a unique constraint on just columns a,b without maintaining two separate indexes?
No.
I don't need an index on a,b for performance because all my select queries include column c in the where clause.
The example queries suggest the equality predicate on column c alone is selective enough that the optimizer chooses a nonclustered index seek plus lookup plan. The optimizer is cautious about selecting this type of plan, which implies column c is very selective indeed (or estimates are poor).
Since you're retrieving columns outside the index anyway, you might well find a nonclustered non-unique index on column c alone is sufficient.
Adding a and/or b to the index key would be useful if enough queries benefit from the more precise index seek to justify the extra space and maintenance. Only you have enough information to make this judgement call.
Context
StackExchange Database Administrators Q#317762, answer score: 8
Revisions (0)
No revisions yet.