patternsqlMajor
How does unique key help to improve SQL query performance?
Viewed 0 times
uniquesqlqueryhelpimproveperformancedoeshowkey
Problem
Suppose I have a table with an
I mean does a unique constraint help to improve query performance or does it just force the user to maintain the record uniquely, so that SQL will get higher selectivity on the unique constraint column?
I don't know much about unique constraint and unique index. The answer given on this question helps in understanding the difference between UniqueIndex and UniqueKey. I want to know more about unique key and how it can improve my query performance.
id column and and I am maintaining unique values manually while inserting and updating records instead of creating a unique key on that column (e.g. by using a date & time for the id column value, such as 201505061648). I am not using any index or key in my table. So, does it affect query performance when I execute a select query with a where clause even if the cardinality is 100% for the id column?select * from myTable where id=15I mean does a unique constraint help to improve query performance or does it just force the user to maintain the record uniquely, so that SQL will get higher selectivity on the unique constraint column?
I don't know much about unique constraint and unique index. The answer given on this question helps in understanding the difference between UniqueIndex and UniqueKey. I want to know more about unique key and how it can improve my query performance.
Solution
Since the Stack Overflow post you linked to explains how the unique key constraint is implemented using a unique index, so the two are the same for practical purposes, I'll skip that part and try to explain how a unique index improves query performance.
Let's stick to your example where table named
Now someone comes and issues a query like
The optimizer must decide how to find all the rows satisfying the condition id = 15. It may know, from the statistics, that values in the column are unique (num distinct = num rows in the table), however, that doesn't provide any information on where the row with id = 15 is physically located on the disk. So, the database has no choice but to scan the entire table to find the matching row. It can't even stop once it finds the first row with id = 15, since there's no guarantee that there aren't more such rows.
Now, we create a unique index on column
Things are radically different now. The optimizer now knows that there are only two possibilities: either there is exactly one row that satisfies the condition, or there are no such rows. So, a cheap index seek is all that is needed to find the required row if it exists; the same index seek will return no results if there are no such rows. So, the query will run fast.
Suppose there is also a (normal, non-unique) index on
The optimizer can now choose between using the index on id, index on anotherCol, and a table scan. It will choose the most selective method, because it requires the least work to get to the results - so it will use the unique index, again.
In general, whenever your query has where clause predicates that are joined with AND (a common case) and any of the columns in the where clause has a unique index on it, the optimizer will chose that index as the access method and the query will run fast.
The only case when the unique index will not be used is when it cannot be used, for example when a query like
is issued. Since the condition between the predicates is OR, the fact that we can find the row where id = 15 fast doesn't mean much since we still have to find other rows where val = 100. If val were not indexed, a table scan would be the only solution (rendering unique index seek useless, as the row where id = 15 would be picked up by the table scan anyway); since val is indexed, the optimizer might opt to use both indexes to find rows satisfying each condition individually and then concatenate the results.
Hopefully this makes things a bit clearer to you.
Let's stick to your example where table named
myTable has a column named id which happens to be filled with unique values, but no unique index on that column. Let's also suppose that the table has another column, anotherCol. Now someone comes and issues a query like
select * from myTable where id = 15The optimizer must decide how to find all the rows satisfying the condition id = 15. It may know, from the statistics, that values in the column are unique (num distinct = num rows in the table), however, that doesn't provide any information on where the row with id = 15 is physically located on the disk. So, the database has no choice but to scan the entire table to find the matching row. It can't even stop once it finds the first row with id = 15, since there's no guarantee that there aren't more such rows.
Now, we create a unique index on column
id, and repeat the same query. Things are radically different now. The optimizer now knows that there are only two possibilities: either there is exactly one row that satisfies the condition, or there are no such rows. So, a cheap index seek is all that is needed to find the required row if it exists; the same index seek will return no results if there are no such rows. So, the query will run fast.
Suppose there is also a (normal, non-unique) index on
anotherCol, and you issue a query likeselect * from myTable where id = 15 and anotherCol = 100The optimizer can now choose between using the index on id, index on anotherCol, and a table scan. It will choose the most selective method, because it requires the least work to get to the results - so it will use the unique index, again.
In general, whenever your query has where clause predicates that are joined with AND (a common case) and any of the columns in the where clause has a unique index on it, the optimizer will chose that index as the access method and the query will run fast.
The only case when the unique index will not be used is when it cannot be used, for example when a query like
select * from myTable where id = 15 or val = 100is issued. Since the condition between the predicates is OR, the fact that we can find the row where id = 15 fast doesn't mean much since we still have to find other rows where val = 100. If val were not indexed, a table scan would be the only solution (rendering unique index seek useless, as the row where id = 15 would be picked up by the table scan anyway); since val is indexed, the optimizer might opt to use both indexes to find rows satisfying each condition individually and then concatenate the results.
Hopefully this makes things a bit clearer to you.
Code Snippets
select * from myTable where id = 15select * from myTable where id = 15 and anotherCol = 100select * from myTable where id = 15 or val = 100Context
StackExchange Database Administrators Q#100749, answer score: 21
Revisions (0)
No revisions yet.