What are the performance considerations between using a broad PK vs a separate synthetic key and UQ?
Problem
I have several tables where records can be uniquely identified with several broad business fields. In the past, I've used these fields as a PK, with these benefits in mind:
- Simplicity; there are no extraneous fields and just one index
- Clustering allows for fast merge joins and range-based filters
However, I've heard a case made for creating a synthetic IDENTITY INT PK, and instead enforcing the business key with a separate UNIQUE constraint. The advantage is that the narrow PK makes for much smaller secondary indices.
If a table has no indices other than the PK, I don't see any reason to favor the second approach, though in a large table it's probably best to assume that indices may be necessary in the future, and therefore favor the narrow synthetic PK. Am I missing any considerations?
Incidentally, I'm not arguing against using synthetic keys in data warehouses; I'm just interested in when to use a single broad PK and when to use a narrow PK plus a broad UK.
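For concreteness, the two designs being compared might look roughly like this in T-SQL. This is only a sketch: the tables dbo.Shipment_A and dbo.Shipment_B, their columns, and the constraint names are hypothetical and do not come from the question.

```sql
-- Option A: the broad business key is the clustered primary key.
-- One index only; rows are stored in business-key order.
CREATE TABLE dbo.Shipment_A (
    WarehouseCode VARCHAR(10)  NOT NULL,
    CarrierCode   VARCHAR(10)  NOT NULL,
    ShipDate      DATE         NOT NULL,
    TrackingNo    VARCHAR(30)  NOT NULL,
    WeightKg      DECIMAL(9,2) NOT NULL,
    CONSTRAINT PK_Shipment_A
        PRIMARY KEY CLUSTERED (WarehouseCode, CarrierCode, ShipDate, TrackingNo)
);

-- Option B: a narrow synthetic key is the clustered primary key,
-- and the business key is still enforced by a separate UNIQUE constraint.
CREATE TABLE dbo.Shipment_B (
    ShipmentId    INT IDENTITY(1,1) NOT NULL,
    WarehouseCode VARCHAR(10)  NOT NULL,
    CarrierCode   VARCHAR(10)  NOT NULL,
    ShipDate      DATE         NOT NULL,
    TrackingNo    VARCHAR(30)  NOT NULL,
    WeightKg      DECIMAL(9,2) NOT NULL,
    CONSTRAINT PK_Shipment_B
        PRIMARY KEY CLUSTERED (ShipmentId),
    CONSTRAINT UQ_Shipment_B_BusinessKey
        UNIQUE NONCLUSTERED (WarehouseCode, CarrierCode, ShipDate, TrackingNo)
);
```

Option B carries one extra column and one extra index from the start, which is the cost being weighed against smaller secondary indices later.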
Solution
There is no significant disadvantage to using the natural key as the clustered index if:
- there are no non-clustered indexes
- no foreign keys referencing this table (it is a parent row)
The downside would be increased page splits, as inserts would be distributed throughout the table instead of landing at the end.
Where you do have FKs or NC indexes, using a narrow, numeric, increasing clustered index has advantages. You only repeat a few bytes of data per NC or FK entry, not the whole business/natural key.
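A small sketch of where those repeated bytes show up, assuming hypothetical tables dbo.Invoice and dbo.InvoiceLine (all names here are made up for illustration). In SQL Server, each non-clustered index row carries the clustering key as its row locator, and each child row stores whatever key the foreign key references.

```sql
-- Parent table clustered on a narrow, ever-increasing 4-byte INT.
CREATE TABLE dbo.Invoice (
    InvoiceId   INT IDENTITY(1,1) NOT NULL,
    CompanyCode VARCHAR(10) NOT NULL,
    FiscalYear  SMALLINT    NOT NULL,
    InvoiceNo   VARCHAR(20) NOT NULL,
    DueDate     DATE        NOT NULL,
    CONSTRAINT PK_Invoice PRIMARY KEY CLUSTERED (InvoiceId),
    CONSTRAINT UQ_Invoice_BusinessKey
        UNIQUE NONCLUSTERED (CompanyCode, FiscalYear, InvoiceNo)
);

-- Each row of this index holds DueDate plus the 4-byte InvoiceId locator.
-- Had the table been clustered on (CompanyCode, FiscalYear, InvoiceNo),
-- each index row would carry those three columns instead.
CREATE NONCLUSTERED INDEX IX_Invoice_DueDate
    ON dbo.Invoice (DueDate);

-- The child table repeats only InvoiceId per row, not the whole business key.
CREATE TABLE dbo.InvoiceLine (
    InvoiceLineId INT IDENTITY(1,1) NOT NULL,
    InvoiceId     INT NOT NULL,
    Amount        DECIMAL(12,2) NOT NULL,
    CONSTRAINT PK_InvoiceLine PRIMARY KEY CLUSTERED (InvoiceLineId),
    CONSTRAINT FK_InvoiceLine_Invoice
        FOREIGN KEY (InvoiceId) REFERENCES dbo.Invoice (InvoiceId)
);
```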
As to why, read the top 5 articles from Google.
Note I avoided the use of "primary key".
You can have the clustered index on the surrogate key but keep the PK on the business key, as a non-clustered index. Just make sure the clustered index is declared unique, because otherwise SQL Server will add a "uniquifier" to make it so.
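One way this arrangement might be written in T-SQL is sketched below. The table dbo.OrderLine, its columns, and the index names are hypothetical.

```sql
-- The primary key stays on the business key, but as a non-clustered index.
CREATE TABLE dbo.OrderLine (
    OrderLineId  INT IDENTITY(1,1) NOT NULL,
    CustomerCode VARCHAR(20) NOT NULL,
    OrderDate    DATE        NOT NULL,
    ProductCode  VARCHAR(20) NOT NULL,
    Quantity     INT         NOT NULL,
    CONSTRAINT PK_OrderLine
        PRIMARY KEY NONCLUSTERED (CustomerCode, OrderDate, ProductCode)
);

-- The clustered index goes on the narrow surrogate key.
-- Declaring it UNIQUE means SQL Server has no need to add a uniquifier.
CREATE UNIQUE CLUSTERED INDEX CIX_OrderLine_OrderLineId
    ON dbo.OrderLine (OrderLineId);
```

The design choice here is that "primary key" and "clustered index" are separate decisions: the first is a logical constraint, the second determines physical row order.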
Finally, it may make sense to have a surrogate key, but not blindly on every table: many-to-many tables do not need one, nor do tables where a compound key from the parent tables will suffice.
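As an illustration of the many-to-many case, here is a minimal sketch with hypothetical dbo.Student, dbo.Course, and dbo.Enrollment tables. The link table is keyed by the pair of parent keys and has no surrogate key of its own.

```sql
CREATE TABLE dbo.Student (
    StudentId INT IDENTITY(1,1) NOT NULL,
    FullName  VARCHAR(100) NOT NULL,
    CONSTRAINT PK_Student PRIMARY KEY CLUSTERED (StudentId)
);

CREATE TABLE dbo.Course (
    CourseId INT IDENTITY(1,1) NOT NULL,
    Title    VARCHAR(100) NOT NULL,
    CONSTRAINT PK_Course PRIMARY KEY CLUSTERED (CourseId)
);

-- Link table: the compound key from the two parents is already narrow
-- and unique, so an extra IDENTITY column would add nothing.
CREATE TABLE dbo.Enrollment (
    StudentId INT NOT NULL,
    CourseId  INT NOT NULL,
    CONSTRAINT PK_Enrollment PRIMARY KEY CLUSTERED (StudentId, CourseId),
    CONSTRAINT FK_Enrollment_Student
        FOREIGN KEY (StudentId) REFERENCES dbo.Student (StudentId),
    CONSTRAINT FK_Enrollment_Course
        FOREIGN KEY (CourseId) REFERENCES dbo.Course (CourseId)
);
```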
Context
StackExchange Database Administrators Q#6468, answer score: 11