patternMajor
Should every table have a single-field surrogate/artificial primary key?
Viewed 0 times
fieldprimarytableartificialeverysinglesurrogateshouldhavekey
Problem
I understand one benefit of surrogate/artificial keys in general - they do not change and that can be very convenient. This is true whether they are single or multiple field - as long as they are 'artificial'.
However, it sometimes seems to be a matter of policy to have an auto-incrementing integer field as the primary key of each table. Is this always the best idea to have such a single-field key and why (or why not)?
To be clear, this question is not about artificial vs natural - but about whether all artificial keys should be single-field
However, it sometimes seems to be a matter of policy to have an auto-incrementing integer field as the primary key of each table. Is this always the best idea to have such a single-field key and why (or why not)?
To be clear, this question is not about artificial vs natural - but about whether all artificial keys should be single-field
Solution
I'm going to say no, not always, but most of the time yes..
These are some circumstances in which you don't need a surrogate or artificial key:
intersection being the target of a foreign key and if there is little
or no risk of the intersection attracting independent attributes
(i.e. something other than FK's to the two parent tables) then you
can get away with using the combination of FKs as the PK with fair
confidence.
table with a unique business key which is fixed externally to your
business and which has zero chance of ever changing for any
practical purpose, then using the business key directly can make
things simpler. An example might be a list of state or province
codes or a list of ANSI standard numbers, etc.
sources. If your system has many sources of data that must be
shoehorned together into a single table, say at head office, then
sometimes you need a compound key that includes the source system key
value and a code indicating what the source system was.
There are also some situations where the old-faithful monotonically increasing integer surrogate key is not ideal. You can have keys that are alphanumeric surrogates. These could include:
sources. To avoid key collisions you might use GUIDs instead of
IDENTITY keys.
representations. Let's say you've got a license plate database.
Your key could be the alphanumeric value instead of a pure number.
compression to your key value. Instead of using 10 digits for an
int32 you can use six base 36 digits.
Why most of the time yes? The most fundamental answer to that question is that it is pure hell if you ever need to modify a primary key value on any table. Since almost anything a user can see or touch is conceivably subject to an update at some point, using a visible key value is inviting pure hell. Using a surrogate key will keep you from falling into this trap.
Having said that, remember that there is room for YAGNI in applying this concept. You don't need to go forcing code tables with IDENTITY keys into every nook and cranny of your schema, just in case someone decides that the symbol for male gender in your employee table needs to change from M to X or something silly.
These are some circumstances in which you don't need a surrogate or artificial key:
- Pure intersection tables. If there is no risk of the
intersection being the target of a foreign key and if there is little
or no risk of the intersection attracting independent attributes
(i.e. something other than FK's to the two parent tables) then you
can get away with using the combination of FKs as the PK with fair
confidence.
- Lookup tables with static business keys. If you have a lookup
table with a unique business key which is fixed externally to your
business and which has zero chance of ever changing for any
practical purpose, then using the business key directly can make
things simpler. An example might be a list of state or province
codes or a list of ANSI standard numbers, etc.
- Tables containing data consolidated from multiple, independent
sources. If your system has many sources of data that must be
shoehorned together into a single table, say at head office, then
sometimes you need a compound key that includes the source system key
value and a code indicating what the source system was.
There are also some situations where the old-faithful monotonically increasing integer surrogate key is not ideal. You can have keys that are alphanumeric surrogates. These could include:
- Situations where you need to merge data from multiple, independent
sources. To avoid key collisions you might use GUIDs instead of
IDENTITY keys.
- Situations where you are forced to use non-numeric key
representations. Let's say you've got a license plate database.
Your key could be the alphanumeric value instead of a pure number.
- Situations where some external requirement forces you to apply
compression to your key value. Instead of using 10 digits for an
int32 you can use six base 36 digits.
Why most of the time yes? The most fundamental answer to that question is that it is pure hell if you ever need to modify a primary key value on any table. Since almost anything a user can see or touch is conceivably subject to an update at some point, using a visible key value is inviting pure hell. Using a surrogate key will keep you from falling into this trap.
Having said that, remember that there is room for YAGNI in applying this concept. You don't need to go forcing code tables with IDENTITY keys into every nook and cranny of your schema, just in case someone decides that the symbol for male gender in your employee table needs to change from M to X or something silly.
Context
StackExchange Database Administrators Q#6108, answer score: 30
Revisions (0)
No revisions yet.