HiveBrain v1.2.0
Get Started
← Back to all entries
patternMajor

Should every table have a single-field surrogate/artificial primary key?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
fieldprimarytableartificialeverysinglesurrogateshouldhavekey

Problem

I understand one benefit of surrogate/artificial keys in general - they do not change and that can be very convenient. This is true whether they are single or multiple field - as long as they are 'artificial'.

However, it sometimes seems to be a matter of policy to have an auto-incrementing integer field as the primary key of each table. Is this always the best idea to have such a single-field key and why (or why not)?

To be clear, this question is not about artificial vs natural - but about whether all artificial keys should be single-field

Solution

I'm going to say no, not always, but most of the time yes..

These are some circumstances in which you don't need a surrogate or artificial key:

  • Pure intersection tables. If there is no risk of the


intersection being the target of a foreign key and if there is little
or no risk of the intersection attracting independent attributes
(i.e. something other than FK's to the two parent tables) then you
can get away with using the combination of FKs as the PK with fair
confidence.

  • Lookup tables with static business keys. If you have a lookup



table with a unique business key which is fixed externally to your

business and which has zero chance of ever changing for any

practical purpose, then using the business key directly can make

things simpler. An example might be a list of state or province

codes or a list of ANSI standard numbers, etc.

  • Tables containing data consolidated from multiple, independent


sources. If your system has many sources of data that must be
shoehorned together into a single table, say at head office, then
sometimes you need a compound key that includes the source system key
value and a code indicating what the source system was.

There are also some situations where the old-faithful monotonically increasing integer surrogate key is not ideal. You can have keys that are alphanumeric surrogates. These could include:

  • Situations where you need to merge data from multiple, independent


sources. To avoid key collisions you might use GUIDs instead of
IDENTITY keys.

  • Situations where you are forced to use non-numeric key


representations. Let's say you've got a license plate database.
Your key could be the alphanumeric value instead of a pure number.

  • Situations where some external requirement forces you to apply


compression to your key value. Instead of using 10 digits for an
int32 you can use six base 36 digits.

Why most of the time yes? The most fundamental answer to that question is that it is pure hell if you ever need to modify a primary key value on any table. Since almost anything a user can see or touch is conceivably subject to an update at some point, using a visible key value is inviting pure hell. Using a surrogate key will keep you from falling into this trap.

Having said that, remember that there is room for YAGNI in applying this concept. You don't need to go forcing code tables with IDENTITY keys into every nook and cranny of your schema, just in case someone decides that the symbol for male gender in your employee table needs to change from M to X or something silly.

Context

StackExchange Database Administrators Q#6108, answer score: 30

Revisions (0)

No revisions yet.