HiveBrain v1.2.0
Get Started
← Back to all entries
patternModerate

What are the best practices regarding lookup tables in relational databases?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
tablesthedatabaseswhatareregardingrelationalpracticeslookupbest

Problem

Lookup tables (or code tables, as some people call them) are usually a collection of the possible values that can be given for a certain column.

For example, suppose we have a lookup table called party (meant to store information about political parties) that has two columns:

  • party_code_idn, which holds system-generated numeric values, and (lacking business domain meaning) works as a surrogate for the real key.



  • party_code, is the real or “natural” key of the table because it maintains values that have business domain connotations.



And let us say that such table retains the data that follows:

+----------------+------------+
 | party_code_idn | party_code |
 +----------------+------------+
 |              1 | Republican |
 |              2 | Democratic |
 +----------------+------------+


The party_code column, which keeps the values 'Republican' and 'Democratic', being the real key of the table, is set up with a UNIQUE constraint, but I optionally added the party_code_idn and defined it as the PK of the table (though, logically speaking, party_code may work as the PRIMARY KEY [PK]).
Question

What are the best practices for pointing to lookup values from transaction tables? Should I establish FOREIGN KEY (FK) references either (a) directly to the natural and meaningful value or (b) to surrogate value?

Option (a), for example,

+---------------+------------+---------+
 | candidate_idn | party_code |  city   |
 +---------------+------------+---------+
 |             1 | Democratic | Alaska  |
 |             2 | Republican | Memphis |
 +---------------+------------+---------+


has the following properties1:

  • Readable for the end user (+)



  • Easy to import-export across systems (+)



  • Difficult to change the value as it needs modification in all referring tables (-)



  • Adding new value is not costly (=)



I think it is almost like “pass by value”, to draw an analogy from function call in application programming jargon.

Option (b), f

Solution

By IDN, I take it you mean an IDENTITY, SEQUENCE or AUTO_INCREMENT field? You should take a look here and here.

Note, section 5 (Misusing Data values as Data Elements) of the first reference, underneath figure 10

Of course you can have a separate table for the sales persons and then
reference it using a foreign key, preferably with a simple surrogate
key such as sales_person_id , shown above.

So, this expert thinks that you should "deference" surrogate keys. It is really quite a basic SQL technique and shouldn't cause problems in your day-to-day SQL. It appears that there is an error in figure 10 - the sales_person in SalesData should be a surrogate key (i.e. a number), not text. I'm inferring this from the quote above.

What you should avoid at all costs is the temptation (very common for novice database programmers) to commit the error outlined in section (1) Common Lookup Tables. This is commonly called the MUCK (Massively Unified Code Key) approach (not by accident :-) notably by Joe Celko, also sarcasticlly known as the OTLT - One True Lookup Table) and leads to all sorts of difficulties. Novice programmers appear to feel that a single code/lookup/whatever table is "cleaner" and will be more efficient when nothing could be further from the truth.

From the second reference above:

Normalization eliminates redundant data, thus making the task of
enforcing data integrity vastly simpler, but the process of creating a
MUCK is something else entirely.MUCK's do not eliminate redundant
data, rather they are an elimination of what are PERCEIVED to be
redundant tables, but as I will demonstrate, fewer tables does not
equal simplicity.

You might also want to take a look at the related EAV (Entity Attribute Value) paradigm which I deal with here.

Context

StackExchange Database Administrators Q#142825, answer score: 13

Revisions (0)

No revisions yet.