patternMinor
What is the primary key on a temporal table?
Viewed 0 times
temporaltheprimarywhattablekey
Problem
If I use the following table design to capture history...:
... then what should the primary key be?
(The history mechanism is
I'm thinking
CREATE TABLE MyTable (
insertion_timestamp TIMESTAMP,
deleted_flag BOOLEAN,
natural_key INT,
attribute VARCHAR
);... then what should the primary key be?
(The history mechanism is
INSERT-only: updated rows are inserted with a different insertion_timestamp, and deleted rows are inserted with a different timestamp and the deleted_flag set to true.)I'm thinking
PRIMARY KEY (insertion_timestamp, deleted_flag, natural_key), but the only reason for including deleted_flag is to account for the possibility of a row being inserted and then (soft) deleted immediately, sooner than the next tick of the granularity of the TIMESTAMP data type. This feels overly paranoid...Solution
I'm thinking
I don't know if it's too paranoid, but it's reasonable to just stipulate that no two rows may share the same
So my recommendation would be
Also, you should seriously consider the following additions:
PRIMARY KEY (insertion_timestamp, deleted_flag, natural_key), but the only reason for including deleted_flag is to account for the possibility of a row being inserted and then (soft) deleted immediately, sooner than the next tick of the granularity of the TIMESTAMP data type. This feels overly paranoid...I don't know if it's too paranoid, but it's reasonable to just stipulate that no two rows may share the same
insertion_timestamp and natural_key, and make it the responsibility of the clients doing the inserts to handle the edge case. It simplifies using the table considerably if you can assume that those combinations are unique—plus in the tripartite key you suggested, it's tricky to order the rows in the triply-paranoid case where there are two immediate "soft" deletes in a row.So my recommendation would be
PRIMARY KEY (natural_key, insertion_timestamp) (in that order—but I'm guessing natural_key is going to be queried for more often, which could be wrong).Also, you should seriously consider the following additions:
- Add an
inactive_timestampcolumn to the tables. This allows you to query for rows that were active at a point in time.
- Think carefully about what the timestamps mean. Is the table supposed to represent the history of the real-world values ("Joe's phone number was 555-5555 until January 22, then it was 666-6666 from January 22"), or history of modifications to an "eternal" value? Your schema looks like it's going after the latter, but make sure you understand which is going to be needed.
Context
StackExchange Database Administrators Q#20426, answer score: 3
Revisions (0)
No revisions yet.