patternMinor

Why do temporal tables log the begin time of the transaction?

Submitted by: @import:stackexchange-dba·Mar 10, 2026·

Viewed 0 times

temporalwhythetableslogtimetransactionbegin

Problem

When updating a row in a temporal table the old values for the row are stored in the history table with the transaction begin time as the SysEndTime. The new values in the current table will have the transaction begin time as the SysStartTime.

SysStartTime and SysEndTime are datetime2 columns used by temporal tables to record when a row was the current version. Transaction begin time is the time the transaction containing the updates started.

BOL says:

The times recorded in the system datetime2 columns are based on the
begin time of the transaction itself. For example, all rows inserted
within a single transaction will have the same UTC time recorded in
the column corresponding to the start of the SYSTEM_TIME period.

Example: I start updating all the rows in my Orders table at 20160707 11:00:00 and the transaction takes 5 minutes to run. This creates a row in the history table for each row with SysEndTime as 20160707 11:00:00. All the rows in the current table will have a SysStartTime of 20160707 11:00:00.

If someone were to execute a query at 20160707 11:01:00 (while the update is running) they would see the old values (assuming default read committed isolation level).

But if someone was to then use the AS OF syntax to query the temporal table as it was at 20160707 11:01:00 they would see the new values because the their SysStartTime would be 20160707 11:00:00.

To me this means it doesn't show those rows as they were at that time. If it used the transaction end time the problem wouldn't exist.

Questions: Is this by design? Am I missing something?

The only reason I can think it's using the transaction begin time is that it is the only 'known' when the transaction starts. It doesn't know when the transaction will end when it starts and it would take time to apply the end time at the end which would invalidate the end time it was applying. Does this make sense?

This should allow you to recreate the issue.

Solution

I believe that this is indeed a design flaw, albeit one that is not specific to SQL Server 2016, as all other existing implementations of temporal tables (as far as I know) have the same flaw. The problems that can arise with temporal tables because of this are fairly severe; the scenario in your example is mild compared to what can go wrong in general:

Broken foreign key references: Suppose we have two temporal tables, with table A having a foreign key reference to table B. Now let's say we have two transactions, both running at a READ COMMITTED isolation level: transaction 1 begins before transaction 2, transaction 2 inserts a row into table B and commits, then transaction 1 inserts a row in table A with a reference to the newly added row of B. Since the addition of the new row to B was already committed, the foreign key constraint is satisfied and transaction 1 is able to commit successfully. However, if we were to view the database "AS OF" some time in between when transaction 1 began and when transaction 2 began, then we would see table A with a reference to a row of B that does not exist. So in this case, the temporal table provides an inconsistent view of the database. This of course was not the intent of the SQL:2011 standard, which states,

Historical system rows in a system-versioned table form immutable
snapshots of the past. Any constraints that were in effect when a
historical system row was created would have already been checked when
that row was a current system row, so there is never any need to
enforce constraints on historical system rows.

Non-unique primary keys: Let's say we have a table with a primary key and two transactions, both at a READ COMMITTED isolation level, in which the following happens: After transaction 1 begins but before it touches this table, transaction 2 deletes a certain row of the table and commits. Then, transaction 1 inserts a new row with the same primary key as the one that was deleted. This goes through fine, but when you look at the table AS OF a time in between when transaction 1 began and when transaction 2 began, we'll see two rows with the same primary key.

Errors on concurrent updates: Let's say we have a table and two transactions that both update the same row in it, again at a READ COMMITTED isolation level. Transaction 1 begins first, but transaction 2 is the first to update the row. Transaction 2 then commits, and transaction 1 then does a different update on the row and commits. This is all fine, except that if this is a temporal table, upon execution of the update in transaction 1 when the system goes to insert the required row into the history table the generated SysStartTime will be the start time of transaction 2, while the SysEndTime will be the start time of transaction 1, which is not a valid time interval since the SysEndTime would be before the SysStartTime. In this case SQL Server throws an error and rolls back the transaction (e.g., see this discussion). This is very unpleasant, since at the READ COMMITTED isolation level it would not be expected that concurrency issues would lead to outright failures, which means that applications are not necessarily going to be prepared to make retry attempts. In particular, this is contrary to a "guarantee" in Microsoft's documentation:

This behavior guarantees that your legacy applications will continue
to work when you enable system-versioning on tables that will benefit
from versioning. (link)

Other implementations of temporal tables have dealt with this scenario (two concurrent transactions updating the same row) by offering an option to automatically "adjust" the timestamps if they are invalid (see here and here). This is an ugly workaround, as it has the unfortunate consequence of breaking the atomicity of transactions, since other statements within the same transactions will not generally have their timestamps adjusted in the same way; i.e., with this workaround, if we view the database "AS OF" certain points in time then we may see partially-executed transactions.

Solution: You've already suggested the obvious solution, which is for the implementation to use the transaction end time (i.e. the commit time) instead of the start time. Yes it is true that when we're executing a statement in the middle of a transaction, it is impossible to know what the commit time will be (as it is in the future, or might not even exist if the transaction were to be rolled back). But this doesn't mean the solution is unimplementable; it just has to be done a different way. For example, when performing an UPDATE or DELETE statement, in creating the history row the system could just put in the current transaction ID instead of a start time, and then the ID can be converted to a timestamp later by the system after the transaction commits. There is no need to go into an infinite regression of then recording the time that the timestamp was filled in or anything like that.

In the context of th

Context

StackExchange Database Administrators Q#143241, answer score: 4

Revisions (0)

No revisions yet.