Is harmfulness of idle in transaction connections a myth?

Submitted by: @import:stackexchange-dba

Problem

Some sources on the internet insist that idle in transaction connections may prevent VACUUM from cleaning up dead tuples. Below are some examples:

User Guide for Aurora:

A transaction in the idle in transaction state can hold locks that block other queries. It can also prevent VACUUM (including autovacuum) from cleaning up dead rows, leading to index or table bloat or transaction ID wraparound.

Cybertec blog

A long transaction is actually not a problem – the problem starts if a long transaction and many small changes have to exist. Remember: The long transaction can cause VACUUM to not clean out your dead rows.

There are plenty more of them, but from my perspective this sounds absolutely ridiculous: in most cases the transaction isolation level is read committed, which in turn means there is no need to keep dead tuples for such transactions. Moreover, I have found alternative opinions on the topic:

It is not really long-lived transactions, but long lived snapshots. Certainly a long running select or insert statement will do that. For isolation levels higher than read-committed, the whole transaction will retain the snapshot until it is down, so if some opens a repeatable read transaction and then goes on vacation without committing it, that would be a problem. Hung-up prepared transactions will as well (if you don't know what a prepared transaction is, then you probably aren't using them).

or Pavel Luzanov's comment under the Cybertec blog post:

I believe that example of a long transaction is true only for Repeatable Read (or Serializable) isolation level. But by default BEGIN used Read Commited. So, after SELECT in the first session finished, VACUUM will remove dead rows in a table after subsequent UPDATE, DELETE commands in the session 2.

which is actually confirmed by @Bill Karwin in his answer (thanks!)

The question is: are there "valid" "non-fictional" scenarios when idle in transaction connections should be considered harmful? (I'm

Solution

It is true that a transaction by itself does not block the progress of VACUUM. A transaction only blocks VACUUM if one of these two conditions is satisfied:

- The transaction has a transaction ID assigned (that is, it has modified something in that database).

- The transaction holds a snapshot of the database. A snapshot is a data structure that determines which other transactions are visible to a certain transaction. Snapshots are held open

  - as long as an SQL statement is running (so a long-running query can block VACUUM progress)

  - while there is a cursor open

  - on the REPEATABLE READ or SERIALIZABLE isolation level, for the whole duration of the transaction
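The conditions above can be observed with a few psql sessions. This is only a minimal sketch: the table demo(id, val) is hypothetical and is assumed to contain a row with id = 1, and each block is meant to be run in its own connection and then left open without COMMIT.

```sql
-- Session A: READ COMMITTED (the default), read-only so far.
BEGIN;
SELECT count(*) FROM demo;   -- hypothetical table
-- Now "idle in transaction". The statement has finished, so its snapshot
-- is released, and no transaction ID was assigned.

-- Session B: same isolation level, but the transaction has written data.
BEGIN;
UPDATE demo SET val = val WHERE id = 1;   -- assumes a row with id = 1 exists
-- Now "idle in transaction" with a transaction ID assigned.

-- Session C: REPEATABLE READ keeps its snapshot until the transaction ends.
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM demo;
-- Now "idle in transaction" while still holding a snapshot.

-- Session D: an open cursor keeps its snapshot even under READ COMMITTED.
BEGIN;
DECLARE c CURSOR FOR SELECT * FROM demo;
FETCH 1 FROM c;

-- Observer session: compare the idle sessions.
SELECT pid, state, backend_xmin, backend_xid
FROM pg_stat_activity
WHERE state = 'idle in transaction';
```

Under these assumptions, session A should show NULL in both backend_xmin and backend_xid, so it should not hold back VACUUM. Session B should show a backend_xid, and sessions C and D a backend_xmin.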

You can use the query from my article on the topic to see if there is a transaction that blocks VACUUM's progress:

SELECT pid, datname, usename, state, backend_xmin, backend_xid
FROM pg_stat_activity
      /* holds a snapshot */
WHERE backend_xmin IS NOT NULL
      /* has a transaction ID */
   OR backend_xid IS NOT NULL
ORDER BY greatest(age(backend_xmin), age(backend_xid)) DESC;
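The text quoted in the question also mentions hung-up prepared transactions. These are not tied to a connection, so they would not appear in pg_stat_activity. A minimal sketch for listing them, assuming prepared transactions are in use at all (max_prepared_transactions greater than zero):

```sql
-- Prepared transactions survive the session that created them;
-- the oldest ones are the most likely to hold back VACUUM.
SELECT gid, prepared, owner, database, age(transaction) AS xid_age
FROM pg_prepared_xacts
ORDER BY age(transaction) DESC;
```

An empty result means no prepared transaction is currently outstanding.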

Context

StackExchange Database Administrators Q#332402, answer score: 5
