HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

PostgreSQL unexplained table bloat

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlbloattableunexplained

Problem

I have a table in a Postgres 8.2.15 database. The table bloated to almost 25GB but after running vacuum full and cluster the table size was dramatically smaller, well under 1GB. A few weeks later and it's back up to 3.5GB and climbing.

This is not a table that has frequent deletes, so I'm at a loss as to what is causing the bloat.

This only happens to a single table. I have a separate, structurally identical database, serving the same software; the table in that database has not shown any bloat.

Any ideas?

Solution

There are a least three major reasons for why you should upgrade to a more recent version, preferably to the current version.

-
As mentioned by @a_horse_with_no_name the autovacuum mechanism has been improved in many places since version 8.2.

-
You don't see frequent DELETEs, so the table bloat most probably comes from UPDATEs as @Milen commented. A new feature has been introduced with version 8.3, that counters this problem at its root: Heap-Only Tuples, I quote the release notes:


Heap-Only Tuples (HOT) accelerate space reuse for most UPDATEs and
DELETEs (Pavan Deolasee, with ideas from many others)


UPDATEs and DELETEs leave dead tuples behind, as do failed INSERTs.
Previously only VACUUM could reclaim space taken by dead tuples. With
HOT dead tuple space can be automatically reclaimed at the time of
INSERT or UPDATE if no changes are made to indexed columns. This
allows for more consistent performance. Also, HOT avoids adding
duplicate index entries.

Emphasis mine.

-
PostgreSQL 8.2 has reached end of life in Dec. 2011. No more security updates. Upgrade to a more recent version as soon as possible.

To find out whether there are, in fact, a lot of UPDATEs launched, use the configuration parameter log_statement and check your database log. Maybe a misconfigured app or a trigger launches a lot more UPDATEs than you think?

If you SET the parameter in a session it only changes for this session:

SET log_statement = 'all';


So, rather change it in your postgresql.conf:

log_statement = all


and reload. For instance with pg_ctl reload. But don't forget to change it back (and reload) or your log files may grow huge. Then check the log files to see what's actually going on.

Code Snippets

SET log_statement = 'all';
log_statement = all

Context

StackExchange Database Administrators Q#12150, answer score: 9

Revisions (0)

No revisions yet.