patternsqlMinor
PostgreSQL strategies to deal with disk filling up
Viewed 0 times
postgresqldiskwithfillingstrategiesdeal
Problem
I'm using PostgreSQL (8.4) to store data produced by an application making frequent inserts (in the table structure described below).
The database keeps growing with time and, since the newer data is more relevant than the older data (in this particular application), deleting the older rows is a reasonable solution (either based on lower
To prevent issues related from this database (the only database running on this server) from affecting the rest of the system, I've put the PostgreSQL data directory on its own partition (ext3, on a Linux system). Nevertheless, when this partition becomes full, this causes a number of problems.
I'm thinking of deleting older data regularly (e.g.
Firstly, my understanding of
If so, it also appears that
In addition, when it's too late and the partition is filled at 100%, running this
PANIC: could not write to file "pg_xlog/xlogtemp.7810": No space left on device
The PostgreSQL daemon stopping is of course a major issue (and there is no other disk to move the cluster to on this machine).
Are there general s
The database keeps growing with time and, since the newer data is more relevant than the older data (in this particular application), deleting the older rows is a reasonable solution (either based on lower
id or older input_datetime, which is more or less the same).To prevent issues related from this database (the only database running on this server) from affecting the rest of the system, I've put the PostgreSQL data directory on its own partition (ext3, on a Linux system). Nevertheless, when this partition becomes full, this causes a number of problems.
I'm thinking of deleting older data regularly (e.g.
DELETE FROM data_group WHERE id <= ... via a cron job) to deal with this.Firstly, my understanding of
VACUUM (as performed by auto-vacuum, which is on) is that, while it doesn't necessarily give back the disk space to the OS (like VACUUM FULL would), it still allows some new data to be inserted within the disk space already used (that is, the DELETEs don't necessarily affect the file size, but they still free space in PostgreSQL's own data structures). Is this correct? (I've noticed VACUUM FULL caused a few problems with the application itself, probably because of the locks it uses.)If so, it also appears that
SELECT pg_database_size('my_database') reflects the size used on disk, which doesn't necessarily reflect what's available for further inserts. Is there another way to estimate how much space is available for new inserts?In addition, when it's too late and the partition is filled at 100%, running this
DELETE statement causes this error and crashes the PostgreSQL service:PANIC: could not write to file "pg_xlog/xlogtemp.7810": No space left on device
The PostgreSQL daemon stopping is of course a major issue (and there is no other disk to move the cluster to on this machine).
Are there general s
Solution
On one hand, you can have a look at one of my previous answers to see how you can keep a table size more or less steady. There you will find a solution with triggers - of course, this can be solved using a cron job as well. In the latter case I would first check if the row number exceeded a certain limit and the either delete the oldest rows or drop a partition.
On the other hand, as you already noticed, one has to take care of the disk space where
There will always be at least one WAL segment file, and will normally
not be more than
files. Each segment file is normally 16 MB (though this size can be
altered when building the server). You can use this to estimate space
requirements for WAL. Ordinarily, when old log segment files are no
longer needed, they are recycled (renamed to become the next segments
in the numbered sequence). If, due to a short-term peak of log output
rate, there are more than
the unneeded segment files will be deleted instead of recycled until
the system gets back under this limit.
If you don't have replication set up, the maximum is
On the other hand, as you already noticed, one has to take care of the disk space where
pg_xlog is. When it gets full, it is not that easy to recover... But checking your database settings you can have a fair estimation how much space you need:There will always be at least one WAL segment file, and will normally
not be more than
(2 + checkpoint_completion_target) *
checkpoint_segments + 1 or checkpoint_segments + wal_keep_segments + 1files. Each segment file is normally 16 MB (though this size can be
altered when building the server). You can use this to estimate space
requirements for WAL. Ordinarily, when old log segment files are no
longer needed, they are recycled (renamed to become the next segments
in the numbered sequence). If, due to a short-term peak of log output
rate, there are more than
3 * checkpoint_segments + 1 segment files,the unneeded segment files will be deleted instead of recycled until
the system gets back under this limit.
If you don't have replication set up, the maximum is
3 * checkpoint_segments + 1 (times 16 MB). A typical replicationless setup will need something under 10 GB for pg_xlog, I think.Context
StackExchange Database Administrators Q#46858, answer score: 4
Revisions (0)
No revisions yet.