patternsqlModerate
What fillfactor for caching table?
Viewed 0 times
whatcachingfillfactorfortable
Problem
I have heavily updated / accessed table where I store serialized java objects. They are in the table for 2-3 hours (also are being updated during that period) and then removed. Size of table is around 300MB. I have spotted it is very, very often VACUUMed and wonder if changing the
fillfactor would help?Solution
The key words here are:
Point 1. is indication for a lower fill factor, while 2. is the opposite. It helps performance if multiple row versions are stored on the same data page. H.O.T. updates would achieve that. Read here or here. They need some wiggle room on the data page - like dead tuples or space reserved by a
If the tuple size is 4 kb or more, reducing the fill factor would be futile, since there can never be more than one tuple on a data page. You might as well leave it at
Whatever you do,
The standard form of
indexes and marks the space available for future reuse.
Bold emphasis mine.
You can play with per-table settings for autovacuum to trigger it less (or more) often for this table only:
The default thresholds and scale factors are taken from
Bold emphasis mine. In particular with
Alternatives
All this aside, since your data seem to be volatile to begin with: use an
Data written to unlogged tables is not written to the write-ahead log
(see Chapter 29), which makes them considerably faster than
ordinary tables. However, they are not crash-safe: an unlogged
table is automatically truncated after a crash or unclean shutdown.
The contents of an unlogged table are also not replicated to standby servers.
Bold emphasis mine. Don't use this if your server might crash and you still need the data afterwards. But if we are talking about session data for web applications, this might be an acceptable price to pay.
Or, even more radical: Use a key-value store like Redis if you can do without the features and security provided by an RDBMS altogether.
- "heavily updated"
- "in the table for 2-3 hours".
Point 1. is indication for a lower fill factor, while 2. is the opposite. It helps performance if multiple row versions are stored on the same data page. H.O.T. updates would achieve that. Read here or here. They need some wiggle room on the data page - like dead tuples or space reserved by a
fillfactor If the tuple size is 4 kb or more, reducing the fill factor would be futile, since there can never be more than one tuple on a data page. You might as well leave it at
100 (which is the default anyway). However, some data types are "toasted" and stored out-of-line if they exceed a size limit, so tuples requiring that much in the main relation fork are rare.Whatever you do,
VACUUM will be run often. And that's generally a good thing, I wouldn't worry about that. You create lots of dead tuples. VACUUM identifies dead rows that are not visible to any open transaction any more. The manual:The standard form of
VACUUM removes dead row versions in tables andindexes and marks the space available for future reuse.
Bold emphasis mine.
You can play with per-table settings for autovacuum to trigger it less (or more) often for this table only:
The default thresholds and scale factors are taken from
postgresql.conf, but it is possible to override them on a table-by-table basis;Bold emphasis mine. In particular with
autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor. Running VACUUM a lot might actually be a good idea, instead of a very low fillfacter. That depends on access patterns. If all tuples live, say, 3 hours and each is updated several times, I would still lower the fillfactor to something like 50. You'll have to test and find the sweet spot.Alternatives
All this aside, since your data seem to be volatile to begin with: use an
UNLOGGED table:Data written to unlogged tables is not written to the write-ahead log
(see Chapter 29), which makes them considerably faster than
ordinary tables. However, they are not crash-safe: an unlogged
table is automatically truncated after a crash or unclean shutdown.
The contents of an unlogged table are also not replicated to standby servers.
Bold emphasis mine. Don't use this if your server might crash and you still need the data afterwards. But if we are talking about session data for web applications, this might be an acceptable price to pay.
Or, even more radical: Use a key-value store like Redis if you can do without the features and security provided by an RDBMS altogether.
Context
StackExchange Database Administrators Q#36383, answer score: 18
Revisions (0)
No revisions yet.