Did adding JSONB indexes bloat the database?

Problem

We are running:

user@host:~$ psql -d database -c "SELECT version();"                                                                   
version
---------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 10.7 (Ubuntu 10.7-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
(1 row)

on:

user@host:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.6 LTS
Release:    16.04
Codename:   xenial

Solution

I don't think the archeological approach will be very useful here. There is just too much missing info and confounding variables. For example, people usually don't just add indexes for no reason. If a change in work load motivated the index creation, it could be the change in workload, independent of the index, which is causing the bloat.

There are a lot of theories that explain what you see, but really no way to distinguish between them based on the history given. Each index gives vacuum more work to do, so your new indexes might have just pushed it over the tipping point just because it was close already, without regard to what the content of the indexes are. Or maybe a lot of work piled up while the table was locked for index creation, and then the frenzy of activity once the lock was released pushed it over the edge. It is isn't just more indexes that create more work for vacuum--bloat does as well. This can lead to a vicious cycle where more bloat slows down the vacuum, leading to more bloat yet. That could be the reason environment 1 stabilized after the VACUUM FULL, it broke the vicious cycle to the point regular vacuums can now keep up. If environment 2 is less powerful, the tipping point might be lower and the VACUUM FULL there might leave it still beyond that point.

autovacuum_analyze_scale_factor     | 0.002                 | f
 autovacuum_analyze_threshold        | 10                    | f
 autovacuum_vacuum_scale_factor      | 0.001                 | f
 autovacuum_vacuum_threshold         | 25                    | f

These settings seem rather ridiculous at first glance. Is there a rationale for them? It could be spending so much time vacuuming and analyzing tables that don't really need them, that it can't keep up on the table that does need vacuuming (but if you have only one large table, that might not be much of a concern). Lowering the scale factors can make sense, but generally only in conjunction with an increase of the thresholds.

I routinely set "vacuum_cost_page_hit" to zero "vacuum_cost_page_miss" to zero. In my experience, the concurrent performance problems caused by autovac are generally caused by the writing, not the reading, so there is no point in throttling on the read side. This could be especially important when you have tables and indexes which are already bloated, as then you can have much more reading than writing going on.

The output from setting log_autovacuum_min_duration=0 could help distinguish between the various theories. Also, using pg_freespacemap to see how much sum(avail) PostgreSQL thinks the table has, while it is in a state of bloat, can be informative.

Code Snippets

autovacuum_analyze_scale_factor     | 0.002                 | f
 autovacuum_analyze_threshold        | 10                    | f
 autovacuum_vacuum_scale_factor      | 0.001                 | f
 autovacuum_vacuum_threshold         | 25                    | f

Context

StackExchange Database Administrators Q#237846, answer score: 3

Revisions (0)

No revisions yet.