Postgres select on a very large table
Problem
We've a very large table with more than 2.2 billion rows at present on Postgres 12.5. The total size of the table (including index) stands at 500 GB. There is one query that we need to do in order to find a set of valid rows from the data set and do updates on them. The query looks something like this:
select id, col4 from table where col1=$1 and col2=$2 and col3='f' and col4>0 order by col5 limit 10

To serve this query, there is an index on the table ON (col1, col2, col5), and the query uses this index. So far so good. The problem arises when the database needs to do a lot of disk seeks on a buffer miss, which leaves queries waiting on DataFileRead.

Up until now we were using a 16 vCPU and 128 GB machine with io1 storage and 20000 provisioned IOPS (it's hosted on AWS RDS). We started with about 3000 provisioned IOPS and kept increasing it, hoping that with an aggressive autovacuum and data localization it would stabilize at some value. Autovacuum is configured to run on this table every couple of days. We recently hit an issue where read IOPS reached 20000 and the application became too slow, so we upgraded to a machine exactly double the size, since we could not provision more than 20000 IOPS on the prior machine.

On the larger machine, read IOPS has fallen to ~5000, overall IOPS peaks around 6000, and query time has halved precisely. We assume this is due to the larger shared_buffers now available for Postgres to keep the hot referenced rows in cache.

The problem is that the new machine runs at ~5% CPU load with 184 GB of RAM still unused; all in all, it is heavily underutilized. We would like to go back to the smaller machine, changing whatever parameters are needed, so that this query runs within some tolerable latency limit.
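A quick way to confirm that the slow queries really are stalled on disk reads is to sample wait events from pg_stat_activity. This is a generic diagnostic sketch, not part of the original question:

```sql
-- Sample what active backends are currently waiting on.
-- Sessions showing wait_event = 'DataFileRead' are blocked on reads
-- from disk, i.e. buffer-cache misses being served by storage.
SELECT pid, wait_event_type, wait_event, state, left(query, 60) AS query
  FROM pg_stat_activity
 WHERE state = 'active'
   AND wait_event IS NOT NULL;
```

If most active sessions report DataFileRead during peaks, the workload is IOPS-bound rather than CPU-bound, which matches the symptoms described above.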
Solution
You are currently removing 855 rows using a filter on col3 and col4 in order to find 10 rows which pass that filter. So, as I feared, the rows that fail the filter may be rarer than the ones that pass, but they are sitting right in the way. And the next time you need 10 more rows, they will still be in the way. And the next time. Not only are you doing ~85 times more work than you need to on every execution, you are hitting ~85 times more pages. If the same thing happens for every other combination of col1 and col2, it's no wonder you keep running out of cache space and IOPS. And there is no reason for it to stop there: you could have far more than 850 rows accumulate in the way if nothing ever gets rid of them.
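Figures like "855 rows removed to find 10" come from running the query under EXPLAIN. A sketch, using the answer's table alias `t` and placeholder parameter values:

```sql
-- Illustrative: the "Rows Removed by Filter" line in the output shows
-- how many rows the col3/col4 filter discards per execution, and the
-- BUFFERS counters show how many pages each execution touches.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, col4
  FROM t
 WHERE col1 = 1 AND col2 = 2 AND col3 = 'f' AND col4 > 0
 ORDER BY col5
 LIMIT 10;
```

Re-running this after adding the partial index should show "Rows Removed by Filter" drop to zero and far fewer shared buffers read.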
You could use a partial index to avoid visiting those rows each time:

create index on t (col1, col2, col5) where col3='f' and col4>0;

Alternatively, each time col3 turns true or col4 drops to 0, you could just delete the row, and (possibly) insert it into some history table if you need to keep some record of it.
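The delete-and-archive option can be done in a single statement. This is a sketch: `t_history` is a hypothetical archive table assumed to have the same columns as `t`:

```sql
-- Move rows that no longer match the query's filter into a history
-- table, so they stop sitting "in the way" of the index scan.
WITH moved AS (
    DELETE FROM t
     WHERE col3 <> 'f' OR col4 <= 0
     RETURNING *
)
INSERT INTO t_history
SELECT * FROM moved;
```

Doing this as one CTE keeps the delete and the archive insert atomic, though on a 2.2-billion-row table the first cleanup pass would need to be batched.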
Code Snippets
create index on t (col1, col2, col5) where col3='f' and col4>0;

Context
StackExchange Database Administrators Q#298963, answer score: 4
Revisions (0)
No revisions yet.