What is the most efficient way to count the number of rows in a table?
Problem
I am using Postgres with the following query:
select count(*) from image;
The primary key on this table is non-incrementing; it's a unique serial number for the images stored in the table. Our app often attempts to ingest images that have already been recorded in the database, so the primary key/serial number ensures they are only recorded once.
Now we are wondering if we should have gone with an incrementing primary key instead. We have 1,259,369 images in the database and it takes about 7 minutes for the count query to run.
Our app will never delete images from this table - so an incrementing primary key would allow us to check the value of the last ID which would equal the number of rows in the table.
Solution
Generally, if you don't need an exact count, there is a much faster way:
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE oid = 'image'::regclass;
- Fast way to discover the row count of a table
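The estimate is only as fresh as the table's statistics: `reltuples` is updated by VACUUM, ANALYZE and a few DDL commands such as CREATE INDEX. A minimal sketch for refreshing it and comparing it to the exact count, assuming the table is called `image` as in the question and that an occasional slow exact count is acceptable for the comparison:

```sql
-- Refresh planner statistics for the table (this also updates pg_class.reltuples).
ANALYZE image;

-- Show the refreshed estimate next to the exact count to see how far apart they are.
-- The exact count is the slow part; run this once for comparison, not routinely.
SELECT c.reltuples::bigint          AS estimate,
       (SELECT count(*) FROM image) AS exact
FROM pg_class AS c
WHERE c.oid = 'image'::regclass;
```

If the two numbers stay close after normal ingest activity, the estimate is probably good enough for dashboards and monitoring.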
As a matter of fact, in a DB with concurrent write access every count is an estimate, because the number may be outdated the instant you get it.
But, like @a_horse commented, there is something off in your DB. Counting a million rows should not take more than a few seconds in the worst case.
The fact that your app will never delete images from this table makes this even more suspicious, because there shouldn't be many dead rows then. (Or are you updating a lot?) A huge number of dead tuples could slow you down and call for VACUUM. Normally, autovacuum takes care of this. Did you enable it? (It's the default in modern Postgres.)
- Are regular VACUUM ANALYZE still recommended under 9.1?
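To follow up on the autovacuum question, here is a minimal sketch. It assumes the table is called `image` as in the question and that you connect with a role allowed to vacuum it:

```sql
-- Is autovacuum switched on for this cluster? The default setting is "on".
SHOW autovacuum;

-- Reclaim dead row versions and refresh planner statistics in one pass.
-- VERBOSE prints a per-table report, including removed dead row versions.
VACUUM (VERBOSE, ANALYZE) image;
```

A plain VACUUM makes the space of dead rows reusable but usually does not shrink the table file. VACUUM FULL rewrites the table and does shrink it, at the cost of an exclusive lock while it runs.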
Check for dead tuples:
- Measure the size of a PostgreSQL table row
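One way to run this check is through the statistics views and size functions. This is a sketch, assuming the table name `image` from the question and that no other schema contains a table with the same name:

```sql
-- Live vs. dead row estimates and the last (auto)vacuum times for the table.
-- n_live_tup and n_dead_tup are estimates kept by the statistics system.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'image';

-- On-disk size of the table alone and including indexes and TOAST data.
SELECT pg_size_pretty(pg_relation_size('image'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('image')) AS total_size;
```

A dead-tuple figure that rivals or exceeds the live one, or a heap far larger than about 1.26 million rows should need, would point to bloat as the reason the count is slow.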
All the usual advice for performance optimization applies.
Context
StackExchange Database Administrators Q#95449, answer score: 8