patternsqlMinor
Very slow simple PostgreSQL query on RDS
Viewed 0 times
postgresqlsimplequeryslowveryrds
Problem
I seem to be getting very slow queries on a medium sized RDS box (db.m3.medium, 3.7gb ram).
This is across a table of 4,152,928 rows..
Total runtime: 45031 ms.
Locally, I have around 1.1million rows, and the same query takes about 450ms..
Here's the query plan, from explain:
Here's the response from explain analyze:
Total runtime: 42419.772 ms
For reference, 141559 rows are part of the sum().
The current indexes I have are:
Edit: Seems to be resolved by upping
This is across a table of 4,152,928 rows..
select sum(some_field) c
from pages
where pages.some_id=123
and pages.first_action_at > '2014-01-01 00:00:00 +1000'Total runtime: 45031 ms.
Locally, I have around 1.1million rows, and the same query takes about 450ms..
Here's the query plan, from explain:
Aggregate (cost=475640.59..475640.60 rows=1 width=4)
-> Seq Scan on pages (cost=0.00..475266.07 rows=149809 width=4)
Filter: ((first_action_at > '2014-01-01 00:00:00'::timestamp without time zone)
AND (some_id = 447))Here's the response from explain analyze:
Aggregate (cost=475641.74..475641.76 rows=1 width=4) (actual time=42419.717..42419.718 rows=1 loops=1)
-> Seq Scan on pages (cost=0.00..475267.22 rows=149810 width=4) (actual time=0.013..42265.908 rows=141559 loops=1)
Filter: ((first_action_at > '2014-01-01 00:00:00'::timestamp without time zone) AND (some_id = 447))
Rows Removed by Filter: 4011369Total runtime: 42419.772 ms
For reference, 141559 rows are part of the sum().
The current indexes I have are:
:some_id
:some_id, :first_action_atwork_mem was previously set to 1 mb (RDS default). I've just changed this to 18 mb.Edit: Seems to be resolved by upping
work_mem as well as added the second index above, speed is now around 800 ms.Solution
Matching index
After re-reading your question I realized you are not running Amazon Redshift, but Amazon RDS, which seems to be running unsullied Postgres, at least according to the documentation:
Amazon RDS supports DB instances running several versions of
PostgreSQL. Currently we support PostgreSQL versions 9.3.1, 9.3.2, and 9.3.3.
This would mean you have index-only scans at your disposal. If you meet some preconditions (basically if
Note that
Multicolumn index and performance
If you don't see "index-only scan" in
(Like you have now, according to your question update.)
Either way, another index on just
Is a composite index also good for queries on the first field?
Server configuration
All the usual advice for slow queries and proper server configuration applies and a
memory to be used by internal sort operations and hash tables.
Neither applies here.
After re-reading your question I realized you are not running Amazon Redshift, but Amazon RDS, which seems to be running unsullied Postgres, at least according to the documentation:
Amazon RDS supports DB instances running several versions of
PostgreSQL. Currently we support PostgreSQL versions 9.3.1, 9.3.2, and 9.3.3.
This would mean you have index-only scans at your disposal. If you meet some preconditions (basically if
vacuum can keep up with write operations) and if some_field is not updated to often and reasonably small (which seems to be the case for a numeric column), the perfect index would include some_field in last position (like @zerkms first mentioned):CREATE INDEX ON pages(some_id, first_action_at, some_field);Note that
some_id should come before first_action_at, because it is typically more efficient to have columns with equality checks first and ranges later. Details:Multicolumn index and performance
If you don't see "index-only scan" in
EXPLAIN ANALYZE, the last column is just ballast and better left away:CREATE INDEX ON pages(some_id, first_action_at);(Like you have now, according to your question update.)
Either way, another index on just
(some_id) only offers very little over this multicolumn index:Is a composite index also good for queries on the first field?
Server configuration
All the usual advice for slow queries and proper server configuration applies and a
work_mem setting of 1 MB is much too low for a DB with millions of rows. But this particular setting should not be crucial for this particular query, since work_mem is (per documentation):memory to be used by internal sort operations and hash tables.
Neither applies here.
Code Snippets
CREATE INDEX ON pages(some_id, first_action_at, some_field);CREATE INDEX ON pages(some_id, first_action_at);Context
StackExchange Database Administrators Q#69835, answer score: 8
Revisions (0)
No revisions yet.