patternsqlMinor
Speed up Postgres Query that has index but still slow
Viewed 0 times
postgresstillquerybutslowhasthatindexspeed
Problem
Have roughly 50 million rows in a table called 'messages'. The problem is even with indexes particular queries run very slow. My background is more Mysql so I'm more of a newb on Postgres (9.5). Any suggestion or help to solve an issue with slow queries on this table would be greatly appreciated.
The table is structured as so:
```
mydatabase=# \d+ messages
id | integer | not null default nextval('messages_id_seq'::regclass) | plain | |
conversation_id | integer | | plain | |
user_id | integer | | plain | |
content | text | | extended | |
attached_photos | text | | extended | |
attached_video | text | | extended | |
status | character varying(255) | | extended | |
created_at | timestamp without time zone | not null | plain | |
updated_at | timestamp without time zone | not null | plain | |
attached_audio | text | | extended | |
sticker_url | character varying(255) | | extended | |
link_preview | text | | extended | |
flash_duration | integer |
The table is structured as so:
```
mydatabase=# \d+ messages
id | integer | not null default nextval('messages_id_seq'::regclass) | plain | |
conversation_id | integer | | plain | |
user_id | integer | | plain | |
content | text | | extended | |
attached_photos | text | | extended | |
attached_video | text | | extended | |
status | character varying(255) | | extended | |
created_at | timestamp without time zone | not null | plain | |
updated_at | timestamp without time zone | not null | plain | |
attached_audio | text | | extended | |
sticker_url | character varying(255) | | extended | |
link_preview | text | | extended | |
flash_duration | integer |
Solution
Emulating your setup
I think your index is badly maintained. That's about all I can come up with. I tried and emulated everything with 50 M rows.
From there, the only modification I made was to change your placeholder to 'a'.. IE,
to this,
You can see the query plan here, the very same.
My assumption is that you either
For reference, my whole index is
Solution
Try
Alternative plan
As I suggested in the comments too rewriting to an IN statement provides a slightly faster plan. If that saves you here, I think it's more luck though. I think you've got other problems.
```
QUERY PLAN
I think your index is badly maintained. That's about all I can come up with. I tried and emulated everything with 50 M rows.
CREATE TABLE messages
AS
SELECT
x::int AS conversation_id,
((x::int%26)+97)::"char" AS status
FROM generate_series(1,50e6) AS gs(x);
CREATE INDEX ON messages (conversation_id , status);
ANALYZE messages;From there, the only modification I made was to change your placeholder to 'a'.. IE,
AND "messages"."status" = $1;to this,
AND "messages"."status" = 'a';You can see the query plan here, the very same.
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=14.05..1734.48 rows=955000 width=4) (actual time=0.755..6.384 rows=31 loops=1)
-> HashAggregate (cost=13.49..15.49 rows=200 width=4) (actual time=0.689..1.007 rows=899 loops=1)
Group Key: "*VALUES*".column1
-> Values Scan on "*VALUES*" (cost=0.00..11.24 rows=899 width=4) (actual time=0.002..0.282 rows=899 loops=1)
-> Index Only Scan using messages_conversation_id_status_idx on messages (cost=0.56..8.58 rows=1 width=4) (actual time=0.006..0.006 rows=0 loops=899)
Index Cond: ((conversation_id = "*VALUES*".column1) AND (status = 'a'::"char"))
Heap Fetches: 31
Planning time: 1.197 ms
Execution time: 6.463 ms
(9 rows)My assumption is that you either
- have lots of dead rows, or a badly maintained index.
- have a taxed out CPU, or taxed out ram.
For reference, my whole index is
1071 MB which fits into memory. If you can't fit it into memory you may be slightly slower but even if you have to page the index you can not account for that kind of slow down unless you have major IO bottlenecks -- it's only 1 GB.Solution
Try
VACUUM FULL messages.Alternative plan
As I suggested in the comments too rewriting to an IN statement provides a slightly faster plan. If that saves you here, I think it's more luck though. I think you've got other problems.
```
QUERY PLAN
Code Snippets
CREATE TABLE messages
AS
SELECT
x::int AS conversation_id,
((x::int%26)+97)::"char" AS status
FROM generate_series(1,50e6) AS gs(x);
CREATE INDEX ON messages (conversation_id , status);
ANALYZE messages;AND "messages"."status" = $1;AND "messages"."status" = 'a';QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=14.05..1734.48 rows=955000 width=4) (actual time=0.755..6.384 rows=31 loops=1)
-> HashAggregate (cost=13.49..15.49 rows=200 width=4) (actual time=0.689..1.007 rows=899 loops=1)
Group Key: "*VALUES*".column1
-> Values Scan on "*VALUES*" (cost=0.00..11.24 rows=899 width=4) (actual time=0.002..0.282 rows=899 loops=1)
-> Index Only Scan using messages_conversation_id_status_idx on messages (cost=0.56..8.58 rows=1 width=4) (actual time=0.006..0.006 rows=0 loops=899)
Index Cond: ((conversation_id = "*VALUES*".column1) AND (status = 'a'::"char"))
Heap Fetches: 31
Planning time: 1.197 ms
Execution time: 6.463 ms
(9 rows)QUERY PLAN Context
StackExchange Database Administrators Q#177225, answer score: 5
Revisions (0)
No revisions yet.