patternsqlMinor
Slow Query with LIMIT
Viewed 0 times
withlimitqueryslow
Problem
We have this query:
It runs fine under 0.1ms. But when we add LIMIT:
It runs over 20,000ms. Some notes:
So we isolated each WHERE condition and found out about a certain behaviour -- we tweaked the LIMIT to correspond with the amount of unread messages:
It runs under 0.100ms. Any explanation for the difference in performance?
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESCIt runs fine under 0.1ms. But when we add LIMIT:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 5It runs over 20,000ms. Some notes:
- The messages table has over 300,000 rows, not that big.
- We have indexes on all 3 columns in the WHERE condition.
- The account_id and created_at conditions filter out 3,000 messages from the 300,000 rows. Out of the 3,000 messages in that account_id, there are only 3 unread messages.
So we isolated each WHERE condition and found out about a certain behaviour -- we tweaked the LIMIT to correspond with the amount of unread messages:
SELECT "messages".*
FROM "messages"
WHERE "messages"."account_id" = 1
AND "messages"."status" = 'unread'
AND (created_at >= '2014-04-24 01:00:00.000000')
ORDER BY id DESC
LIMIT 3It runs under 0.100ms. Any explanation for the difference in performance?
Solution
Adding
In particular, increasing the
Details:
Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your
For example:
Run
LIMIT to the query often makes Postgres choose a different query plan. If statistics or cost estimates stray too far from actual data distribution / actual costs, you may end up with a slower query, even though Postgres reckoned it would be faster. Irregular data distribution may cause the query planner to misjudge the selectivity of WHERE conditions.In particular, increasing the
STATISTICS target for status, account_id, created_at and id might help.Details:
- How can I avoid PostgreSQL sometimes choosing a bad query plan for one of two nearly identical queries?
Aside from that, a (partial) multi column index may help. Depends on the complete situation. Which of your
WHERE conditions are constant? How can they vary? Is some additional maintenance cost and disk space acceptable for a special index?For example:
CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
WHERE status = 'unread'
AND created_at >= '2014-01-01 01:00:00';Run
ANALYZE messages; before you try your queries again.Code Snippets
CREATE INDEX messages_special_idx ON messages (account_id, created_at, id)
WHERE status = 'unread'
AND created_at >= '2014-01-01 01:00:00';Context
StackExchange Database Administrators Q#80988, answer score: 2
Revisions (0)
No revisions yet.