patternsqlMajor
Poor performance on query with LIMIT when I add an ORDER BY?
Viewed 0 times
orderwithlimitquerypoorperformancewhenadd
Problem
Given the table:
Column
Type
Modifiers
Storage
id
bigint
not null default nextval('items_id_seq'::regclass)
plain
data
text
not null
extended
object_id
bigint
not null
plain
and indexes:
When I execute:
It returns 0 rows very quickly. However, when I execute this query with
What explains this discrepancy?
Query Plans
Fast Query (without
Slow query (with the
Column
Type
Modifiers
Storage
id
bigint
not null default nextval('items_id_seq'::regclass)
plain
data
text
not null
extended
object_id
bigint
not null
plain
and indexes:
- "items_pkey" PRIMARY KEY, btree (id)
- "items_object_id_idx" btree (object_id)
When I execute:
SELECT *
FROM items
WHERE object_id = 123
LIMIT 1;It returns 0 rows very quickly. However, when I execute this query with
ORDER BY, it hangs for a very long time:SELECT *
FROM items
WHERE object_id = 123
ORDER BY id DESC -- I added the ORDER BY
LIMIT 1;What explains this discrepancy?
Query Plans
Fast Query (without
ORDER BY)QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..3.34 rows=1 width=63) (actual time=0.014..0.014 rows=0 loops=1)
-> Index Scan using items_object_id_operation_idx on items (cost=0.56..2579.16 rows=929 width=63) (actual time=0.013..0.013 rows=0 loops=1)
Index Cond: (object_id = 123::bigint)
Total runtime: 0.029 msSlow query (with the
ORDER BY)QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.44..1269.14 rows=1 width=63) (actual time=873796.061..873796.061 rows=0 loops=1)
-> Index Scan Backward using items_pkey on items (cost=0.44..1164670.11 rows=918 width=63) (actual time=873796.059..873796.059 rows=0 loops=1)
Filter: (object_id = 123::bigint)
Rows Removed by Filter: 27942522
Total runtime: 873796.113 msSolution
Trying to explain why there is a difference in performance between the two queries.
This one:
The alternative:
The alternative chosen by the optimizer, presumably based on the
The presence of an index on
This one:
SELECT * FROM "items" WHERE "object_id" = '123' LIMIT 1 is satisfied by any one row with the matching object_id, so the index on object_id is a natural choice. The query requires minimal I/O: an index scan to find the first matching value, plus one heap read to fetch the entire row.The alternative:
SELECT * FROM "items" WHERE "object_id" = '123' ORDER BY "id" DESC LIMIT 1 requires all rows with the matching object_id to be sorted by another column, id, then the row with the maximum value of id to be returned. If you were to use the index on object_id, you would need to perform the following operations: scan the index to find every matching object_id; for every match go fetch the actual row; then sort all fetched rows by id and return the one with the largest id.The alternative chosen by the optimizer, presumably based on the
object_id histogram, is: scan the index on id backwards, in its entirety; for each value go fetch the row and check if the value of object_id matches; return the first matching row, which will have the maximum possible id value. This alternative avoids sorting the rows, so I guess the optimizer prefers it to using the index on object_id.The presence of an index on
(object_id asc, id desc) allows for yet another alternative: scan this new index for the first entry matching the supplied object_id value, which by definition will have the highest id value; go fetch one matching row and return. Obviously, this is the most efficient approach.Context
StackExchange Database Administrators Q#110636, answer score: 22
Revisions (0)
No revisions yet.