patternsqlMinor
Postgres 12.1 uses Index Only Scan Backward instead of index when LIMIT is present
Viewed 0 times
presentscanpostgreslimitbackwardusesinsteadwhenindexonly
Problem
I have a medium sized table (~4M rows) of "functionCalls" which consists of 2 columns,
I want to find all rows that match a certain function, which is why I added the
This takes forever (currently ~ 20s) because pg refuses to use the index, and decides to do a
```
Limit (cost=0.71..2178.97 rows=25 width=4) (actual time=12903.294..19142.568 rows=8 loops=1)
Output: c.input
Buffers: shared hit=59914 read=26193 written=54
-> Nested Loop (cost=0.71..135662.48 rows=1557 width=4) (actual time=12903.292..19142.561 rows=8 loops=1)
Output: c.input
Inner Unique: true
Join Filter: (c.function = function.id)
Rows Removed by Join Filter: 3649900
Buffers: shared hit=59914 read=26193 written=54
-> Index Only Scan Backward using "functionCall_pkey" on public."functionCall" c (cost=0.43..80906.80 rows=3650225 width=8) (actual time=0.040..17083.489 rows=3649908 loops=1)
Output: c.input, c.function
Heap Fetches: 3649909
Buffers: shared hit=59911 read=26193 written=54
-> Materialize (cost=0.28..2.30 rows=1 width=4) (actual time=0.000..0.000 rows
input and function (both ids for another table):Column | Type | Collation | Nullable | Default
----------+---------+-----------+----------+---------
input | integer | | not null |
function | integer | | not null |
Indexes:
"functionCall_pkey" PRIMARY KEY, btree (input, function) CLUSTER
"functionCallSearch" btree (function, input)
Foreign-key constraints:
"fkey1" FOREIGN KEY (function) REFERENCES function(id) ON UPDATE CASCADE ON DELETE CASCADE
"fkey2" FOREIGN KEY (input) REFERENCES input(id)I want to find all rows that match a certain function, which is why I added the
functionCallSearch index. Here is my query:SELECT c.input FROM "functionCall" c
INNER JOIN "function" ON (function.id = c.function)
WHERE function.text LIKE 'getmyinode'
ORDER BY c.input DESC
LIMIT 25 OFFSET 0;
This takes forever (currently ~ 20s) because pg refuses to use the index, and decides to do a
Index Only Scan Backward on the primary key instead:```
Limit (cost=0.71..2178.97 rows=25 width=4) (actual time=12903.294..19142.568 rows=8 loops=1)
Output: c.input
Buffers: shared hit=59914 read=26193 written=54
-> Nested Loop (cost=0.71..135662.48 rows=1557 width=4) (actual time=12903.292..19142.561 rows=8 loops=1)
Output: c.input
Inner Unique: true
Join Filter: (c.function = function.id)
Rows Removed by Join Filter: 3649900
Buffers: shared hit=59914 read=26193 written=54
-> Index Only Scan Backward using "functionCall_pkey" on public."functionCall" c (cost=0.43..80906.80 rows=3650225 width=8) (actual time=0.040..17083.489 rows=3649908 loops=1)
Output: c.input, c.function
Heap Fetches: 3649909
Buffers: shared hit=59911 read=26193 written=54
-> Materialize (cost=0.28..2.30 rows=1 width=4) (actual time=0.000..0.000 rows
Solution
PostgreSQL estimates that there will be 1557 rows that satisfy the condition, so it thinks that it will be faster if it avoids an explicit sort and rather scans the rows in
Unfortunately, that doesn't work out at all, and PostgreSQL has to scan the whole table that way, since there are only 8 matches total.
The problem seems to be that the estimates are quite off: it thinks that the index scan on
As a first measure, try to calculate new distribution statistics on the table:
If that alone does not improve the estimate, increase the granularity:
That should improve the estimate. You can try with higher values than 1000 too.
If all fails, tell PostgreSQL explicitly not to use that strategy by using
ORDER BY order and does nested loop joins until it has found enough rows.Unfortunately, that doesn't work out at all, and PostgreSQL has to scan the whole table that way, since there are only 8 matches total.
The problem seems to be that the estimates are quite off: it thinks that the index scan on
public."functionCall" for function will produce 5897 rows, when really there are only 8.As a first measure, try to calculate new distribution statistics on the table:
ANALYZE public."functionCall";If that alone does not improve the estimate, increase the granularity:
ALTER TABLE public."functionCall" ALTER function SET STATISTICS 1000;
ANALYZE public."functionCall";That should improve the estimate. You can try with higher values than 1000 too.
If all fails, tell PostgreSQL explicitly not to use that strategy by using
ORDER BY c.input + 0 DESCCode Snippets
ANALYZE public."functionCall";ALTER TABLE public."functionCall" ALTER function SET STATISTICS 1000;
ANALYZE public."functionCall";ORDER BY c.input + 0 DESCContext
StackExchange Database Administrators Q#255192, answer score: 6
Revisions (0)
No revisions yet.