patternsqlMinor
Postgres 11: Not using the correct indices after upgrade
Viewed 0 times
indicesafterthepostgrescorrectusingnotupgrade
Problem
We upgraded lately our database from PostgreSQL 9.6 to 11 on RDS (after the upgrade we did
We noticed that a frequent query changed its execution plan, and now it is not using the correct indices.
The query is doing lots of filters and has
The most helpful filters here are
In Postgres 9.6 the execution plan is:
The execution time is fast and we are using our compound index.
In Postgres 11:
```
Limit (cost=1000.49..30635.60 rows=3000 width=1149) (actual time=7.064..2100.604 rows=2479 loops=1)
-> Gather Merge (cost=1000.49..1138801.15 rows=115181 width=1149) (actual time=7.062..2101.843 rows=2479 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Parallel Index Scan Backward using posts_pkey on posts (cost=0.43..1124081.93 rows=28795 width=1149) (actual time=1.831..1679.456 rows=496 loops=5)
Filter: ((delete_at IS NULL) AND approved AND (tenant_id = 1582) AND (status = ANY ('{0,1,2,3,6,9}'::integer[])))
Rows Remo
vacuum full and analyze for the DB).We noticed that a frequent query changed its execution plan, and now it is not using the correct indices.
SELECT "posts".*
FROM "posts"
WHERE "posts"."delete_at" IS NULL
AND "posts"."tenant_id" = 1582
AND "posts"."approved" = TRUE
AND (posts.status in (0, 1, 2, 3, 6, 9))
ORDER BY id desc
LIMIT 3000 OFFSET 0The query is doing lots of filters and has
limit+offset for pagination.The most helpful filters here are
tenant_id and status, on which we have a compound index for it.In Postgres 9.6 the execution plan is:
Limit (cost=36057.09..36064.59 rows=3000 width=1082) (actual time=7.159..8.404 rows=2018 loops=1)
-> Sort (cost=36057.09..36130.91 rows=29529 width=1082) (actual time=7.159..7.664 rows=2018 loops=1)
Sort Key: id DESC
Sort Method: quicksort Memory: 1091kB
-> Index Scan using index_posts_on_tenant_id_and_status on posts (cost=0.56..34204.03 rows=29529 width=1082) (actual time=0.022..5.052 rows=2018 loops=1)
Index Cond: ((tenant_id = 1582) AND (status = ANY ('{0,1,2,3,6,9}'::integer[])))
Filter: ((delete_at IS NULL) AND approved)
Planning time: 0.289 ms
Execution time: 8.856 msThe execution time is fast and we are using our compound index.
In Postgres 11:
```
Limit (cost=1000.49..30635.60 rows=3000 width=1149) (actual time=7.064..2100.604 rows=2479 loops=1)
-> Gather Merge (cost=1000.49..1138801.15 rows=115181 width=1149) (actual time=7.062..2101.843 rows=2479 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Parallel Index Scan Backward using posts_pkey on posts (cost=0.43..1124081.93 rows=28795 width=1149) (actual time=1.831..1679.456 rows=496 loops=5)
Filter: ((delete_at IS NULL) AND approved AND (tenant_id = 1582) AND (status = ANY ('{0,1,2,3,6,9}'::integer[])))
Rows Remo
Solution
Your query is of this shape:
There are two approaches to process a query like that
-
Use indexes to speed up `
Then perhaps that plan will become less attractive. It may not work out, because the other plan could also be parallelized.
-
Maybe the best method: Try to fix the row count mis-estimate so that PostgreSQL knows what to expect.
Then see if the estimate is better.
If that does not do the trick, try extended statistics:
SELECT ...
FROM atable
WHERE
ORDER BY
LIMIT n;There are two approaches to process a query like that
-
Use indexes to speed up `
and sort the result.
-
Use an index on to get the results in sorted order and discard everything that does not satisfy until we reach the LIMIT.
Now it seems that the cost estimates for both paths are pretty close: 36000 vs. 30500, and the estimate in the latter case is so low only because you have 5 cores working on the query. You must have increased max_parallel_workers_per_gather, and from the "short" execution time of 2 seconds I even suspect that you artificially increased the degree of parallelism by setting the parallel_workers storage parameter on the table.
Moreover, PostgreSQL 9.6 doesn't have parallel index scan, so it isn't even tempted to choose the second path, because with one process the index scan would take longer.
Now that strategy does not work out: In reality, there are only 2479 matching rows rather than the estimated 28795, so the index scan cannot stop early, but has to scan the complete index, because the LIMIT is never reached. Index scans are slow, so the plan performs much worse than PostgreSQL thought.
In a way, the increased capabilities in v11 opened the door to yet another way to process the query that turned out to be a trap.
What can you do to put PostgreSQL on the right track:
-
The crude method: change the ORDER BY clause so that it doesn't match the indexed expression:
ORDER BY id + 0
Then the bad road is blocked. But it can also not be taken in cases where it would be the better road...
-
The better method: reduce the degree of parallelism, either by lowering max_parallel_workers_per_gather globally or by setting the parallel_workers` storage parameter on the table to a lower value.Then perhaps that plan will become less attractive. It may not work out, because the other plan could also be parallelized.
-
Maybe the best method: Try to fix the row count mis-estimate so that PostgreSQL knows what to expect.
ALTER INDEX posts
ALTER status SET STATISTICS 1000,
ALTER tenant_id SET STATISTICS 1000;
ANALYZE posts;Then see if the estimate is better.
If that does not do the trick, try extended statistics:
CREATE STATISTICS posts_ext_stats (dependencies)
ON tenant_id, status FROM posts;
ANALYZE posts;Code Snippets
SELECT ...
FROM atable
WHERE <condition>
ORDER BY <expression>
LIMIT n;ORDER BY id + 0ALTER INDEX posts
ALTER status SET STATISTICS 1000,
ALTER tenant_id SET STATISTICS 1000;
ANALYZE posts;CREATE STATISTICS posts_ext_stats (dependencies)
ON tenant_id, status FROM posts;
ANALYZE posts;Context
StackExchange Database Administrators Q#258984, answer score: 3
Revisions (0)
No revisions yet.