Why is this query with WHERE, ORDER BY and LIMIT so slow?
Problem
Given this table posts_lists:

                    Table "public.posts_lists"
   Column   |          Type          | Collation | Nullable | Default
------------+------------------------+-----------+----------+---------
 id         | character varying(20)  |           | not null |
 user_id    | character varying(20)  |           |          |
 tags       | jsonb                  |           |          |
 score      | integer                |           |          |
 created_at | integer                |           |          |
Indexes:
    "tmp_posts_lists_pkey1" PRIMARY KEY, btree (id)
    "tmp_posts_lists_idx_create_at1532588309" btree (created_at)
    "tmp_posts_lists_idx_score_desc1532588309" btree (score_rank(score, id::text) DESC)
    "tmp_posts_lists_idx_tags1532588309" gin (jsonb_array_lower(tags))
    "tmp_posts_lists_idx_user_id1532588309" btree (user_id)

Getting a list by tag is fast:

EXPLAIN ANALYSE
SELECT * FROM posts_lists
WHERE jsonb_array_lower(tags) ? lower('Qui');

Bitmap Heap Scan on posts_lists (cost=1397.50..33991.24 rows=10000 width=56) (actual time=0.110..0.132 rows=2 loops=1)
  Recheck Cond: (jsonb_array_lower(tags) ? 'qui'::text)
  Heap Blocks: exact=2
  -> Bitmap Index Scan on tmp_posts_lists_idx_tags1532588309 (cost=0.00..1395.00 rows=10000 width=0) (actual time=0.010..0.010 rows=2 loops=1)
       Index Cond: (jsonb_array_lower(tags) ? 'qui'::text)
Planning time: 0.297 ms
Execution time: 0.157 ms

Getting a list ordered by score, limit 100 - also fast:

EXPLAIN ANALYSE
SELECT *
FROM posts_lists
ORDER BY score_rank(score, id) DESC
LIMIT 100;

Limit (cost=0.56..12.03 rows=100 width=88) (actual time=0.074..0.559 rows=100 loops=1)
  -> Index Scan using tmp_posts_lists_idx_score_desc1532588309 on posts_lists (cost=0.56..1146999.15 rows=10000473 width=88) (actual time=0.072..0.535 rows=100 loops=1)
Planning time: 0.586 ms
Execution time: 0.714 ms

But combining the above two queries is very slow:
```
EXPLAIN ANALYSE
SEL
```
Solution
The problem with this statement is that the query planner has no usable statistics for jsonb_array_lower(tags).

As seen in the first explain:

(cost=0.00..1395.00 rows=10000 width=0) (actual time=0.010..0.010 rows=2 loops=1)

The planner expects 10,000 rows to be returned if it uses the filter jsonb_array_lower(tags) ? lower('Qui'), but only two rows are actually returned. The estimate is about 0.1% of the roughly 10 million rows the planner assumes for the table, which looks like a fixed default guess rather than a figure derived from the data.

The same happens in the last statement. Because of this missing information, the planner assumes that a scan on the index tmp_posts_lists_idx_score_desc1532588309 would be more efficient: with 10,000 expected matches it would presumably find 100 of them after reading only a small part of that index, whereas with just two real matches it has to read far more of it.

You could try to avoid the need for lower and normalize the input during INSERT and UPDATE.

Another method would be:

WITH c AS (
    SELECT id FROM posts_lists
    WHERE jsonb_array_lower(tags) ? lower('Qui')
)
SELECT l.* FROM posts_lists l, c
WHERE l.id = c.id
ORDER BY score_rank(l.score, l.id) DESC
LIMIT 100;

This statement uses a CTE as an optimization fence, but be careful: the performance may get worse for other WHERE conditions or table content.
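
On PostgreSQL 12 and later, a CTE like the one above is no longer automatically an optimization fence: a side-effect-free CTE that is referenced only once can be inlined into the outer query. A minimal sketch of the same statement with the fence requested explicitly, assuming PostgreSQL 12 or newer and the score_rank and jsonb_array_lower functions from the question; the l. qualifiers are used here so that id is unambiguous between the table and the CTE:

```sql
-- Assumes PostgreSQL 12+; MATERIALIZED forces the CTE to be evaluated on its own.
WITH c AS MATERIALIZED (
    SELECT id FROM posts_lists
    WHERE jsonb_array_lower(tags) ? lower('Qui')
)
SELECT l.*
FROM posts_lists l
JOIN c ON l.id = c.id
ORDER BY score_rank(l.score, l.id) DESC
LIMIT 100;
```

Comparing the EXPLAIN ANALYSE output with and without MATERIALIZED should show whether the fence is what makes the difference on a given server version.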
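
The normalization idea can be sketched with a trigger that lowercases the tags before they are stored. This is only an illustration: the function, trigger, and index names below are made up, and it assumes that tags is a JSON array of strings, so lowercasing its text representation is safe.

```sql
-- Hypothetical trigger function: store tags in lowercase so queries need no wrapper function.
CREATE OR REPLACE FUNCTION posts_lists_normalize_tags() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF NEW.tags IS NOT NULL THEN
        -- Assumption: tags is an array of strings; this lowercases every element at once.
        NEW.tags := lower(NEW.tags::text)::jsonb;
    END IF;
    RETURN NEW;
END;
$$;

CREATE TRIGGER posts_lists_normalize_tags_trg
BEFORE INSERT OR UPDATE OF tags ON posts_lists
FOR EACH ROW EXECUTE PROCEDURE posts_lists_normalize_tags();

-- One-time cleanup of rows that were stored before the trigger existed.
UPDATE posts_lists SET tags = lower(tags::text)::jsonb WHERE tags IS NOT NULL;

-- Hypothetical replacement index directly on the column instead of on the expression.
CREATE INDEX posts_lists_idx_tags_plain ON posts_lists USING gin (tags);

-- The filter then works on the column itself; only the search term is lowercased.
SELECT * FROM posts_lists WHERE tags ? lower('Qui');
```

Whether the row estimate actually improves should be checked with EXPLAIN ANALYSE, since the planner may still fall back to a default selectivity for the ? operator. On a table of this size the one-time UPDATE rewrites every row, so it is worth testing on a copy first.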
Context
StackExchange Database Administrators Q#213262, answer score: 3