gotchasqlMinor
Why does this limit make the postgres planner use a much slower index scan instead of a much faster bitmap heap/index scan?
Viewed 0 times
thiswhytheplannermuchpostgresscanlimitmakeslower
Problem
I have a table of about 700k rows in Postgres 9.3.3, which has the following structure:
The query that I am making has a full text search query and a limit, as follows:
When I search for a string which is in my index with a limit and order in the query it is fast:
However if the string is not in the index it takes much longer:
```
explain analyze select * from "articles" where article_text @@ plainto_tsquery('pg_catalog.simple', 'not_in_index') order by id limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..1293.88 rows=10 width=1298) (actual time=5633.684..5633.684 rows=0 loops=1)
-> Index Scan using articles_pkey on article
Columns:
content_body - text
publish_date - timestamp without time zone
published - boolean
Indexes:
"articles_pkey" PRIMARY KEY, btree (id)
"article_text_gin" gin (article_text)
"articles_publish_date_id_index" btree (publish_date DESC NULLS LAST, id DESC)The query that I am making has a full text search query and a limit, as follows:
When I search for a string which is in my index with a limit and order in the query it is fast:
explain analyze select * from "articles" where article_text @@ plainto_tsquery('pg_catalog.simple', 'in_index') order by id limit 10;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..1293.88 rows=10 width=1298) (actual time=2.073..9.837 rows=10 loops=1)
-> Index Scan using articles_pkey on articles (cost=0.42..462150.49 rows=3573 width=1298) (actual time=2.055..9.711 rows=10 loops=1)
Filter: (article_text @@ '''in_index'''::tsquery)
Rows Removed by Filter: 611
Total runtime: 9.952 msHowever if the string is not in the index it takes much longer:
```
explain analyze select * from "articles" where article_text @@ plainto_tsquery('pg_catalog.simple', 'not_in_index') order by id limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..1293.88 rows=10 width=1298) (actual time=5633.684..5633.684 rows=0 loops=1)
-> Index Scan using articles_pkey on article
Solution
The big difference between the first two queries is that int the first one, it could go along the index used by the primary key of the table (and used by the
Now the second one used the same logic, but not having found a single match (not to mention 10), it had to go through the whole index and throw away all rows (
The second pair, without ordering, chose a different plan, which in this case happened to be more effective for returning 0 rows :)
And the third pair, knowing it has to return lots of rows (it planned 3573 as opposed to 10), again went for a different plan, with a bitmap heap scan (not a bitmap index scan, as in the second pair). The time difference can be attributed mostly to this node:
Sort Method: external merge Disk: 12288kB
If you raised
ORDER BY clause), then filter out the rows that don't match the WHERE condition. You can see that it had to visit about 621 rows (the 10 that got returned and 611 which were filtered) to get ready.Now the second one used the same logic, but not having found a single match (not to mention 10), it had to go through the whole index and throw away all rows (
Rows Removed by Filter: 796146).The second pair, without ordering, chose a different plan, which in this case happened to be more effective for returning 0 rows :)
And the third pair, knowing it has to return lots of rows (it planned 3573 as opposed to 10), again went for a different plan, with a bitmap heap scan (not a bitmap index scan, as in the second pair). The time difference can be attributed mostly to this node:
Sort Method: external merge Disk: 12288kB
If you raised
work_mem to a higher value (say 100 MB), this difference would mostly go away, I guess.Context
StackExchange Database Administrators Q#60419, answer score: 5
Revisions (0)
No revisions yet.