patternsqlMinor

Optimizing ORDER BY in a full text search query

Submitted by: @import:stackexchange-dba·Mar 10, 2026·

Viewed 0 times

fullordersearchtextqueryoptimizing

Problem

I have a big table entities with ~ 15M records. I want to find top 5 rows matching 'hockey' in their name.

I have a full text index on name, which is used: gin_ix_entity_full_text_search_name

Query:

SELECT "entities".*,
         ts_rank(to_tsvector('english', "entities"."name"::text),
         to_tsquery('english', 'hockey'::text)) AS "rank0.48661998202865475"
    FROM "entities" 
         WHERE "entities"."place" = 'f' 
              AND (to_tsvector('english', "entities"."name"::text) @@ to_tsquery('english', 'hockey'::text)) 
         ORDER BY "rank0.48661998202865475" DESC LIMIT 5

Duration 25,623 ms

Explain plan
1 Limit (cost=12666.89..12666.89 rows=5 width=3116)
2 -> Sort (cost=12666.89..12670.18 rows=6571 width=3116)
3 Sort Key: (ts_rank(to_tsvector('english'::regconfig, (name)::text), '''hockey'''::tsquery))
4 -> Bitmap Heap Scan on entities (cost=124.06..12645.06 rows=6571 width=3116)
5 Recheck Cond: (to_tsvector('english'::regconfig, (name)::text) @@ '''hockey'''::tsquery)
6 Filter: (NOT place)
7 -> Bitmap Index Scan on gin_ix_entity_full_text_search_name (cost=0.00..123.74 rows=6625 width=0)
8 Index Cond: (to_tsvector('english'::regconfig, (name)::text) @@ '''hockey'''::tsquery)

I do not understand why it verifies the index condition twice. (Query plan step 4 and 7). Is it because of my boolean condition (not place)? If so, should I add it to my index to get a very fast query? Or does the sorting condition make it slow?

EXPLAIN ANALYZE output:

Limit (cost=4447.28..4447.29 rows=5 width=3116) (actual time=18509.274..18509.282 rows=5 loops=1)
-> Sort (cost=4447.28..4448.41 rows=2248 width=3116) (actual time=18509.271..18509.273 rows=5 loops=1)
Sort Key: (ts_rank(to_tsvector('english'::regconfig, (name)::text), '''test'''::tsquery))
Sort Method: top-N heapsort Memory: 19kB
-> Bitmap Heap Scan on entities (co

Solution

What I still don't understand, is why this is slower.

That sorting the rows will cost something is obvious. But why so much?

Without ORDER BY rank0... Postgres can just

pick the first 5 rows it finds and stop fetching rows as soon as it has 5.

Bitmap Heap Scan on entities ... rows=5 ...

then compute ts_rank() for just 5 rows.

In the second case, Postgres has to

fetch all (1495 according to your query plan) rows that qualify.

Bitmap Heap Scan on entities ... rows=1495 ...

compute ts_rank() for all of them.

sort all of them to find the first 5 according to the calculated value.

Try ORDER BY name just to see the cost of computing to_tsquery('english', 'hockey'::text)) for the superfluous rows and how much remains for fetching more rows and sorting.

Context

StackExchange Database Administrators Q#16437, answer score: 8

Revisions (0)

No revisions yet.