snippetsqlMinor
ts_rank massively slows my query, how can I improve performance?
Viewed 0 times
canqueryimprovemassivelyslowsts_rankperformancehow
Problem
Here's the table definition
and the query
Here's the query plan for the slow query
```
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=57.90..57.92 rows=5 width=830) (actual time=452.003..452.028 rows=25 loops=1)
-> Sort (cost=57.90..57.92 rows=5 width=830) (actual time=452.001..452.010 rows=25 loops=1)
Sort Key: (ts_rank(to_tsvector('english'::regconfig, kb_article_contents.contents), plainto_tsquery('string'::text)))
Sort Method: top-N heapsort Memory: 17kB
-> Hash Join (cost=21.36..57.85 rows=5 width=830) (actual time=17.688..451.334 rows=299 loops=1)
Hash Cond: (kb_articles.id = kb_article_contents.article_id)
-> Seq Scan on kb_articles (cost=0.00..32.06 rows=1156 width=4) (actual time=0.008..1.059 rows=1156 loops=1)
Filter: published
-> Hash (cost=21.30..21.30 rows=5 width=828) (actual time=1.175..1.175 row
Table "public.kb_article_contents"
Column | Type | Modifiers
------------+----------+-----------
article_id | smallint | not null
contents | text | not null
keywords | text | not null
Indexes:
"contents_idx" gin (to_tsvector('english'::regconfig, contents))
Foreign-key constraints:
"kb_article_contents_article_id_fkey" FOREIGN KEY (article_id) REFERENCES kb_articles(id)and the query
support=> EXPLAIN ANALYSE
SELECT
id,
-- if we remove the next line runtimes speeds up (and of course the order by)
ts_rank(to_tsvector( 'english', contents ), plainto_tsquery('string')) AS rank
FROM
kb_article_contents
INNER JOIN kb_articles
ON ( kb_article_contents.article_id = kb_articles.id )
WHERE
published = 'true'
AND
to_tsvector( 'english', contents ) @@ plainto_tsquery('string')
ORDER BY rank DESC
LIMIT 25
;Here's the query plan for the slow query
```
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=57.90..57.92 rows=5 width=830) (actual time=452.003..452.028 rows=25 loops=1)
-> Sort (cost=57.90..57.92 rows=5 width=830) (actual time=452.001..452.010 rows=25 loops=1)
Sort Key: (ts_rank(to_tsvector('english'::regconfig, kb_article_contents.contents), plainto_tsquery('string'::text)))
Sort Method: top-N heapsort Memory: 17kB
-> Hash Join (cost=21.36..57.85 rows=5 width=830) (actual time=17.688..451.334 rows=299 loops=1)
Hash Cond: (kb_articles.id = kb_article_contents.article_id)
-> Seq Scan on kb_articles (cost=0.00..32.06 rows=1156 width=4) (actual time=0.008..1.059 rows=1156 loops=1)
Filter: published
-> Hash (cost=21.30..21.30 rows=5 width=828) (actual time=1.175..1.175 row
Solution
ts_rank will rank all rows, most of which will probably have a ranking of 0. So it's faster to do a search without ts_rank and then ts_rank the results of that search using the same query. So you'd probably want to do something likeSELECT score.id, ts_rank(score.contents::tsvector, plainto_tsquery('string')) as rank from
(SELECT
id, contents
FROM
kb_article_contents
INNER JOIN kb_articles
ON ( kb_article_contents.article_id = kb_articles.id )
WHERE
published = 'true'
AND
contents::tsvector @@ plainto_tsquery('string'))
as score
order by rank desc;Code Snippets
SELECT score.id, ts_rank(score.contents::tsvector, plainto_tsquery('string')) as rank from
(SELECT
id, contents
FROM
kb_article_contents
INNER JOIN kb_articles
ON ( kb_article_contents.article_id = kb_articles.id )
WHERE
published = 'true'
AND
contents::tsvector @@ plainto_tsquery('string'))
as score
order by rank desc;Context
StackExchange Database Administrators Q#4054, answer score: 3
Revisions (0)
No revisions yet.