patternsqlMinor
PostgreSQL GIN index not used when ts_query language is fetched from a column
Viewed 0 times
postgresqlfromcolumnusedfetchedginlanguagewhenindexts_query
Problem
I've got a table that stores some multilingual content:
To make sure I always search in the correct language I use these queries:
Because hardcoding the language doesn't give the correct results:
The problem however is that PostgreSQL doesn't use the GIN index when using the first set of queries and instead does a sequential scan:
(Note: I've disabled scanning using
While it does when hardcoding a language:
```
EXPLAIN ANALYZE SELECT FROM search WHERE fulltext @@ to_tsquery('
CREATE TABLE search (
content text NOT NULL,
language regconfig NOT NULL,
fulltext tsvector
);
CREATE INDEX search_fulltext ON search USING GIN(fulltext);
INSERT INTO search (language, content) VALUES
('dutch', 'Als achter vliegen vliegen vliegen vliegen vliegen vliegen achterna'),
('dutch', 'Langs de koele kali liep een kale koeli met een kilo kali op zijn kale koeli-kop.'),
('dutch', 'Moeder sneed zeven scheve sneden brood'),
('english', 'I saw Susie sitting in a shoe shine shop. Where she sits she shines, and where she shines she sits.'),
('english', 'How can a clam cram in a clean cream can?'),
('english', 'Can you can a can as a canner can can a can?');
UPDATE search SET fulltext = to_tsvector(language, content);To make sure I always search in the correct language I use these queries:
SELECT FROM search WHERE fulltext @@ to_tsquery(language, 'shine');
(1 row)
SELECT FROM search WHERE fulltext @@ to_tsquery(language, 'vlieg');
(1 row)Because hardcoding the language doesn't give the correct results:
SELECT FROM search WHERE fulltext @@ to_tsquery('dutch', 'shine');
(0 rows)
SELECT FROM search WHERE fulltext @@ to_tsquery('english', 'vlieg');
(0 rows)The problem however is that PostgreSQL doesn't use the GIN index when using the first set of queries and instead does a sequential scan:
(Note: I've disabled scanning using
SET enable_seqscan = OFF; for these examples because of the low amount of rows)EXPLAIN ANALYZE SELECT * FROM search WHERE fulltext @@ to_tsquery(language, 'shine');
---
Seq Scan on search (cost=0.00..17.35 rows=2 width=136) (actual time=0.040..0.044 rows=1 loops=1)
Filter: (fulltext @@ to_tsquery(language, 'shine'::text))
Rows Removed by Filter: 5
Planning time: 0.039 ms
Execution time: 0.064 ms
(5 rows)While it does when hardcoding a language:
```
EXPLAIN ANALYZE SELECT FROM search WHERE fulltext @@ to_tsquery('
Solution
I suggest a solution with partial expression indexes:
No redundant
Create one partial expression index for every relevant language:
All of the partial indexes together are only about as big as your total index.
Then match index conditions in the query:
You get bitmap index or index scans this way.
Another index on
CREATE TABLE search (
search_id serial PRIMARY KEY
, language regconfig NOT NULL -- order of columns matters a bit
, content text NOT NULL
-- *no* redundant fulltext tsvector
);No redundant
fulltext column - makes the table smaller, which helps overall performance.Create one partial expression index for every relevant language:
CREATE INDEX search_fulltext_dutch ON search USING GIN(to_tsvector('dutch', content))
WHERE language = 'dutch'::regconfig;
CREATE INDEX search_fulltext_english ON search USING GIN(to_tsvector('english', content))
WHERE language = 'english'::regconfig;
-- more?All of the partial indexes together are only about as big as your total index.
Then match index conditions in the query:
SELECT * FROM search -- does not return useless column fulltext now
WHERE language = 'dutch'::regconfig -- match partial index condition
AND to_tsvector('dutch', content) @@ to_tsquery('dutch', 'vliegen')
UNION ALL
SELECT * FROM search
WHERE language = 'english'::regconfig
AND to_tsvector('english', content) @@ to_tsquery('english', 'vliegen');
-- more?You get bitmap index or index scans this way.
Another index on
language may be useful for other purposes, this query does not need it.Code Snippets
CREATE TABLE search (
search_id serial PRIMARY KEY
, language regconfig NOT NULL -- order of columns matters a bit
, content text NOT NULL
-- *no* redundant fulltext tsvector
);CREATE INDEX search_fulltext_dutch ON search USING GIN(to_tsvector('dutch', content))
WHERE language = 'dutch'::regconfig;
CREATE INDEX search_fulltext_english ON search USING GIN(to_tsvector('english', content))
WHERE language = 'english'::regconfig;
-- more?SELECT * FROM search -- does not return useless column fulltext now
WHERE language = 'dutch'::regconfig -- match partial index condition
AND to_tsvector('dutch', content) @@ to_tsquery('dutch', 'vliegen')
UNION ALL
SELECT * FROM search
WHERE language = 'english'::regconfig
AND to_tsvector('english', content) @@ to_tsquery('english', 'vliegen');
-- more?Context
StackExchange Database Administrators Q#149765, answer score: 3
Revisions (0)
No revisions yet.