patternsqlModerate
Postgres full text search on words, not lexemes
Viewed 0 times
fullsearchpostgrestextwordslexemesnot
Problem
I've got a table with columns of text that I'd like to search by word, not lexeme. More to the point, that I'd like to have indexed by word rather than lexeme. We've got error dumps with a lot of code references, which don't work well with any natural language dictionary.
Is there a way in Postgres to have FTS parse by word boundaries without resolving words to lexemes? If I have to define a list of boundary characters and a catalog of skip words, that could be fine. Does this require crafting some kind of custom dictionary, or is there something already like this available?
I keep thinking I'm missing something obvious, and then not finding it.
For the minute, trigram indexes are okay, but I would really prefer what amounts to a unique keyword parser for the text.
Postgres 11.4 on RDS.
Is there a way in Postgres to have FTS parse by word boundaries without resolving words to lexemes? If I have to define a list of boundary characters and a catalog of skip words, that could be fine. Does this require crafting some kind of custom dictionary, or is there something already like this available?
I keep thinking I'm missing something obvious, and then not finding it.
For the minute, trigram indexes are okay, but I would really prefer what amounts to a unique keyword parser for the text.
Postgres 11.4 on RDS.
Solution
Just use the
And query accordingly:
Related:
simple text search configuration, which works with the simple dictionary and does not do any stemming:CREATE INDEX ON tbl USING GIN (to_tsvector('simple', data));And query accordingly:
SELECT *
FROM tbl
WHERE to_tsvector('simple', data) @@ to_tsquery('simple', 'foo');Related:
- Get partial match from GIN indexed TSVECTOR column
- Finding the most commonly used non-stop words in a column
- PostgreSQL full text search on many columns
Code Snippets
CREATE INDEX ON tbl USING GIN (to_tsvector('simple', data));SELECT *
FROM tbl
WHERE to_tsvector('simple', data) @@ to_tsquery('simple', 'foo');Context
StackExchange Database Administrators Q#251177, answer score: 11
Revisions (0)
No revisions yet.