snippetsqlMinor
How do I use an Ispell dictionary with Postgres text search?
Viewed 0 times
postgressearchwithispelltexthowdictionaryuse
Problem
Postgres can use Ispell-compatible dictionaries in text search, but does not provide the required files.
Solution
This example uses the Canadian English dictionary, but you can try it with others as well.
These are the steps needed for Windows:
In PgAdmin, run the following SQL:
You will need to create a new text search configuration to use the dictionary.
These are the steps needed for Windows:
- Open http://src.chromium.org/svn/trunk/deps/third_party/hunspell_dictionaries/en_CA.dic
- Select all of the text, copy it, and paste it to Text Mechanic: http://textmechanic.co/Sort-Text-Lines.html. Add a line break at the end.
- Open http://src.chromium.org/svn/trunk/deps/third_party/hunspell_dictionaries/en_CA.dic_delta
- Select all of the text, copy it, and paste it below the previously pasted text in Text Mechanic.
- Scroll to the top, and select and cut the first line (should be a five digit number), and get rid of the line break.
- Click the Alphabetical button, and wait for the text to sort.
- Select all of the text and copy it to the clipboard
- Open Windows Notepad as an administrator
- Paste the text from Step 7 into Notepad
- Save the file as en_ca.dict (with UTF-8 encoding) to your Postgres text search folder. Mine is C:\Program Files\PostgreSQL\9.3\share\tsearch_data .
- Open http://src.chromium.org/svn/trunk/deps/third_party/hunspell_dictionaries/en_CA.aff , select all, copy, and paste to Notepad. Save the file as en_ca.affix to your Postgres text search folder.
In PgAdmin, run the following SQL:
create text search dictionary ispell_en_ca (
template = ispell,
dictfile = en_ca,
afffile = en_ca,
stopwords = english
);
--make sure it works:
select * from ts_lexize('ispell_en_ca', 'colours');
/*
result:
ts_lexize
text[]
{coloured,colour}
*/You will need to create a new text search configuration to use the dictionary.
Code Snippets
create text search dictionary ispell_en_ca (
template = ispell,
dictfile = en_ca,
afffile = en_ca,
stopwords = english
);
--make sure it works:
select * from ts_lexize('ispell_en_ca', 'colours');
/*
result:
ts_lexize
text[]
{coloured,colour}
*/Context
StackExchange Database Administrators Q#57058, answer score: 7
Revisions (0)
No revisions yet.