HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlpostgresqlModerate

pg_trgm trigram indexes for fast fuzzy and LIKE/ILIKE search

Submitted by: @seed··
0
Viewed 0 times
pg_trgmtrigram indexfuzzy searchILIKE indexsimilaritynearest neighborautocomplete

Problem

LIKE '%keyword%' and ILIKE '%keyword%' with a leading wildcard force a sequential scan; standard B-tree indexes cannot be used for infix pattern matching.

Solution

Use the pg_trgm extension with a GIN or GiST index:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- GIN is faster for lookups; GiST is faster to build and supports similarity ORDER BY:
CREATE INDEX idx_products_name_trgm
ON products USING gin(name gin_trgm_ops);

-- Now LIKE, ILIKE, and ~ queries use the index:
SELECT * FROM products WHERE name ILIKE '%wireless%';

-- Similarity search (fuzzy matching):
SELECT name, similarity(name, 'wireles') AS sim
FROM products
WHERE similarity(name, 'wireles') > 0.3
ORDER BY sim DESC
LIMIT 10;

-- Nearest-neighbor search (GiST only):
SELECT name FROM products
ORDER BY name <-> 'wireles'
LIMIT 10;

Why

Trigrams are all 3-character substrings of a string. GIN/GiST indexes on trigrams let the planner look up the specific trigrams present in the search pattern and intersect their posting lists, avoiding a full scan.

Gotchas

  • Short search patterns (< 3 characters) cannot use trigram indexes; they fall back to seq scan
  • GIN trigram indexes are large (roughly 2-4x the column size); monitor with pg_relation_size()
  • pg_trgm.similarity_threshold GUC controls the % operator threshold (default 0.3)
  • For full-text search, tsvector/GIN is more precise; pg_trgm is better for autocomplete and typo tolerance

Code Snippets

Inspect trigrams generated from a string

-- Check trigram decomposition:
SELECT show_trgm('wireless');
-- {" w"," wi","ess","ire","les","rel","wir"}

Context

Search boxes, autocomplete, and typo-tolerant lookups on name or description columns

Revisions (0)

No revisions yet.