How is LIKE implemented?
Problem
Can anyone explain how the LIKE operator is implemented in current database systems (e.g. MySQL or Postgres), or point me to some references that explain it?
The naive approach would be to inspect each record, executing a regular expression or partial string match on the field of interest, but I have a feeling (hope) that these systems do something smarter.
Solution
In addition to what Justin Cave wrote, since PostgreSQL 9.1 you can speed up any search with
LIKE (~~) or ILIKE (~~*), and basic regular expression matches, too (~). Use the operator classes provided by the module pg_trgm with a GIN or GiST index to speed up LIKE expressions that are not left-anchored. To install the extension, run once per database:
CREATE EXTENSION IF NOT EXISTS pg_trgm;
Create an index of the form:
CREATE INDEX tbl_col_gin_trgm_idx ON tbl USING gin (col gin_trgm_ops);
Or:
CREATE INDEX tbl_col_gist_trgm_idx ON tbl USING gist (col gist_trgm_ops);
Creating and maintaining a GIN or GiST index carries a cost, but if your table is not heavily written, this is a great feature for you.
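To check whether such an index is actually picked up, something along these lines may help. It is only a sketch: it assumes the table tbl with a text column col and one of the trigram indexes from above already exist, and the search term 'foo' is made up for illustration. Whether the planner chooses the index depends on table size and statistics, so the plan on a small test table may still show a sequential scan.

```sql
-- Show the trigrams pg_trgm extracts from a string (function provided by the extension).
SELECT show_trgm('foo');

-- A pattern that is not left-anchored: a plain B-tree index cannot serve it,
-- but a trigram GIN or GiST index on col can.
EXPLAIN
SELECT *
FROM   tbl
WHERE  col ILIKE '%foo%';
```

Keep in mind that a pattern with fewer than three consecutive characters yields no usable trigrams, so the index cannot narrow the search for it.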
Depesz has an excellent article in his blog about the feature.
GIN or GiST?
These two quotes from the manual should provide some guidance:
The choice between GiST and GIN indexing depends on the relative
performance characteristics of GiST and GIN, which are discussed
elsewhere.
(Like here.)
But for "nearest neighbour" type queries using the distance operator <->:
This can be implemented quite efficiently by GiST indexes, but not by
GIN indexes.
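As a rough illustration of such a nearest-neighbour query, assuming the same tbl and col as above, the GiST variant of the index, and a made-up search term 'word':

```sql
-- Order rows by trigram distance to the search term and keep the ten closest.
-- The <-> operator comes from pg_trgm; only a GiST trigram index can serve this ORDER BY.
SELECT col, col <-> 'word' AS dist
FROM   tbl
ORDER  BY dist
LIMIT  10;
```

The distance is one minus the trigram similarity, so smaller values mean closer matches.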
Code Snippets
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX tbl_col_gin_trgm_idx ON tbl USING gin (col gin_trgm_ops);
CREATE INDEX tbl_col_gist_trgm_idx ON tbl USING gist (col gist_trgm_ops);
Context
StackExchange Database Administrators Q#2195, answer score: 30