patternsqlMinor
Index method for very few updates and many inserts
Viewed 0 times
methodinsertsfewformanyveryandindexupdates
Problem
I am using Postgresql 9.1 with pg_trgm extension. I need to create an index on a text-based field. I do not need full-text searches, I use
I will use
What will be the index update overhead for a
ILIKE queries to make my searches.I will use
pg_trgm but do not have much experience with gin and gist indexes. I will have many INSERT statements (~15000 daily) and very few UPDATE statements (may be 1 or 2 in a week). What will be the index update overhead for a
gin index on a such table? Or is gist more suitable?Solution
The manual for the pg_trgm module has some advice for your question here:
As a rule of thumb, a GIN index is faster to search than a GiST index,
but slower to build or update; so GIN is better suited for static data
and GiST for often-updated data.
The
More basic advice on GIN vs GiST in the manual here.
In particular, if your rows are big and you insert in bulks, you may want to raise the
For very big loads (large parts of the table), it might pay to drop the index and recreate it afterwards.
Although, 15k rows spread out over a day, shouldn't be a problem at all.
Finally, there is a whole chapter on performance of GIN and GiST in the Full Text Search section. Applies generally. Read this.
The gist of it (no pun intended): GIN is considerably bigger and more expensive to maintain, but faster for most lookups (special exceptions apply for pg_trgm). Since the addition of
As a rule of thumb, a GIN index is faster to search than a GiST index,
but slower to build or update; so GIN is better suited for static data
and GiST for often-updated data.
The
FASTUPDATE feature of GIN indexes (introduced in Postgres 8.4, ON by default) should be interesting for you. INSERTs (and UPDATEs) are stored in a pending list and integrated into the index in later bulk operations by VACUUM. Makes those operations a lot faster, since direct GIN maintenance tends to be expensive, especially for columns with many indexable elements, i.e. many words in your case.More basic advice on GIN vs GiST in the manual here.
In particular, if your rows are big and you insert in bulks, you may want to raise the
work_mem setting.For very big loads (large parts of the table), it might pay to drop the index and recreate it afterwards.
Although, 15k rows spread out over a day, shouldn't be a problem at all.
Finally, there is a whole chapter on performance of GIN and GiST in the Full Text Search section. Applies generally. Read this.
The gist of it (no pun intended): GIN is considerably bigger and more expensive to maintain, but faster for most lookups (special exceptions apply for pg_trgm). Since the addition of
FASTUPDATE writes are only moderately more expensive.Context
StackExchange Database Administrators Q#46685, answer score: 6
Revisions (0)
No revisions yet.