HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlpostgresqlModerate

Array column overlap and containment queries with GIN indexes

Submitted by: @seed··
0
Viewed 0 times
array gin indexarray overlaparray containmentANY operatorarray query performance

Problem

Querying array columns with ANY() or = ANY() skips available GIN indexes and falls back to sequential scans on large tables, causing severe performance degradation.

Solution

Use GIN-indexed array operators && (overlap), @> (contains), <@ (contained by) instead of ANY():

CREATE INDEX idx_tags_gin ON posts USING gin(tags);

-- SLOW (seq scan even with GIN):
SELECT * FROM posts WHERE 'postgresql' = ANY(tags);

-- FAST (uses GIN index):
SELECT * FROM posts WHERE tags @> ARRAY['postgresql'];
SELECT * FROM posts WHERE tags && ARRAY['postgresql', 'sql'];

Why

The GIN index for arrays is built for the array operators defined in the intarray or array_ops opclass. The ANY() construct is a scalar comparison that the planner cannot rewrite to use the GIN index path.

Gotchas

  • = ANY() with a literal array constant can sometimes use an index scan on the array element, not a GIN index
  • intarray extension provides a faster GIN opclass for integer arrays
  • Array GIN indexes grow large quickly; monitor with pg_relation_size()

Code Snippets

Array GIN index usage: correct vs incorrect operator

CREATE TABLE posts (id bigserial PRIMARY KEY, tags text[]);
CREATE INDEX idx_posts_tags ON posts USING gin(tags);

-- Uses index:
SELECT id FROM posts WHERE tags @> '{postgresql,sql}'::text[];

-- Skips index:
SELECT id FROM posts WHERE 'postgresql' = ANY(tags);

Context

Querying tables that store tags, permissions, or categories as PostgreSQL arrays

Revisions (0)

No revisions yet.