HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Character varying index overhead & length limit

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
varyinglengthlimitcharacteroverheadindex

Problem

I have a unique constraint on a character varying column that will mostly have lengths of approximately 600 but could get to 10 million.

I have read about B-Trees, but I can't determine how much disk space the index will consume or if such large-sized values will prevent the index from functioning performantly.

What are the impacts upon disk space and time of a B-Tree index on character varying columns with such potentially large lengths?

Solution

In Postgres, a unique constraint is implemented with a unique B-tree index. The manual:

Adding a unique constraint will automatically create a unique B-tree
index on the column or group of columns used in the constraint.

Indexes use the same basic storing mechanisms as tables: arrays of data pages. Indexes have some minor additional features. The manual:

In a table, an item is a row; in an index, an item is an index entry.

Every table and index is stored as an array of pages of a fixed size
(usually 8 kB, although a different page size can be selected when
compiling the server).

The maximum size of an index entry is roughly a third of a data page. See:

  • Why not ERROR: index row size xxxx exceeds maximum 2712 for index "foo"?



The maximum size was reduced to 2704 bytes in Postgres 12. Still the same in Postgres 15. Meaning your UNIQUE constraint is not possible.

Consider a unique index on hash value, either after adding a redundant column with the hash, or just an index on the hash expression.
You could use the built-in function md5() or something more efficient for huge values. Related:

  • Why doesn't my UNIQUE constraint trigger?



  • Index max row size error

Context

StackExchange Database Administrators Q#69161, answer score: 11

Revisions (0)

No revisions yet.