debugsqlModerate
Why *not* ERROR: index row size xxxx exceeds maximum 2712 for index "foo"?
Viewed 0 times
whyerrormaximumfoosize2712exceedsforxxxxindex
Problem
We have repeatedly seen failing attempts to index columns with values exceeding a maximum size. Postgres 10 has this kind of error message for it:
Examples:
Etc.
Now, a_horse_with_no_name demonstrated a case with much larger
No error, and the column value indeed tested with a length of 10000 characters.
Have there been recent changes or how is this possible?
ERROR: index row size xxxx exceeds maximum 2712 for index "foo_idx"
HINT: Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.Examples:
- Character varying index overhead & length limit
- Index max row size error
Etc.
Now, a_horse_with_no_name demonstrated a case with much larger
text values (10000 characters) that still seems to work with a UNIQUE index in Postgres 9.6. Quoting his test case:create table tbl (col text);
create unique index on tbl (col);
insert into tbl
values (rpad(md5(random()::text), 10000, md5(random()::text)));
select length(val) from x; -- 10000No error, and the column value indeed tested with a length of 10000 characters.
Have there been recent changes or how is this possible?
Solution
Short answer: compression.
The data type
The manual about
also be moved out to secondary storage, but only as a last resort (
and
Test with
After compression, add 8 bytes for the tuple header, round up to the next multiple of
fiddle - Postgres 15
fiddle - Postgres 11
The maximum index tuple size was reduced to 2704 bytes in Postgres 15. Also, the new compression algorithm LZ4 was added, working remarkably well for highly repetitive patterns. See demo in the fiddle.
Note how the value is forced to be unpacked from its storage format with the no-op expression:
The 5th row would be too big to fit the index tuple (even with compression) and trigger the error message in the title. Or this one for Postgres 15:
ERROR: index row size 5784 exceeds btree version 4 maximum 2704 for index "tbl_col_idx"
The 6th row would be to big to fit even the index page and trigger the related error message:
ERROR: index row requires 11504 bytes, maximum size is 8191
The test values generated with
Related:
Long answer
I ran more extensive tests, tampering with storage internals to verify my understanding. Only for testing purposes!
dbfiddle does not allow write access to system catalogs. But the queries are there to try "at home".
The data type
text allows (lossless!) compression and storage out of line by default:SELECT typstorage FROM pg_type WHERE typname = 'text'; -- 'x'The manual about
pg_type.typstorage:p: Value must always be stored plain.
e: Value can be stored in a “secondary” relation (if relation has one, see pg_class.reltoastrelid).
m: Value can be stored compressed inline.
x: Value can be stored compressed inline or stored in “secondary” storage.x is the usual choice for toast-able types. Note that m values canalso be moved out to secondary storage, but only as a last resort (
eand
x values are moved first).Test with
pg_column_size() instead of length(). Be sure to test actual table columns (with compression applied) not just input values. See:CREATE TABLE tbl (
id int
, col text -- with default pglz compression
);
INSERT INTO tbl(id, col) VALUES
(1, rpad(md5('non_random'::text), 100, md5('non_random'::text)))
, (2, rpad(md5('non_random'::text), 1000, md5('non_random'::text)))
, (3, rpad(md5('non_random'::text), 10000, md5('non_random'::text)))
, (4, rpad(md5('non_random'::text), 100000, md5('non_random'::text)))
, (5, rpad(md5('non_random'::text), 500000, md5('non_random'::text)))
, (6, rpad(md5('non_random'::text), 1000000, md5('non_random'::text)));
SELECT id, left(col, 10) || ' ...' AS col
, length(col) AS char_length
, pg_column_size(col) AS compressed
, pg_column_size(col || '') AS uncompressed
FROM tbl ORDER BY id;id | col | char_length | compressed | uncompressed
---+----------------+-------------+------------+-------------
1 | 67ad0f29fa ... | 100 | 101 | 104
2 | 67ad0f29fa ... | 1000 | 1004 | 1004
3 | 67ad0f29fa ... | 10000 | 160 | 10004
4 | 67ad0f29fa ... | 100000 | 1191 | 100004
5 | 67ad0f29fa ... | 500000 | 5765 | 500004
6 | 67ad0f29fa ... | 1000000 | 11487 | 1000004
SELECT pg_column_size(rpad(md5('non_random'::text), 1000000, md5('non_random'::text)));pg_column_size
--------------
1000004
After compression, add 8 bytes for the tuple header, round up to the next multiple of
MAXALIGN (typically 8 bytes), and add 4 bytes for the item identifier to arrive at the total storage for the index tuple.fiddle - Postgres 15
fiddle - Postgres 11
The maximum index tuple size was reduced to 2704 bytes in Postgres 15. Also, the new compression algorithm LZ4 was added, working remarkably well for highly repetitive patterns. See demo in the fiddle.
Note how the value is forced to be unpacked from its storage format with the no-op expression:
pg_column_size(col || '').The 5th row would be too big to fit the index tuple (even with compression) and trigger the error message in the title. Or this one for Postgres 15:
ERROR: index row size 5784 exceeds btree version 4 maximum 2704 for index "tbl_col_idx"
The 6th row would be to big to fit even the index page and trigger the related error message:
ERROR: index row requires 11504 bytes, maximum size is 8191
The test values generated with
rpad() have repeating patterns, which allow for massive compression. Even very long strings still easily fit the max. size after compression this way.Related:
- What is the overhead for varchar(n)?
Long answer
I ran more extensive tests, tampering with storage internals to verify my understanding. Only for testing purposes!
- db<>fiddle here
dbfiddle does not allow write access to system catalogs. But the queries are there to try "at home".
Code Snippets
SELECT typstorage FROM pg_type WHERE typname = 'text'; -- 'x'p: Value must always be stored plain.
e: Value can be stored in a “secondary” relation (if relation has one, see pg_class.reltoastrelid).
m: Value can be stored compressed inline.
x: Value can be stored compressed inline or stored in “secondary” storage.CREATE TABLE tbl (
id int
, col text -- with default pglz compression
);
INSERT INTO tbl(id, col) VALUES
(1, rpad(md5('non_random'::text), 100, md5('non_random'::text)))
, (2, rpad(md5('non_random'::text), 1000, md5('non_random'::text)))
, (3, rpad(md5('non_random'::text), 10000, md5('non_random'::text)))
, (4, rpad(md5('non_random'::text), 100000, md5('non_random'::text)))
, (5, rpad(md5('non_random'::text), 500000, md5('non_random'::text)))
, (6, rpad(md5('non_random'::text), 1000000, md5('non_random'::text)));
SELECT id, left(col, 10) || ' ...' AS col
, length(col) AS char_length
, pg_column_size(col) AS compressed
, pg_column_size(col || '') AS uncompressed
FROM tbl ORDER BY id;SELECT pg_column_size(rpad(md5('non_random'::text), 1000000, md5('non_random'::text)));Context
StackExchange Database Administrators Q#217087, answer score: 17
Revisions (0)
No revisions yet.