principlesqlModerate
Disk usage of integer column vs boolean column in Postgres
Viewed 0 times
booleandiskcolumnpostgresusageinteger
Problem
I saw this question Bit vs. Boolean columns.
I'm asking myself the same for Postgres: does a single digit integer column occupy the same disk space of a boolean one? In big tables (~50 columns x ~50 million rows) which one perform best?
How can I find this out?
I'm asking myself the same for Postgres: does a single digit integer column occupy the same disk space of a boolean one? In big tables (~50 columns x ~50 million rows) which one perform best?
How can I find this out?
Solution
You can find out the storage size with
However you need to take alignment into account:
If you define a table like this:
you will get three unused “padding” bytes between the columns, so that the integer can be stored at an address that is divisible by 4.
So the
If you specify the columns the other way around, the space taken up by the data in each row would only be 5 bytes.
The table row itself (the “tuple”) does not only consist on the raw data, but there is a 23-byte “tuple header” for each row (see the documentation). There may be padding after the header so that the actual tuple data are aligned at a multiple of
So if you want to optimize your table to use as little storage as possible, you need to take the order of the columns in the table into account.
SELECT typname, typlen FROM pg_type WHERE typname IN ('bool', 'int4');
typname | typlen
---------+--------
bool | 1
int4 | 4
(2 rows)However you need to take alignment into account:
SELECT typname, typlen, typalign FROM pg_type WHERE typname IN ('bool', 'int4');
typname | typlen | typalign
---------+--------+----------
bool | 1 | c
int4 | 4 | i
(2 rows)c is “character” (1 byte), while i is “integer”.If you define a table like this:
CREATE TABLE a (
b boolean,
i integer
);you will get three unused “padding” bytes between the columns, so that the integer can be stored at an address that is divisible by 4.
So the
boolean would take up 4 rather than 1 bytes of storage.If you specify the columns the other way around, the space taken up by the data in each row would only be 5 bytes.
The table row itself (the “tuple”) does not only consist on the raw data, but there is a 23-byte “tuple header” for each row (see the documentation). There may be padding after the header so that the actual tuple data are aligned at a multiple of
MAXALIGN (typically 8).So if you want to optimize your table to use as little storage as possible, you need to take the order of the columns in the table into account.
Code Snippets
SELECT typname, typlen FROM pg_type WHERE typname IN ('bool', 'int4');
typname | typlen
---------+--------
bool | 1
int4 | 4
(2 rows)SELECT typname, typlen, typalign FROM pg_type WHERE typname IN ('bool', 'int4');
typname | typlen | typalign
---------+--------+----------
bool | 1 | c
int4 | 4 | i
(2 rows)CREATE TABLE a (
b boolean,
i integer
);Context
StackExchange Database Administrators Q#246243, answer score: 14
Revisions (0)
No revisions yet.