patternsqlModerate
Does it make sense to store a couple of Boolean values as array?
Viewed 0 times
booleanarraymakestoresensecoupledoesvalues
Problem
I have a table with five Boolean columns. In 90% plus of the rows, all the columns are null. (
Instead of having Boolean columns, I could have a single array column which contains an array of an enumerated custom datatype, and thereby store only the columns which are non-null.
I feel weird using arrays, but my coworker has pointed out to me that there isn't really a strong reason against using them, and we might actually see savings from using them since we aren't storing a bunch of empty columns.
Are there any downsides to using an array? Specifically: would they take up more space, take up more time to query, or prevent Postgres features (e.g. gin indexes) from being used?
False is equivalent to null for me.)Instead of having Boolean columns, I could have a single array column which contains an array of an enumerated custom datatype, and thereby store only the columns which are non-null.
I feel weird using arrays, but my coworker has pointed out to me that there isn't really a strong reason against using them, and we might actually see savings from using them since we aren't storing a bunch of empty columns.
Are there any downsides to using an array? Specifically: would they take up more space, take up more time to query, or prevent Postgres features (e.g. gin indexes) from being used?
Solution
TL;DR: Do not use an array. Use individual
Your coworker may not be aware of actual storage requirements. Null storage is very cheap and efficient in Postgres. See:
A couple of
Arrays have an overhead of 24 bytes. Plus actual data. And handling is much less convenient and much more expensive:
This related answer has a detailed assessment for three relevant options:
boolean columns.Your coworker may not be aware of actual storage requirements. Null storage is very cheap and efficient in Postgres. See:
- Do nullable columns occupy additional space in PostgreSQL?
A couple of
boolean columns are also very cheap and efficient - nullable or not. Actually, just 5 boolean columns hardly matter for the row size at all - even if defined NOT NULL. 1 byte each, 5 bytes and no alignment restrictions. Hardly relevant. There are typically much more substantial things you can do to your row. (You should have provided your actual table definition.)- Configuring PostgreSQL for read performance
Arrays have an overhead of 24 bytes. Plus actual data. And handling is much less convenient and much more expensive:
- Calculating and saving space in PostgreSQL
This related answer has a detailed assessment for three relevant options:
- separate
booleancolumns
- an
integerto encode up to 32 boolean values
- a bitstring (
bit(n)orbit varying(n))
- Is there any difference between integer and bit(n) data types for a bitmask?
Context
StackExchange Database Administrators Q#143766, answer score: 11
Revisions (0)
No revisions yet.