patternsqlMinor
Why are partial PostgreSQL HASH indices not smaller than full indices?
Viewed 0 times
postgresqlindiceswhyfullarethanhashsmallerpartialnot
Problem
I want to create the most efficient index for a sparsely populated column. I only need equality operations, so a HASH index should be beneficial.
Now I'm wondering why a partial HASH index isn't smaller than a full hash index:
Both hash indices take exactly the same amount of space (as shown in pgHero). However, when using standard BTREE indices, the partial index takes only 5% of the space of the full index.
Are partial HASH indices not supported in PostgreSQL 10?
Now I'm wondering why a partial HASH index isn't smaller than a full hash index:
CREATE INDEX full_hash ON mytable USING HASH(my_id); # 256 MB
CREATE INDEX partial_hash ON mytable USING HASH(my_id) WHERE my_ID IS NOT NULL; # 256 MB
CREATE INDEX full_btree ON mytable (my_id); # 537 MB
CREATE INDEX partial_btree ON mytable (my_id) WHERE my_ID IS NOT NULL; # 32 MBBoth hash indices take exactly the same amount of space (as shown in pgHero). However, when using standard BTREE indices, the partial index takes only 5% of the space of the full index.
Are partial HASH indices not supported in PostgreSQL 10?
Solution
I would argue that this is a bug in the hash index code. When you create an index on an already-populated table, it tries to pre-size the index to hold all the data so that it doesn't have to keep splitting buckets as the index is created. But the code for doing this does not take the NULL fraction of the column nor (apparently) the selectivity of the partial index clause into account, so it arrives at a too-large number for the pre-sizing.
If you were to create the index first, and then populated the table, you will find that the hash index is small, whether you made it partial or not. If the table is going to grow substantially after the index is created, the extra space consumed by the index upon original creation will be put to good use.
If you were to create the index first, and then populated the table, you will find that the hash index is small, whether you made it partial or not. If the table is going to grow substantially after the index is created, the extra space consumed by the index upon original creation will be put to good use.
Context
StackExchange Database Administrators Q#231647, answer score: 6
Revisions (0)
No revisions yet.