HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

postgres bloom index

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
indexbloompostgres

Problem

i read the doc about postgres bloom, but can't reproduce the same results, please help me to understand what i missed.
my server is:

SHOW server_version;
        server_version         
-------------------------------
 10.6 (Debian 10.6-1.pgdg90+1)

dev=# show random_page_cost;
 random_page_cost 
------------------
 4


first create the table with the same command as in the docs:

dev=# CREATE TABLE tbloom AS                                               
   SELECT
     (random() * 1000000)::int as i1,
     (random() * 1000000)::int as i2,
     (random() * 1000000)::int as i3,
     (random() * 1000000)::int as i4,
     (random() * 1000000)::int as i5,
     (random() * 1000000)::int as i6
   FROM
  generate_series(1,10000000);


next i create the btree index

dev=# CREATE index btreeidx ON tbloom (i1, i2, i3, i4, i5, i6);
CREATE INDEX


and get next plan:

dev=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..127195.10 rows=1 width=24) (actual time=258.963..260.900 rows=0 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on tbloom  (cost=0.00..126195.00 rows=1 width=24) (actual time=255.446..255.446 rows=0 loops=3)
         Filter: ((i2 = 898732) AND (i5 = 123451))
         Rows Removed by Filter: 3333333
 Planning time: 0.412 ms
 Execution time: 260.939 ms


Execution time: 260.939 ms

and now drop the btree index and create bloom:

dev=# DROP INDEX btreeidx;
DROP INDEX
dev=# CREATE INDEX bloomidx ON tbloom USING bloom (i1, i2, i3, i4, i5, i6);
CREATE INDEX


get new plan:

```
dev=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;
QUERY

Solution

When bloom was introduced in 9.6, parallel query had just been introduced and is was turned off by default. Bloom appeared to be better than a non-parallel sequential scan in the example given. But when you can do a parallel seq scan, it appears better than using a bloom index does. It is not actually better as can be verified by turning off parallel query with set max_parallel_workers_per_gather TO 0 and looking at the actual executions speeds, but the planner thinks the parallel seq scan will be better. It looks like maybe the cost estimation part of bloom could use some work.

The example code was not updated for when parallel query was turned on by default, in v10, so it no longer works as advertised.

Note that your example never achieved any index usage at all, so you can't really draw any conclusions about which index is better for that scenario.

Context

StackExchange Database Administrators Q#229413, answer score: 4

Revisions (0)

No revisions yet.