patternsqlMinor
postgres bloom index
Viewed 0 times
indexbloompostgres
Problem
i read the doc about postgres bloom, but can't reproduce the same results, please help me to understand what i missed.
my server is:
first create the table with the same command as in the docs:
next i create the btree index
and get next plan:
Execution time: 260.939 ms
and now drop the btree index and create bloom:
get new plan:
```
dev=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;
QUERY
my server is:
SHOW server_version;
server_version
-------------------------------
10.6 (Debian 10.6-1.pgdg90+1)
dev=# show random_page_cost;
random_page_cost
------------------
4first create the table with the same command as in the docs:
dev=# CREATE TABLE tbloom AS
SELECT
(random() * 1000000)::int as i1,
(random() * 1000000)::int as i2,
(random() * 1000000)::int as i3,
(random() * 1000000)::int as i4,
(random() * 1000000)::int as i5,
(random() * 1000000)::int as i6
FROM
generate_series(1,10000000);next i create the btree index
dev=# CREATE index btreeidx ON tbloom (i1, i2, i3, i4, i5, i6);
CREATE INDEXand get next plan:
dev=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..127195.10 rows=1 width=24) (actual time=258.963..260.900 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on tbloom (cost=0.00..126195.00 rows=1 width=24) (actual time=255.446..255.446 rows=0 loops=3)
Filter: ((i2 = 898732) AND (i5 = 123451))
Rows Removed by Filter: 3333333
Planning time: 0.412 ms
Execution time: 260.939 msExecution time: 260.939 ms
and now drop the btree index and create bloom:
dev=# DROP INDEX btreeidx;
DROP INDEX
dev=# CREATE INDEX bloomidx ON tbloom USING bloom (i1, i2, i3, i4, i5, i6);
CREATE INDEXget new plan:
```
dev=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;
QUERY
Solution
When bloom was introduced in 9.6, parallel query had just been introduced and is was turned off by default. Bloom appeared to be better than a non-parallel sequential scan in the example given. But when you can do a parallel seq scan, it appears better than using a bloom index does. It is not actually better as can be verified by turning off parallel query with
The example code was not updated for when parallel query was turned on by default, in v10, so it no longer works as advertised.
Note that your example never achieved any index usage at all, so you can't really draw any conclusions about which index is better for that scenario.
set max_parallel_workers_per_gather TO 0 and looking at the actual executions speeds, but the planner thinks the parallel seq scan will be better. It looks like maybe the cost estimation part of bloom could use some work.The example code was not updated for when parallel query was turned on by default, in v10, so it no longer works as advertised.
Note that your example never achieved any index usage at all, so you can't really draw any conclusions about which index is better for that scenario.
Context
StackExchange Database Administrators Q#229413, answer score: 4
Revisions (0)
No revisions yet.