patternsqlMajor
"Recheck Cond:" line in query plans with a bitmap index scan
Viewed 0 times
condscanlinewithqueryrecheckbitmapindexplans
Problem
This is a spin-off from comments to the previous question:
Using PostgreSQL 9.4, there always seems to be a
Like in the
Or in the output of
Does that mean index conditions have to be checked a second time after a bitmap index scan?
What else can we learn from the
- Postgres 9.4.4 query takes forever
Using PostgreSQL 9.4, there always seems to be a
Recheck Cond: line after bitmap index scans in query plans output by EXPLAIN.Like in the
EXPLAIN output of the referenced question:-> Bitmap Heap Scan on table_three (cost=2446.92..19686.74 rows=8159 width=7)
Recheck Cond: (("timestamp" > (now() - '30 days'::interval)) AND (client_id > 0))
-> BitmapAnd (cost=2446.92..2446.92 rows=8159 width=0)
-> Bitmap Index Scan on table_one_timestamp_idx (cost=0.00..1040.00 rows=79941 width=0)
Index Cond: ("timestamp" > (now() - '30 days'::interval))
-> Bitmap Index Scan on fki_table_three_client_id (cost=0.00..1406.05 rows=107978 width=0)
Index Cond: (client_id > 0)Or in the output of
EXPLAIN ANALYZE for a simple, huge table (with very little work_mem):EXPLAIN ANALYZE SELECT * FROM aa WHERE a BETWEEN 100000 AND 200000;Bitmap Heap Scan on aa (cost=107.68..4818.05 rows=5000 width=4) (actual time=27.629..213.606 rows=100001 loops=1)
Recheck Cond: ((a >= 100000) AND (a Bitmap Index Scan on aai (cost=0.00..106.43 rows=5000 width=0) (actual time=27.265..27.265 rows=100001 loops=1)
Index Cond: ((a >= 100000) AND (a <= 200000))Does that mean index conditions have to be checked a second time after a bitmap index scan?
What else can we learn from the
EXPLAIN output?Solution
As @Chris commented correctly on the referenced question:
a little investigation seems to indicate that the recheck condition is
always printed in the
http://www.postgresql.org/message-id/464F3C5D.2000700@enterprisedb.com
While this is all true and the core developer Heikki Linnakangas is a first class source, the post dates back to 2007 (Postgres 8.2). Here is a blog post by Michael Paquier with detailed explanation for Postgres 9.4, where the output of
The
From a total of 4425 data pages (blocks), 693 stored tuples exactly (including tuple pointers), while the other 3732 pages were lossy (just the data page) in the bitmap. That happens when
The index condition has to be rechecked for pages from the lossy share, since the bitmap only remembers which pages to fetch and not the exact tuples on the page. Not all tuples on the page will necessarily pass the index conditions, it's necessary to actually recheck the condition.
This is the thread on pgsql hackers where the new addition was discussed. The author Etsuro Fujita provides a formula for how to calculate the minimum
Additional line
In addition, when running with
This indicates how much of the underlying table (and index) was read from the cache (
There are more options. The manual about the
Specifically, include the number of shared blocks hit, read,
dirtied, and written, the number of local blocks hit, read, dirtied,
and written, and the number of temp blocks read and written.
Read on, there's more.
Here is the list of output options in the source code.
a little investigation seems to indicate that the recheck condition is
always printed in the
EXPLAIN, but is actually only performed whenwork_mem is small enough that the bitmap becomes lossy. Thoughts?http://www.postgresql.org/message-id/464F3C5D.2000700@enterprisedb.com
While this is all true and the core developer Heikki Linnakangas is a first class source, the post dates back to 2007 (Postgres 8.2). Here is a blog post by Michael Paquier with detailed explanation for Postgres 9.4, where the output of
EXPLAIN ANALYZE has been improved with more information.The
Recheck Cond: line is always there for bitmap index scans. The output of basic EXPLAIN won't tell us more. We get additional information from EXPLAIN ANALYZE as can be seen in the second quote in the question:Heap Blocks: exact=693 lossy=3732From a total of 4425 data pages (blocks), 693 stored tuples exactly (including tuple pointers), while the other 3732 pages were lossy (just the data page) in the bitmap. That happens when
work_mem is not big enough to store the whole bitmap built from the index scan exactly (lossless).The index condition has to be rechecked for pages from the lossy share, since the bitmap only remembers which pages to fetch and not the exact tuples on the page. Not all tuples on the page will necessarily pass the index conditions, it's necessary to actually recheck the condition.
This is the thread on pgsql hackers where the new addition was discussed. The author Etsuro Fujita provides a formula for how to calculate the minimum
work_mem to avoid lossy bitmap entries and ensuing condition rechecks. The calculation is not reliable for complex cases with multiple bitmap scans, so it was not used to output actual numbers from EXPLAIN. It can still serve as an estimate for simple cases.Additional line
BUFFERS:In addition, when running with
BUFFERS option: EXPLAIN (ANALYZE, BUFFERS) ... another line is added like:Buffers: shared hit=279 read=79This indicates how much of the underlying table (and index) was read from the cache (
shared hit=279) and how much had to be fetched from disk (read=79). If you repeat the query, the "read" part typically disappears for not-too-huge queries, because everything is cached now after the first call. The first call tells you how much was cached already. Subsequent calls show how much your cache can handle (currently).There are more options. The manual about the
BUFFERS option:Specifically, include the number of shared blocks hit, read,
dirtied, and written, the number of local blocks hit, read, dirtied,
and written, and the number of temp blocks read and written.
Read on, there's more.
Here is the list of output options in the source code.
Code Snippets
Heap Blocks: exact=693 lossy=3732Buffers: shared hit=279 read=79Context
StackExchange Database Administrators Q#106264, answer score: 25
Revisions (0)
No revisions yet.