HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Bad execution plan when having partial index and big In-clause in Postgres

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgreshavingbadplanbigpartialwhenandindexclause

Problem

Postgres seems to always use sequential scan where it could have used a partial index to get index scan only. It only happens when an in-clause exceeds more than 100 elements.

Given the following table:

create table foo(id bigint primary key, bar bigint); 

insert into foo (id, bar) 
select g.id, case when id % 1000 = 0 then id else null end
from generate_series(1, 10000000) AS g (id) ;

--Create partial index
create unique index ix_foo_bar on foo(bar) where bar is not null;

analyze foo;


And the given the following query with a large in-statement:

explain analyze select count(*) from foo where bar in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101);


The query plan shows a sequential scan. It is slow and and have a high cost:

QUERY PLAN                                                                             
------------------------------------
 Finalize Aggregate  (cost=612955.35..612955.36 rows=1 width=8) (actual time=254.605..254.605 rows=1 loops=1)
   ->  Gather  (cost=612955.13..612955.34 rows=2 width=8) (actual time=254.474..258.242 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=611955.13..611955.14 rows=1 width=8) (actual time=247.743..247.744 rows=1 loops=3)
               ->  Parallel Seq Scan on foo  (cost=0.00..611955.03 rows=42 width=0) (actual time=247.740..247.740 rows=0 loops=3)
                     Filter: (bar = ANY ('{1,2,3,4,5(...)
                     Rows Removed by Filter: 3333333
 Planning Time: 0.867 ms
 Execution Time: 258.323 ms


set enable_seqscan has no effect - it still does a sequential scan.

If I add a 'not null' to the query, it uses the index:

```
explain analyze select count(*) from foo wh

Solution

This will be fixed in version 12, which will be released soon.

I think the summary here is that we are only willing to do so much work to try to prove a partial index can be used, because all queries have to go through that work even if they don't end up using the partial index. In this change, they just found a more efficient way to do that work in this particular case, and so no longer impose the 100-element limit to it.

Context

StackExchange Database Administrators Q#246421, answer score: 3

Revisions (0)

No revisions yet.