HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

PostgreSQL query very slow when subquery added

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqladdedqueryslowsubquerywhenvery

Problem

I have a relatively simple query on a table with 1.5M rows:
SELECT mtid FROM publication
WHERE mtid IN (9762715) OR last_modifier=21321
LIMIT 5000;


EXPLAIN ANALYZE output:

Limit (cost=8.84..12.86 rows=1 width=8) (actual time=0.985..0.986 rows=1 loops=1)
-> Bitmap Heap Scan on publication (cost=8.84..12.86 rows=1 width=8) (actual time=0.984..0.985 rows=1 loops=1)
Recheck Cond: ((mtid = 9762715) OR (last_modifier = 21321))
-> BitmapOr (cost=8.84..8.84 rows=1 width=0) (actual time=0.971..0.971 rows=0 loops=1)
-> Bitmap Index Scan on publication_pkey (cost=0.00..4.42 rows=1 width=0) (actual time=0.295..0.295 rows=1 loops=1)
Index Cond: (mtid = 9762715)
-> Bitmap Index Scan on publication_last_modifier_btree (cost=0.00..4.42 rows=1 width=0) (actual time=0.674..0.674 rows=0 loops=1)
Index Cond: (last_modifier = 21321)
Total runtime: 1.027 ms


So far so good, fast and uses the available indexes.

Now, if I modify a query just a bit, the result will be:
SELECT mtid FROM publication
WHERE mtid IN (SELECT 9762715) OR last_modifier=21321
LIMIT 5000;


The EXPLAIN ANALYZE output is:

Limit (cost=0.01..2347.74 rows=5000 width=8) (actual time=2735.891..2841.398 rows=1 loops=1)
-> Seq Scan on publication (cost=0.01..349652.84 rows=744661 width=8) (actual time=2735.888..2841.393 rows=1 loops=1)
Filter: ((hashed SubPlan 1) OR (last_modifier = 21321))
SubPlan 1
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
Total runtime: 2841.442 ms


Not so fast, and using seq scan...

Of course, the original query run by the application is a bit more complex, and even slower, and of course the hibernate-generated original is not (SELECT 9762715), but the slowness is there even for that (SELECT 9762715)! The query is generated by hibernate, so it is quite a challenge to change them, and some features are not availab

Solution

My colleague has found a way to change the query so that it needs a simple rewrite and does what it needs to do, i.e. doing the subselect in one step, and then doing the further operations on the result:

SELECT mtid FROM publication 
WHERE 
  mtid = ANY( (SELECT ARRAY(SELECT 9762715))::bigint[] )
  OR last_modifier=21321
LIMIT 5000;


The explain analyze now is:

Limit  (cost=92.58..9442.38 rows=2478 width=8) (actual time=0.071..0.074 rows=1 loops=1)
   InitPlan 2 (returns $1)
     ->  Result  (cost=0.01..0.02 rows=1 width=0) (actual time=0.010..0.011 rows=1 loops=1)
           InitPlan 1 (returns $0)
             ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
   ->  Bitmap Heap Scan on publication  (cost=92.56..9442.36 rows=2478 width=8) (actual time=0.069..0.070 rows=1 loops=1)
         Recheck Cond: ((mtid = ANY (($1)::bigint[])) OR (last_modifier = 21321))
         Heap Blocks: exact=1
         ->  BitmapOr  (cost=92.56..92.56 rows=2478 width=0) (actual time=0.060..0.060 rows=0 loops=1)
               ->  Bitmap Index Scan on publication_pkey  (cost=0.00..44.38 rows=10 width=0) (actual time=0.046..0.046 rows=1 loops=1)
                     Index Cond: (mtid = ANY (($1)::bigint[]))
               ->  Bitmap Index Scan on publication_last_modifier_btree  (cost=0.00..46.94 rows=2468 width=0) (actual time=0.011..0.011 rows=0 loops=1)
                     Index Cond: (last_modifier = 21321)
 Planning time: 0.704 ms
 Execution time: 0.153 ms


It seems we can create a simple parser that finds and rewrites all subselects this way, and add it to a hibernate hook to manipulate the native query.

Code Snippets

SELECT mtid FROM publication 
WHERE 
  mtid = ANY( (SELECT ARRAY(SELECT 9762715))::bigint[] )
  OR last_modifier=21321
LIMIT 5000;
Limit  (cost=92.58..9442.38 rows=2478 width=8) (actual time=0.071..0.074 rows=1 loops=1)
   InitPlan 2 (returns $1)
     ->  Result  (cost=0.01..0.02 rows=1 width=0) (actual time=0.010..0.011 rows=1 loops=1)
           InitPlan 1 (returns $0)
             ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
   ->  Bitmap Heap Scan on publication  (cost=92.56..9442.36 rows=2478 width=8) (actual time=0.069..0.070 rows=1 loops=1)
         Recheck Cond: ((mtid = ANY (($1)::bigint[])) OR (last_modifier = 21321))
         Heap Blocks: exact=1
         ->  BitmapOr  (cost=92.56..92.56 rows=2478 width=0) (actual time=0.060..0.060 rows=0 loops=1)
               ->  Bitmap Index Scan on publication_pkey  (cost=0.00..44.38 rows=10 width=0) (actual time=0.046..0.046 rows=1 loops=1)
                     Index Cond: (mtid = ANY (($1)::bigint[]))
               ->  Bitmap Index Scan on publication_last_modifier_btree  (cost=0.00..46.94 rows=2468 width=0) (actual time=0.011..0.011 rows=0 loops=1)
                     Index Cond: (last_modifier = 21321)
 Planning time: 0.704 ms
 Execution time: 0.153 ms

Context

StackExchange Database Administrators Q#114337, answer score: 11

Revisions (0)

No revisions yet.