HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Postgres 10 - Query slows down with order by

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
orderpostgreswithqueryslowsdown

Problem

I am running a query like

select id from students where school_id='67153fb1-8f79-441d-a747-ca3778cf6d3d';


on table that looks like

Table "public.students"
          Column       |            Type             |             Modifiers              
    -------------------+-----------------------------+------------------------------------
     id                | uuid                        | not null default gen_random_uuid()
     school_id        | uuid                        | 
Indexes:
    "students_pkey" PRIMARY KEY, btree (id)
    "students_school_id_idx" btree (school_id)


The query plan for the select statement with just where looks like below-

explain select id from students where school_id='67153fb1-8f79-441d-a747-ca3778cf6d3d';
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on students  (cost=581.83..83357.10 rows=24954 width=16)
   Recheck Cond: (school_id = '67153fb1-8f79-441d-a747-ca3778cf6d3d'::uuid)
   ->  Bitmap Index Scan on students_school_id_idx  (cost=0.00..575.59 rows=24954 width=0)
         Index Cond: (school_id = '67153fb1-8f79-441d-a747-ca3778cf6d3d'::uuid)


This is fairly fast.

Now we add order by to the query with id that degrades the query.(Such a query is generated by Rails like student.first with some condition)

```
explain select id from students where school_id='67153fb1-8f79-441d-a747-ca3778cf6d3d' order by id asc limit 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..488.51 rows=1 width=16)
-> Index Scan using students_pkey on students (cost=0.43..12179370.22 rows=24954 width=16)
Filter: (school_id = '67153fb1-8f79-44

Solution

PostgreSQL thinks that it will be faster avoiding the sort for the ORDER BY by scanning the rows in the sort order and discarding rows until it finds one with the right school_id.

There can be two reasons why this takes longer than expected:

-
The table statistics is off, and PostgreSQL overestimates the number of rows with that school_id.

Calculate new statistics, possibly with a higher value for default_statistics_target, to verify if that is the problem:

ANALYZE students;


-
The (many) rows with the correct school_id all happen to have a rather high id, so PostgreSQL has to scan way more rows than it bargained for until it finds a match.

In that case, you should modify the ORDER BY clause so that PostgreSQL cannot use the wrong index:

... ORDER BY id + 0

Code Snippets

ANALYZE students;
... ORDER BY id + 0

Context

StackExchange Database Administrators Q#258503, answer score: 3

Revisions (0)

No revisions yet.