HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Extremely slow query on indexed column in Postgres

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgrescolumnextremelyqueryslowindexed

Problem

I'm getting a extremely slow query on an indexed column. Given the query

SELECT * 
FROM orders 
WHERE shop_id = 3828 
ORDER BY updated_at desc 
LIMIT 1


explain analyze returned:

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..594.45 rows=1 width=175) (actual time=202106.830..202106.831 rows=1 loops=1)
   ->  Index Scan Backward using index_orders_on_updated_at on orders  (cost=0.43..267901.54 rows=451 width=175) (actual time=202106.827..202106.827 rows=1 loops=1)
         Filter: (shop_id = 3828)
         Rows Removed by Filter: 1604818
 Planning time: 98.579 ms
 Execution time: 202127.514 ms
(6 rows)


The table description is:

Table "public.orders"
       Column       |            Type             |                           Modifiers
--------------------+-----------------------------+---------------------------------------------------------------
 id                 | integer                     | not null default nextval('orders_id_seq'::regclass)
 sent               | boolean                     | default false
 created_at         | timestamp without time zone |
 updated_at         | timestamp without time zone |
 name               | character varying(255)      |
 shop_id            | integer                     |
 recovered_at       | timestamp without time zone |
 total_price        | double precision            |
Indexes:
    "orders_pkey" PRIMARY KEY, btree (id)
    "index_orders_on_recovered_at" btree (recovered_at)
    "index_orders_on_shop_id" btree (shop_id)
    "index_orders_on__updated_at" btree (updated_at)


It's a Postgres server, running on an AWS RDS t2 micro instance.

The table has around 2.6 million rows.

Solution

There is a subtle problem hidden in your ORDER BY clause:

ORDER BY updated_at DESC


Would sort NULL values first. I assume you do not want that. Your column updated_at can be NULL (lacks a NOT NULL constraint). The missing constraint should probably be added. Your query should be fixed either way:

SELECT * 
FROM   orders 
WHERE  shop_id = 3828 
ORDER  BY updated_at DESC NULLS LAST
LIMIT  1;


The multicolumn index @Ste Bov mentioned should be adapted accordingly:

CREATE INDEX orders_shop_id_updated_at_idx ON orders (shop_id, updated_at DESC NULLS LAST);


Then you get a basic Index Scan instead of the (almost as fast) Index Scan Backward, and you won't get an additional index condition: Index Cond: (updated_at IS NOT NULL).

Related:

  • Unused index in range of dates query



  • Optimizing queries on a range of timestamps (two columns)



Asides

You can save a bit of wasted disk space by optimizing the sequence of columns for your big table (which makes everything a bit faster):

id                 | integer                     | not null default nextval( ...
shop_id            | integer                     |
sent               | boolean                     | default false
name               | varchar(255)                |
total_price        | double precision            |
recovered_at       | timestamp without time zone |
created_at         | timestamp without time zone |
updated_at         | timestamp without time zone |


See:

  • Configuring PostgreSQL for read performance



Add NOT NULL constraints to all columns that cannot be NULL.

Consider text or varchar instead of varchar(255), timestamptz instead of timestamp and integer for the price (as Cents) or numeric (for fractional numbers) which is an arbitrary precision type and stores your values exactly as given. Never use a lossy floating point type for a "price" or anything to do with money.

Code Snippets

ORDER BY updated_at DESC
SELECT * 
FROM   orders 
WHERE  shop_id = 3828 
ORDER  BY updated_at DESC NULLS LAST
LIMIT  1;
CREATE INDEX orders_shop_id_updated_at_idx ON orders (shop_id, updated_at DESC NULLS LAST);
id                 | integer                     | not null default nextval( ...
shop_id            | integer                     |
sent               | boolean                     | default false
name               | varchar(255)                |
total_price        | double precision            |
recovered_at       | timestamp without time zone |
created_at         | timestamp without time zone |
updated_at         | timestamp without time zone |

Context

StackExchange Database Administrators Q#112810, answer score: 8

Revisions (0)

No revisions yet.