patternsqlMinor
Extremely slow query on indexed column in Postgres
Viewed 0 times
postgrescolumnextremelyqueryslowindexed
Problem
I'm getting a extremely slow query on an indexed column. Given the query
The table description is:
It's a Postgres server, running on an AWS RDS t2 micro instance.
The table has around 2.6 million rows.
SELECT *
FROM orders
WHERE shop_id = 3828
ORDER BY updated_at desc
LIMIT 1explain analyze returned:QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..594.45 rows=1 width=175) (actual time=202106.830..202106.831 rows=1 loops=1)
-> Index Scan Backward using index_orders_on_updated_at on orders (cost=0.43..267901.54 rows=451 width=175) (actual time=202106.827..202106.827 rows=1 loops=1)
Filter: (shop_id = 3828)
Rows Removed by Filter: 1604818
Planning time: 98.579 ms
Execution time: 202127.514 ms
(6 rows)The table description is:
Table "public.orders"
Column | Type | Modifiers
--------------------+-----------------------------+---------------------------------------------------------------
id | integer | not null default nextval('orders_id_seq'::regclass)
sent | boolean | default false
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
name | character varying(255) |
shop_id | integer |
recovered_at | timestamp without time zone |
total_price | double precision |
Indexes:
"orders_pkey" PRIMARY KEY, btree (id)
"index_orders_on_recovered_at" btree (recovered_at)
"index_orders_on_shop_id" btree (shop_id)
"index_orders_on__updated_at" btree (updated_at)It's a Postgres server, running on an AWS RDS t2 micro instance.
The table has around 2.6 million rows.
Solution
There is a subtle problem hidden in your
Would sort NULL values first. I assume you do not want that. Your column
The multicolumn index @Ste Bov mentioned should be adapted accordingly:
Then you get a basic
Related:
Asides
You can save a bit of wasted disk space by optimizing the sequence of columns for your big table (which makes everything a bit faster):
See:
Add
Consider
ORDER BY clause:ORDER BY updated_at DESCWould sort NULL values first. I assume you do not want that. Your column
updated_at can be NULL (lacks a NOT NULL constraint). The missing constraint should probably be added. Your query should be fixed either way:SELECT *
FROM orders
WHERE shop_id = 3828
ORDER BY updated_at DESC NULLS LAST
LIMIT 1;The multicolumn index @Ste Bov mentioned should be adapted accordingly:
CREATE INDEX orders_shop_id_updated_at_idx ON orders (shop_id, updated_at DESC NULLS LAST);Then you get a basic
Index Scan instead of the (almost as fast) Index Scan Backward, and you won't get an additional index condition: Index Cond: (updated_at IS NOT NULL).Related:
- Unused index in range of dates query
- Optimizing queries on a range of timestamps (two columns)
Asides
You can save a bit of wasted disk space by optimizing the sequence of columns for your big table (which makes everything a bit faster):
id | integer | not null default nextval( ...
shop_id | integer |
sent | boolean | default false
name | varchar(255) |
total_price | double precision |
recovered_at | timestamp without time zone |
created_at | timestamp without time zone |
updated_at | timestamp without time zone |See:
- Configuring PostgreSQL for read performance
Add
NOT NULL constraints to all columns that cannot be NULL.Consider
text or varchar instead of varchar(255), timestamptz instead of timestamp and integer for the price (as Cents) or numeric (for fractional numbers) which is an arbitrary precision type and stores your values exactly as given. Never use a lossy floating point type for a "price" or anything to do with money.Code Snippets
ORDER BY updated_at DESCSELECT *
FROM orders
WHERE shop_id = 3828
ORDER BY updated_at DESC NULLS LAST
LIMIT 1;CREATE INDEX orders_shop_id_updated_at_idx ON orders (shop_id, updated_at DESC NULLS LAST);id | integer | not null default nextval( ...
shop_id | integer |
sent | boolean | default false
name | varchar(255) |
total_price | double precision |
recovered_at | timestamp without time zone |
created_at | timestamp without time zone |
updated_at | timestamp without time zone |Context
StackExchange Database Administrators Q#112810, answer score: 8
Revisions (0)
No revisions yet.