snippetsqlMinor
Avoiding a sort on an already clustered index for group by
Viewed 0 times
clusteredindexgroupalreadyforsortavoiding
Problem
On a table
Here is the query plan:
However, the table already has clustered index on
Why doesn't Postgres simply scan the table and compute the group by? Why does it need to sort on
How can I force Postgres to use the clustered index for the group by?
T with two fields, pid and did, the following query results in a seq. scan followed by a sort on pid: select count(did), pid
from T
group by pidHere is the query plan:
GroupAggregate (cost=21566127.88..22326004.09 rows=987621 width=8)
-> Sort (cost=21566127.88..21816127.88 rows=100000000 width=8)
Sort Key: pid
-> Seq Scan on tc (cost=0.00..1442478.00 rows=100000000 width=8)However, the table already has clustered index on
pid. Why doesn't Postgres simply scan the table and compute the group by? Why does it need to sort on
pid again?How can I force Postgres to use the clustered index for the group by?
Solution
As you have no
If you were on PostgreSQL 9.2 (or above) and you had an index on
WHERE clause, you (the query, that is) must scan the whole table to be able to compute the aggregate. Once it has all the data, it is much easier to sort them directly than to go to an index and match the rows and return them in order. This latter would involve further I/O which is slow compared to operation in the memory.If you were on PostgreSQL 9.2 (or above) and you had an index on
(pid, did) then the optimizer might have chosen an index scan instead.Context
StackExchange Database Administrators Q#46997, answer score: 3
Revisions (0)
No revisions yet.