patternsqlMinor
Postgres index for aggregate functions
Viewed 0 times
functionspostgresforindexaggregate
Problem
Query:
Index attempt:
Essentially, I want it to index the result of
SELECT
"places_place"."id"
FROM "places_place"
LEFT OUTER JOIN "units_unit" ON ("places_place"."id" = "units_unit"."place_id")
GROUP BY "places_place"."id"
HAVING SUM("units_unit"."quantity") >= 123Index attempt:
CREATE INDEX units_quantity_sum ON units_unit (SUM("units_unit"."quantity"));
-- ERROR: aggregate functions are not allowed in index expressionsEssentially, I want it to index the result of
SUM without storing the result in a separate column of the table. How can I create an index to do this (or is there a better way to optimize this query)?EXPLAIN ANALYZE of query with 10,000 rows in places_place and 25,000 in units_unit:HashAggregate (cost=2057.31..2157.33 rows=10002 width=4) (actual time=38.121..41.174 rows=7727 loops=1)
Group Key: places_place.id
Filter: (sum(units_unit.quantity) >= 5)
Rows Removed by Filter: 2275
-> Hash Right Join (cost=594.04..1932.22 rows=25018 width=6) (actual time=6.383..28.578 rows=26727 loops=1)
Hash Cond: (units_unit.place_id = places_place.id)
-> Seq Scan on units_unit (cost=0.00..994.18 rows=25018 width=6) (actual time=0.003..7.279 rows=25018 loops=1)
-> Hash (cost=469.02..469.02 rows=10002 width=4) (actual time=6.311..6.311 rows=10002 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 480kB
-> Seq Scan on places_place (cost=0.00..469.02 rows=10002 width=4) (actual time=0.007..3.560 rows=10002 loops=1)
Planning time: 0.584 ms
Execution time: 42.643 ms
Solution
You have two easy options
Both of these will allow you to cache the
Before you go down that route though PostgreSQL 9.6 enables parallel seq scans and aggregation which will increase performance. In fact, it's an ideal use case. If you just need this to be faster try setting
- You can use a
MATERIALIZED VIEW
- You can also use a
TRIGGERthat inserts another table.
Both of these will allow you to cache the
SUM(). I would go with the MATERIALIZED VIEW unless you need up to date changes all the time.Before you go down that route though PostgreSQL 9.6 enables parallel seq scans and aggregation which will increase performance. In fact, it's an ideal use case. If you just need this to be faster try setting
max_parallel_workers_per_gatherContext
StackExchange Database Administrators Q#164976, answer score: 3
Revisions (0)
No revisions yet.