gotchasqlMinor
Why does PostgreSQL aggregate result in the Index or Seq scan outputting all columns of the table?
Viewed 0 times
postgresqlresultwhythescancolumnsalloutputtingseqtable
Problem
Doing a query like this:
results in a query plan like this:
The result is correct, but ideally, for performance reasons I would like to have weight omitted in the output of index/seq scan - what am I doing wrong?
This is the output of EXPLAIN ANALYZE VERBOSE:
I run the following queries based on Craig's comment:
and
This is the table definition
```
CREATE TABLE x_2009
(
c_a character varying(70),
c_b character varying(70),
c_c character varying(70),
c_d character varying(70),
c_e character varying(70),
c_f double precision,
SELECT fruit, avg(cost) FROM fruit_table GROUP BY fruitresults in a query plan like this:
Sort (cost=3713873.41..3713874.46 rows=420 width=16)
Output: fruit, (avg(cost))
Sort Key: fruittable.fruit
-> HashAggregate (cost=3713849.87..3713855.12 rows=420 width=16)
Output: fruit, (avg(cost))
-> Seq Scan on public.fruit (cost=0.00..3677849.91 rows=7199991 width=16)
Output: fruit, weight, cost, ...The result is correct, but ideally, for performance reasons I would like to have weight omitted in the output of index/seq scan - what am I doing wrong?
This is the output of EXPLAIN ANALYZE VERBOSE:
Sort (cost=3713873.41..3713874.46 rows=420 width=16) (actual time=31156.940..31156.958 rows=612 loops=1)
Output: fruit, (avg(cost))
Sort Key: fruittable.fruit
Sort Method: quicksort Memory: 53kB
-> HashAggregate (cost=3713849.87..3713855.12 rows=420 width=16) (actual time=31156.272..31156.729 rows=612 loops=1)
Output: fruit, avg(cost)
-> Seq Scan on public.fruittable (cost=0.00..3677849.91 rows=7199991 width=16) (actual time=0.004..15050.636 rows=7199991 loops=1)
Output: fruit, weight, cost, ...
Total runtime: 31157.097 ms- EDIT
I run the following queries based on Craig's comment:
SELECT
pg_stat_get_live_tuples('fruittable'::regclass),
pg_stat_get_dead_tuples('fruittable'::regclass),
pg_relation_size('fruittable'::regclass),
pg_total_relation_size('fruittable'::regclass);
-- OUTPUT:
pg_stat_get_live_tuples: 0
pg_stat_get_dead_tuples: 0
pg_relation_size: 29539123200
pg_total_relation_size: 32118743040and
SELECT avg(pg_column_size(x)) FROM (SELECT * FROM fruittable LIMIT 1000) AS x;
-- OUTPUT
avg: 1354.68- EDIT
This is the table definition
```
CREATE TABLE x_2009
(
c_a character varying(70),
c_b character varying(70),
c_c character varying(70),
c_d character varying(70),
c_e character varying(70),
c_f double precision,
Solution
"Seq scans" means full table scan. It's always going to read all of the columns on your table. It's a full table scan. PostgreSQL uses row-based storage, so it can't just read one column at a time.
The reason your query takes so long is that you're reading 7.2 million rows in the table and aggregating them. Apparently, on a fairly slow computer.
The reason your query takes so long is that you're reading 7.2 million rows in the table and aggregating them. Apparently, on a fairly slow computer.
Context
StackExchange Database Administrators Q#25739, answer score: 3
Revisions (0)
No revisions yet.