patternsqlMinor
Very simple read-only SELECT not run in parallel in postrgresql
Viewed 0 times
simplereadpostrgresqlparallelveryselectnotonlyrun
Problem
I am trying to do a simple SELECT on a large table, but it is not being run in parallel:
If I do a max() operation it runs in parallel:
Why is the SELECT not done in parallel?
The table has several billion rows. The DBA has done some indexing - I don't know what that really means, but I have read that index scans are not parallel in PostgreSQL 9.6.8, but it is doing a sequential scan anyway.
I am using PostgreSQL 9.6.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit
The table has structure:
```
my_db=> \d my_table
id | bigint | not null default nextval('my_table_id_seq'::regclass)
datetime | timestam
my_db=> explain (analyze, buffers) select value from my_table;
Seq Scan on my_table (cost=0.00..94433357.34 rows=5441810434 width=4) (actual time=0.029..1519364.787 rows=4897721407 loops=1)
Buffers: shared hit=1126394 read=38888859 dirtied=1926396 written=284
Planning time: 2.277 ms
Execution time: 2468388.489 msIf I do a max() operation it runs in parallel:
my_db=> explain (analyze, buffers) select max(value) from my_table;
Finalize Aggregate (cost=48552456.41..48552456.42 rows=1 width=4) (actual time=1347796.993..1347796.993 rows=1 loops=1)
Buffers: shared hit=1173785 read=38870105 dirtied=30813
-> Gather (cost=48552455.58..48552456.39 rows=8 width=4) (actual time=1347796.953..1347796.969 rows=9 loops=1)
Workers Planned: 8
Workers Launched: 8
Buffers: shared hit=1173785 read=38870105 dirtied=30813
-> Partial Aggregate (cost=48551455.58..48551455.59 rows=1 width=4) (actual time=1347790.862..1347790.862 rows=1 loops=9)
Buffers: shared hit=1172674 read=38870104 dirtied=30813
-> Parallel Seq Scan on my_table (cost=0.00..46849720.06 rows=680694206 width=4) (actual time=0.213..1048428.195 rows=544206420 loops=9)
Buffers: shared hit=1172674 read=38870104 dirtied=30813
Planning time: 0.761 ms
Execution time: 1347985.127 msWhy is the SELECT not done in parallel?
The table has several billion rows. The DBA has done some indexing - I don't know what that really means, but I have read that index scans are not parallel in PostgreSQL 9.6.8, but it is doing a sequential scan anyway.
I am using PostgreSQL 9.6.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit
The table has structure:
```
my_db=> \d my_table
id | bigint | not null default nextval('my_table_id_seq'::regclass)
datetime | timestam
Solution
The query leader is the only process that communicates with the front end. All the SELECTed data for the non-aggregated version of the query has to be marshalled up from the parallel workers to the query leader. That is expensive, so using parallel query here would almost certainly be counter-productive. You might be able to get one anyway by setting parallel_tuple_cost and parallel_setup_cost to zero.
When doing the count(*) aggregate, each parallel worker only needs to send up one piece of information, a partial count, to the leader. This is more obviously worthwhile.
When doing the count(*) aggregate, each parallel worker only needs to send up one piece of information, a partial count, to the leader. This is more obviously worthwhile.
Context
StackExchange Database Administrators Q#210205, answer score: 3
Revisions (0)
No revisions yet.