patternsqlMinor
Query performance of index scans slower than parallel seq scan on Postgres
Viewed 0 times
scanpostgresseqqueryslowerthanscansparallelperformanceindex
Problem
I've been trying to debug a performance problem on PostgreSQL 10 on AWS Aurora.
Specifically I'm trying to understand performance problems on "cold" queries where the buffer cache does not contain cached pages.
I have a table with an index and when the planner uses the index the query is substantially slower (spending more time on IO from what I can tell). However, when I drop an index the query is faster, it uses a parallel sequential scan.
The Schema of the table:
The index DDL is:
And the query is:
Check the following two explains - note that both explains show minimal or no buffer hits. One shows the execution with the index and the other shows execution after the index has been dropped:
No index -> https://explain.depesz.com/s/N4uo
With index -> https://explain.depesz.com/s/69WS
The plan that includes the index scan takes 193 seconds vs 43 seconds for a sequential scan.
Why is step 2 in the "With-index" plan so slow? It reads in only 118372 buffers (about 118 MB of data), y
Specifically I'm trying to understand performance problems on "cold" queries where the buffer cache does not contain cached pages.
I have a table with an index and when the planner uses the index the query is substantially slower (spending more time on IO from what I can tell). However, when I drop an index the query is faster, it uses a parallel sequential scan.
The Schema of the table:
CREATE TABLE max_test_2018_3
(
timestamp_ timestamp,
person_alias varchar(128),
visitor_id varchar(128),
session_id varchar(24),
page_title varchar(1024),
location_host varchar(256),
location_path varchar(1024),
location_query varchar(1024),
location_fragment varchar(1024),
referrer_host varchar(256),
referrer_path varchar(1024),
referrer_query varchar(1024),
referrer_fragment varchar(1024),
duration integer,
location_secure boolean,
referrer_secure boolean,
id uuid,
person_id uuid,
custom_properties jsonb
);The index DDL is:
CREATE INDEX max_test_2018_3_ix ON max_test_2018_3
USING btree ( location_path varchar_pattern_ops);And the query is:
explain (analyze, buffers)
SELECT O.person_id, 1 as step1, min(timestamp_) AS timestamp_
FROM max_test_2018_3 O
WHERE O.location_path like '/tax-calculator%'
GROUP BY O.person_idCheck the following two explains - note that both explains show minimal or no buffer hits. One shows the execution with the index and the other shows execution after the index has been dropped:
No index -> https://explain.depesz.com/s/N4uo
With index -> https://explain.depesz.com/s/69WS
The plan that includes the index scan takes 193 seconds vs 43 seconds for a sequential scan.
Why is step 2 in the "With-index" plan so slow? It reads in only 118372 buffers (about 118 MB of data), y
Solution
All your time is spent doing table I/O, which is to be expected in a test with a cold cache.
The index scan takes longer, but only because it is not parallelized. The sequential scan uses more total time, but since it is run on 5 cores in parallel, it finishes faster.
PostgreSQL v10 can perform parallel index scans, so the remaining riddle is why it doesn't. A side riddle is why the sequential scan uses so many parallel workers.
I suspect that you messed up the configuration parameters. The relevant parameters are:
-
-
-
The storage parameter
-
Finally,
I'd expect that the index scan will be faster once you set these parameters to good values.
The index scan takes longer, but only because it is not parallelized. The sequential scan uses more total time, but since it is run on 5 cores in parallel, it finishes faster.
PostgreSQL v10 can perform parallel index scans, so the remaining riddle is why it doesn't. A side riddle is why the sequential scan uses so many parallel workers.
I suspect that you messed up the configuration parameters. The relevant parameters are:
-
max_parallel_workers_per_gather: limits how many workers can be used. You must have set this to 4 or more.-
min_parallel_table_scan_size: if a table is bigger than that, a parallel worker is planned. If the table size exceeds 3n-1 times that value, n parallel workers are planned. So either your table is very big, or you reduced the parameter. Alternatively:-
The storage parameter
parallel_workers on the table overrides the calculation based on min_parallel_table_scan_size as described above, so maybe you set that.-
Finally,
min_parallel_index_scan_size governs when a parallel index scan is considered. Either the index is small, or you lowered the parameter.I'd expect that the index scan will be faster once you set these parameters to good values.
Context
StackExchange Database Administrators Q#242918, answer score: 4
Revisions (0)
No revisions yet.