patternsqlMinor
Nested Loop with Materialize resulting in very slow query on postgresql 9.5
Viewed 0 times
postgresqlwithqueryloopslowmaterializeresultingnestedvery
Problem
I have an analytics database on Postgres 9.5
I'm taking advantage of constraint partitioning with table inheritance to split up events into monthly tables. The base (parent) table contains no rows.
I have a relatively simple query but the planner is coming up with a completely insane plan where it materializes within a nested loop as can be seen from the explain.
The query
The explain: https://explain.depesz.com/s/IoO
My theory is that table inheritance with constraint partitioning is tripping up the planner somehow. If I replace the base table with the monthly table for April I get a sane plan with good performance:
https://explain.depesz.com/s/GpbU
NOTE The slight difference from the above query - i'm using the child table instead of the base table (ie no inheritance at play)
All stats are up to date via vacuum analyze. The relevant schema is below:
```
CREATE TABLE event_page (
id CHAR(24) PRIMARY KEY,
timestamp_ TIMESTAMP NOT NULL,
person_alias VARCHAR(128) NOT NULL,
visitor_id VARCHAR(128) NOT NULL,
session_id VARCHAR(24) NOT NULL,
ip_address VARCHAR(64) ,
user_agent VARCHAR(256) ,
operating_system VARCHAR(64) ,
device_type VARCHAR(64) ,
browse
I'm taking advantage of constraint partitioning with table inheritance to split up events into monthly tables. The base (parent) table contains no rows.
I have a relatively simple query but the planner is coming up with a completely insane plan where it materializes within a nested loop as can be seen from the explain.
The query
select count(person_id) as thecount from (
select distinct A.person_id from event_page as O
join alias as A on (O.person_alias = A.alias)
where O.timestamp_ between '2017-4-28 04:00:00' and '2017-4-29 03:59:59.999'
and O.location_host = 'www.foo.com' AND O.location_path= '/ca/sale'
) as alias_3494697The explain: https://explain.depesz.com/s/IoO
My theory is that table inheritance with constraint partitioning is tripping up the planner somehow. If I replace the base table with the monthly table for April I get a sane plan with good performance:
https://explain.depesz.com/s/GpbU
NOTE The slight difference from the above query - i'm using the child table instead of the base table (ie no inheritance at play)
select count(person_id) as thecount from (
select distinct A.person_id from event_page_2017_4 as O
join alias as A on (O.person_alias = A.alias)
where O.timestamp_ between '2017-4-28 04:00:00' and '2017-4-29 03:59:59.999'
and O.location_host = 'www.foo.com' AND O.location_path= '/ca/sale'
) as alias_3494697All stats are up to date via vacuum analyze. The relevant schema is below:
```
CREATE TABLE event_page (
id CHAR(24) PRIMARY KEY,
timestamp_ TIMESTAMP NOT NULL,
person_alias VARCHAR(128) NOT NULL,
visitor_id VARCHAR(128) NOT NULL,
session_id VARCHAR(24) NOT NULL,
ip_address VARCHAR(64) ,
user_agent VARCHAR(256) ,
operating_system VARCHAR(64) ,
device_type VARCHAR(64) ,
browse
Solution
A couple of observations:
1.
The query plan for the slow query shows:
Seq Scan on event_page o (cost=10,000,000,000.00..10,000,000,000.00 rows=1 width=274)
... which indicates that you ran with
2.
Did you enable constraint_exclusion like the manual advises here:
as desired.
3.
The day boundaries in your query are not in sync with the partitioning. (But that has no immediate effect for the given query, see comments.)
The
There may be confusion with
Either way, the expression in your
Postgres timestamps are implemented as 8-byte integers allowing 6 decimal places. A row with
Details:
Plus, this does not play well with constraint exclusion. More partitions than necessary may have to be read.
Use instead:
You may want get "days" in your partitions in sync with your queries - 00:00 vs. 04:00 (or the evil 03:59:59.999) - and maybe use
4.
I see for the index-only scan on
Heap Fetches: 1918543
... which is exceptionally high. There may or may not be problems with the visibility map. (See also: Updating The Visibility Map) Did you run
There have been various improvements to
5.
If
(Or use the subquery like you had it, may be faster than
If
Why do you have
6.
You use the data type
-
Would index lookup be noticeably faster with char vs varchar when all values are 36 chars
-
What is the overhead for varchar(n)?
1.
The query plan for the slow query shows:
Seq Scan on event_page o (cost=10,000,000,000.00..10,000,000,000.00 rows=1 width=274)
... which indicates that you ran with
SET enable_seqscan = off;. And that means your Postgres version positively could not find any other way than the sequential scan.2.
Did you enable constraint_exclusion like the manual advises here:
- Ensure that the constraint_exclusion configuration parameter is not disabled in
postgresql.conf. If it is, queries will not be optimized
as desired.
3.
The day boundaries in your query are not in sync with the partitioning. (But that has no immediate effect for the given query, see comments.)
The
CHECK constraint reads:CHECK ( timestamp_ >= '2017-4-01'::timestamp AND timestamp_ < '2017-05-01'::timestamp )There may be confusion with
timestamp vs. timestamptz and / or time zones. Or something got lost in translation and the question is misleading in that respect.Either way, the expression in your
WHERE clause opens up a sneaky corner case (even if your day boundaries were in sync):where O.timestamp_ between '2017-4-28 04:00:00' and '2017-4-29 03:59:59.999' Postgres timestamps are implemented as 8-byte integers allowing 6 decimal places. A row with
'2017-4-29 03:59:59.9995' would not behave as expected. Details:
- Ignoring timezones altogether in Rails and PostgreSQL
Plus, this does not play well with constraint exclusion. More partitions than necessary may have to be read.
Use instead:
WHERE o.timestamp_ >= '2017-4-28 04:00' -- include lower bound
AND o.timestamp_ < '2017-4-29 04:00' -- exclude upper boundYou may want get "days" in your partitions in sync with your queries - 00:00 vs. 04:00 (or the evil 03:59:59.999) - and maybe use
timestamptz to begin with.4.
I see for the index-only scan on
alias:Heap Fetches: 1918543
... which is exceptionally high. There may or may not be problems with the visibility map. (See also: Updating The Visibility Map) Did you run
VACUUM FULL on the table(s)? (Compare this thread on plpgsql-performance.) Try plain VACUUM, which updates the visibility map properly. But that's all hard to say, since you seem to have been running with enable_seqscan = off. Not sure how this plays out in combination with all the other stuff at work here.There have been various improvements to
VACUUM for Postgres 9.6 (among other things), this may explain the difference - or what you observed in pg 9.6 is coincidence and due to other factors.5.
If
alias.alias is unique (and referential integrity can be assumed), the query can be simplified. Just count(DISTINCT o.person_alias) without even joining to table alias at all. Like:SELECT count(DISTINCT o.person_alias) AS thecount
FROM event_page o
WHERE o.timestamp_ >= '2017-4-28 04:00' -- include lower bound
AND o.timestamp_ < '2017-4-29 04:00' -- exclude upper bound
AND o.location_host = 'www.foo.com'
AND o.location_path = '/ca/sale';(Or use the subquery like you had it, may be faster than
count(DISTINCT ...).)If
alias.alias is not unique, your query may be wrong / ambiguous.Why do you have
person_alias in table event_page and not person_id to begin with?6.
You use the data type
CHAR(24) as PK for your tables, which is almost certainly a bad choice. Consider a serial or bigserial column or maybe a uuid. Numeric types are smaller and faster and not burdened by collation rules, varying byte length, etc. Related:-
Would index lookup be noticeably faster with char vs varchar when all values are 36 chars
-
What is the overhead for varchar(n)?
Code Snippets
CHECK ( timestamp_ >= '2017-4-01'::timestamp AND timestamp_ < '2017-05-01'::timestamp )where O.timestamp_ between '2017-4-28 04:00:00' and '2017-4-29 03:59:59.999'WHERE o.timestamp_ >= '2017-4-28 04:00' -- include lower bound
AND o.timestamp_ < '2017-4-29 04:00' -- exclude upper boundSELECT count(DISTINCT o.person_alias) AS thecount
FROM event_page o
WHERE o.timestamp_ >= '2017-4-28 04:00' -- include lower bound
AND o.timestamp_ < '2017-4-29 04:00' -- exclude upper bound
AND o.location_host = 'www.foo.com'
AND o.location_path = '/ca/sale';Context
StackExchange Database Administrators Q#172286, answer score: 4
Revisions (0)
No revisions yet.