patternsqlMinor
Postgres CTE query for a single row not optimized
Viewed 0 times
postgresquerycteoptimizedsingleforrownot
Problem
From my database I am trying to get a weighted sum of a user's stats, and I will only be querying the table one or two users at a time, so I wrote it as a view.
Since it's a view, I pretend to calculate the sum for every single row in the table, and then I was hoping that the optimizer would realize when I am only asking for a single row and would optimize the query. However my query plan is massive, and is calculating 17 billion rows at its innermost point, where I think there should be at most a thousand.
Here is the query:
This is the query plan:
```
explain analyze select * from weighted_stats where player_id=76561197960269296
GroupAggregate (cost=53645.35..299471.72 rows=1 width=72) (actual time=1014.016..1014.016 rows=0 loops=1)
Group Key: kvs.id
CTE clf
-> Limit (cost=20.65..20.65 rows=1 width=84) (actual time=0.017..0.018 rows=1 loops=1)
-> Sort (cost=20.65..22.43 rows=710 width=84) (actual time=0.014..0.014 rows=1 loops=1)
Sort Key: classifiers.time_trained
Sort Method: quicksort Memory: 25kB
-> Seq Scan on classifiers (cost=0.00..17.10 rows=710 width=84) (actual time=0.003..0.005 rows=1 loops=1)
CTE kvs
-> Seq Scan on stats (cost=0.00..53572.18 rows=10318000 width=722) (actual time=0.037..530.337 rows=336036 loops=1)
CTE weights
-> Nested Loop (cost=0.00..20.02 rows=1000 width=64) (actual time=0.036..0.046 rows=2 loops=1)
-> CTE Scan on clf (cost=0.00..0.02 rows=1 width
Since it's a view, I pretend to calculate the sum for every single row in the table, and then I was hoping that the optimizer would realize when I am only asking for a single row and would optimize the query. However my query plan is massive, and is calculating 17 billion rows at its innermost point, where I think there should be at most a thousand.
Here is the query:
CREATE OR REPLACE VIEW weighted_stats AS
WITH
clf AS (SELECT * FROM classifiers order by time_trained desc limit 1),
weights AS (SELECT kv.key, kv.value from clf, each(clf.weights) AS kv),
kvs AS (
SELECT stats.player_id, kv.key, kv.value FROM
stats, each(stats.hstore_column) AS kv),
SELECT
stats.player_id,
SUM(kvs.value :: numeric * weights.value :: numeric) AS stats
FROM
kvs JOIN weights USING (key)
GROUP BY kvs.player_id;This is the query plan:
```
explain analyze select * from weighted_stats where player_id=76561197960269296
GroupAggregate (cost=53645.35..299471.72 rows=1 width=72) (actual time=1014.016..1014.016 rows=0 loops=1)
Group Key: kvs.id
CTE clf
-> Limit (cost=20.65..20.65 rows=1 width=84) (actual time=0.017..0.018 rows=1 loops=1)
-> Sort (cost=20.65..22.43 rows=710 width=84) (actual time=0.014..0.014 rows=1 loops=1)
Sort Key: classifiers.time_trained
Sort Method: quicksort Memory: 25kB
-> Seq Scan on classifiers (cost=0.00..17.10 rows=710 width=84) (actual time=0.003..0.005 rows=1 loops=1)
CTE kvs
-> Seq Scan on stats (cost=0.00..53572.18 rows=10318000 width=722) (actual time=0.037..530.337 rows=336036 loops=1)
CTE weights
-> Nested Loop (cost=0.00..20.02 rows=1000 width=64) (actual time=0.036..0.046 rows=2 loops=1)
-> CTE Scan on clf (cost=0.00..0.02 rows=1 width
Solution
Common Table Expressions are treated as an “optimization fence” by PostgreSQL: It will never push down predicates from the main query into the CTE nor collapse any joins across the CTE boundary. Instead, it will generally evaluate the whole CTE as it is, materializing the result; the main query will then access the temporary table generated from the CTE.
So yes, your query would likely benefit from converting the CTE into a subquery.
Note that an actual view (created by CREATE VIEW) does not act as an optimization fence. The view’s definition will be included in the query using it, then optimized as usual. For CTEs, there has been discussion about making the optimization fence behaviour optional so that they could be used “just” to make queries more readable. However, as of version 9.5, this hasn’t been implemented yet.
So yes, your query would likely benefit from converting the CTE into a subquery.
Note that an actual view (created by CREATE VIEW) does not act as an optimization fence. The view’s definition will be included in the query using it, then optimized as usual. For CTEs, there has been discussion about making the optimization fence behaviour optional so that they could be used “just” to make queries more readable. However, as of version 9.5, this hasn’t been implemented yet.
Context
StackExchange Database Administrators Q#127828, answer score: 7
Revisions (0)
No revisions yet.