gotchasqlModerate
Why does this implicit join get planned differently than an explicit join?
Viewed 0 times
thisdifferentlywhyimplicitthanplannedexplicitjoingetdoes
Problem
In this answer I explain SQL-89s implicit syntax.
But I noticed different query plans while playing around:
Versus this:
Why would one of them show
Isn't the implicit syntax just rewritten to be the same as the explicit syntax?
But I noticed different query plans while playing around:
EXPLAIN ANALYZE
SELECT *
FROM (values(1)) AS t(x), (values(2)) AS g(y);
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
Planning time: 0.052 ms
Execution time: 0.020 ms
(3 rows)Versus this:
EXPLAIN ANALYZE
SELECT *
FROM (values(1)) AS t(x)
CROSS JOIN (values(2)) AS g(y);
QUERY PLAN
------------------------------------------------------------------------------------------------
Subquery Scan on g (cost=0.00..0.02 rows=1 width=4) (actual time=0.004..0.005 rows=1 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
Planning time: 0.075 ms
Execution time: 0.027 ms
(4 rows)Why would one of them show
Subquery Scan?Isn't the implicit syntax just rewritten to be the same as the explicit syntax?
Solution
Isn't the implicit syntax just rewritten to be the same as the explicit syntax?
Not necessarily. You are building on slightly incorrect assumptions. Like I explained under the referenced question:
Comma-separated items in the
Obviously, the planner is smart enough to be able to treat
Simplifying the query plan is easier for a
Not necessarily. You are building on slightly incorrect assumptions. Like I explained under the referenced question:
- What does [FROM x, y] mean in Postgres?
Comma-separated items in the
FROM list are almost, but not quite identical to explicit CROSS JOIN notation. Explicit joins bind stronger. The query planner has to handle both cases differently in certain cases.Obviously, the planner is smart enough to be able to treat
VALUES expression with a single row expression with a simplified plan. We see a more complex plan for more than one row in the VALUES expression:EXPLAIN ANALYZE
SELECT *
FROM (VALUES (1), (2)) t(x)
, (VALUES (2), (3)) g(y);Nested Loop (cost=0.00..0.11 rows=4 width=8) (actual time=0.059..0.064 rows=4 loops=1)
-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4) (actual time=0.004..0.004 rows=2 loops=1)
-> Materialize (cost=0.00..0.04 rows=2 width=4) (actual time=0.025..0.026 rows=2 loops=2)
-> Values Scan on "*VALUES*_1" (cost=0.00..0.03 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1)Simplifying the query plan is easier for a
VALUES expression separated by comma(s). When bound by explicit join(s), Postgres needs to consider join conditions before combining with other comma-separated FROM items. I expect the "subqery scan" we see for this case is a (completely harmless) side effect of the more complex code path for this case.Code Snippets
EXPLAIN ANALYZE
SELECT *
FROM (VALUES (1), (2)) t(x)
, (VALUES (2), (3)) g(y);Nested Loop (cost=0.00..0.11 rows=4 width=8) (actual time=0.059..0.064 rows=4 loops=1)
-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4) (actual time=0.004..0.004 rows=2 loops=1)
-> Materialize (cost=0.00..0.04 rows=2 width=4) (actual time=0.025..0.026 rows=2 loops=2)
-> Values Scan on "*VALUES*_1" (cost=0.00..0.03 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1)Context
StackExchange Database Administrators Q#167207, answer score: 10
Revisions (0)
No revisions yet.