principlesqlMinor
When to use lateral joins vs correlated subqueries in Postgres?
Viewed 0 times
postgrescorrelatedlateralwhenusesubqueriesjoins
Problem
These two queries:
and
give the same results.
When and why should I use one form over the other?
(The
The query plans are different, is one better than the other?
```
QUERY PLAN
-----------------------------------------------------------------------------------------------
Sort (cost=27.96..28.01 rows=19 width=37)
Sort Key: a.username, a.ts
-> Seq Scan on switches a (cost=0.00..27.55 rows=19 width=37)
Filter: (task IS NOT NULL)
SubPlan 1
-> Limit (cost=1.38..1.38 rows=1 width=8)
-> Sort (cost=1.38..1.39 rows=4 width=8)
Sort Key: switches.ts
-> Seq Scan on switches (cost=0.00..1.36 rows=4 width=8)
Filter: ((ts > a.ts) AND ((username)::text = (a.username)::text))
(10 rows)
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Sort (cost=28.34..2
SELECT username, location, task, ts_next - ts AS diff
FROM (
SELECT username, location, task, ts, (
SELECT ts AS ts_next
FROM switches
WHERE username = a.username
AND ts > a.ts
ORDER BY ts ASC LIMIT 1
)
FROM switches a
WHERE task IS NOT NULL
) c
ORDER BY username, ts
;and
SELECT username, location, task, ts_next - ts AS diff
FROM (
SELECT username, location, task, ts
FROM switches
WHERE task IS NOT NULL
) a
LEFT JOIN LATERAL (
SELECT ts AS ts_next
FROM switches
WHERE username = a.username
AND ts > a.ts
ORDER BY ts ASC LIMIT 1
) b
ON true
ORDER BY username, ts
;give the same results.
When and why should I use one form over the other?
(The
switches table is from a time/task tracking service, where only task switches are recorded, but task durations need to be calculated.)The query plans are different, is one better than the other?
```
QUERY PLAN
-----------------------------------------------------------------------------------------------
Sort (cost=27.96..28.01 rows=19 width=37)
Sort Key: a.username, a.ts
-> Seq Scan on switches a (cost=0.00..27.55 rows=19 width=37)
Filter: (task IS NOT NULL)
SubPlan 1
-> Limit (cost=1.38..1.38 rows=1 width=8)
-> Sort (cost=1.38..1.39 rows=4 width=8)
Sort Key: switches.ts
-> Seq Scan on switches (cost=0.00..1.36 rows=4 width=8)
Filter: ((ts > a.ts) AND ((username)::text = (a.username)::text))
(10 rows)
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Sort (cost=28.34..2
Solution
Using the subquery version would become awkward if you wanted to return more than one column or more than one row from the innermost subquery. If you never want to do that, then I find the first form easier to read. (Or at least it would be if you removed the outermost subquery and reformatted a bit--can't the expression and the ORDER BY be done directly in the middle query "c"?)
The difference in plan between the two is unlikely to have any practical significance.
The difference in plan between the two is unlikely to have any practical significance.
Context
StackExchange Database Administrators Q#237219, answer score: 3
Revisions (0)
No revisions yet.