HiveBrain v1.2.0
Get Started
← Back to all entries
principlesqlMinor

When to use lateral joins vs correlated subqueries in Postgres?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgrescorrelatedlateralwhenusesubqueriesjoins

Problem

These two queries:

SELECT username, location, task, ts_next - ts AS diff
FROM (
        SELECT username, location, task, ts, (
                SELECT ts AS ts_next
                FROM switches
                WHERE username = a.username
                AND ts > a.ts
                ORDER BY ts ASC LIMIT 1
        )
        FROM switches a
        WHERE task IS NOT NULL
) c
ORDER BY username, ts
;


and

SELECT username, location, task, ts_next - ts AS diff
FROM (
        SELECT username, location, task, ts
        FROM switches
        WHERE task IS NOT NULL
) a
LEFT JOIN LATERAL (
        SELECT ts AS ts_next
        FROM switches
        WHERE username = a.username
        AND ts > a.ts
        ORDER BY ts ASC LIMIT 1
) b
ON true
ORDER BY username, ts
;


give the same results.

When and why should I use one form over the other?

(The switches table is from a time/task tracking service, where only task switches are recorded, but task durations need to be calculated.)

The query plans are different, is one better than the other?

```
QUERY PLAN
-----------------------------------------------------------------------------------------------
Sort (cost=27.96..28.01 rows=19 width=37)
Sort Key: a.username, a.ts
-> Seq Scan on switches a (cost=0.00..27.55 rows=19 width=37)
Filter: (task IS NOT NULL)
SubPlan 1
-> Limit (cost=1.38..1.38 rows=1 width=8)
-> Sort (cost=1.38..1.39 rows=4 width=8)
Sort Key: switches.ts
-> Seq Scan on switches (cost=0.00..1.36 rows=4 width=8)
Filter: ((ts > a.ts) AND ((username)::text = (a.username)::text))
(10 rows)

QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Sort (cost=28.34..2

Solution

Using the subquery version would become awkward if you wanted to return more than one column or more than one row from the innermost subquery. If you never want to do that, then I find the first form easier to read. (Or at least it would be if you removed the outermost subquery and reformatted a bit--can't the expression and the ORDER BY be done directly in the middle query "c"?)

The difference in plan between the two is unlikely to have any practical significance.

Context

StackExchange Database Administrators Q#237219, answer score: 3

Revisions (0)

No revisions yet.