patternsqlMinor

Slow left join lateral in subquery

Submitted by: @import:stackexchange-dba·Mar 10, 2026·

Viewed 0 times

leftjoinslowlateralsubquery

Problem

Solution

You have a lot of redundant work in your query. The first CTE last_rent retrieves the "first" row per month for every unit in the whole table and materializes the result, while only a fraction of that is reused in the later CTE unit_rent_month. Plus, you repeat the work there.

You can basically remove the first CTE last_rent completely. And while being at it, remove all CTEs. All you actually need is a single LATERAL subquery. This radically rewritten query should do the same, much faster (untested):

SELECT u.id AS unit_id, u.community_id, u.bedrooms, u.bathrooms
     , last_rent.amount_cents
     , date_trunc('month', mon.dt)::date AS period
FROM   units u
CROSS  JOIN generate_series(timestamp '2019-02-01'  -- switched to lower bound
                          , timestamp '2019-02-01'  -- and see below
                          , interval '1 month') mon(dt)
LEFT   JOIN LATERAL (
   SELECT li.amount_cents
   FROM   leases     l
   JOIN   line_items li ON l.id = li.lease_id
   WHERE  l.unit_id = u.id
   AND    l.community_id = u.community_id  -- redundant?
   AND    li.name  = 'RNT'
   AND    li.start = mon.dt  -- also required, see below
   ORDER  BY li.start DESC    -- NULLS LAST ?
   LIMIT  1
   ) last_rent ON true
WHERE  u.community_id IN (X)  -- ?
ORDER  BY mon.dt, u.id;

Your example produces only a single month. The intention obviously is to optionally produce several consecutive months at once. I changed the generate_series() expression subtly - here is why:

Generating time series between two dates in PostgreSQL

I also switched to providing lower bounds like you did in your updated query. That's slightly more convenient.

The way you repeated the condition community_id IN (X) could result in different community_id matching between units and leases if the IN clause has more than one expression. I tightened that to match exactly with AND l.community_id = u.community_id. Adapt if that's not as desired. Seems like the column leases.community_id is redundant to begin with, then you can drop this predicate completely.

This predicate makes a logical difference:

AND    li.start >= mon.dt

You dropped it in your updated query, but maybe you should keep it to always get the latest amount_cents for the month of the result row - NULL if there is none. It's what your original query did.

If you drop it, you get the latest amount_cents in the given date range, possibly from a previous month. I.e. you can get a different result for a month if you change the lower bound of the date range.

Related:

PostgreSQL: Generate a series of dates for each group in a table

Optimize GROUP BY query to retrieve latest record per user

Indexes

Depending on undisclosed details, you might get faster results, yet, with a different query style, or by adding one or more multicolumn or partial indexes. Shots in the dark, this might get you very fast index-only scans:

CREATE INDEX foo ON leases (unit_id, id);

CREATE INDEX bar ON line_items (lease_id, start DESC, amount_cents)
WHERE name = 'RNT';

Or, slightly better yet, in Postgres 11:

CREATE INDEX bar ON line_items (lease_id, start DESC) INCLUDING (amount_cents)
WHERE name = 'RNT';

WHERE name = 'RNT' typically only pays if 'RNT' is not too common.

Related (consider the preconditions mentioned there):

Can Postgres use an index-only scan for this query with joined tables?

Code Snippets

SELECT u.id AS unit_id, u.community_id, u.bedrooms, u.bathrooms
     , last_rent.amount_cents
     , date_trunc('month', mon.dt)::date AS period
FROM   units u
CROSS  JOIN generate_series(timestamp '2019-02-01'  -- switched to lower bound
                          , timestamp '2019-02-01'  -- and see below
                          , interval '1 month') mon(dt)
LEFT   JOIN LATERAL (
   SELECT li.amount_cents
   FROM   leases     l
   JOIN   line_items li ON l.id = li.lease_id
   WHERE  l.unit_id = u.id
   AND    l.community_id = u.community_id  -- redundant?
   AND    li.name  = 'RNT'
   AND    li.start <  mon.dt + interval '1 month'
   AND    li.start >= mon.dt  -- also required, see below
   ORDER  BY li.start DESC    -- NULLS LAST ?
   LIMIT  1
   ) last_rent ON true
WHERE  u.community_id IN (X)  -- ?
ORDER  BY mon.dt, u.id;

AND    li.start >= mon.dt

CREATE INDEX foo ON leases (unit_id, id);

CREATE INDEX bar ON line_items (lease_id, start DESC, amount_cents)
WHERE name = 'RNT';

CREATE INDEX bar ON line_items (lease_id, start DESC) INCLUDING (amount_cents)
WHERE name = 'RNT';

Context

StackExchange Database Administrators Q#230516, answer score: 3

Revisions (0)

No revisions yet.