patternsqlModerate
Left join without duplicate rows
Viewed 0 times
leftwithoutrowsduplicatejoin
Problem
I have two tables called
Here, I am getting the latest one by
record and record_history. For each record, there could be more than one history. They can be joined by id and record_id. I want to get all the record's entries with recent record_history data. I have created the query like,SELECT rec.id, rec.name, rech1.data AS last_history_data
FROM record rec
LEFT OUTER JOIN record_history rech1 ON (rec.id = rech1.record_id)
LEFT OUTER JOIN record_history rech2 ON (rec.id = rech2.record_id AND rech2.ts > rech1.ts)
WHERE rech2.id IS NULL
ORDER BY rec.id DESCHere, I am getting the latest one by
ts. This works as long as there are no duplicate ts entries. If the recent timestamp is repeated in record_history, this query returns more than one row for a record. How can we apply the limit here on the left join to restrict duplicate rows?Solution
Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a
The duplicate results can be avoided in your method by adding a second condition besides the
Regarding how to do this with the original method (2 joins and
By the way, you could replace that second
LATERAL join.The duplicate results can be avoided in your method by adding a second condition besides the
rec.id = rech2.record_id. With the LATERAL join method, the use of LIMIT is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;Regarding how to do this with the original method (2 joins and
IS NULL check), you could change the ON condition - assuming there is an id column in history table so that (id) or at least (ts, id) is unique:LEFT OUTER JOIN record_history rech2
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)By the way, you could replace that second
LEFT join and IS NULL check with a NOT EXISTS subquery with same results and possibly similar efficiency (or even with a NOT IN subquery although that needs extra care for nullable columns, not recommended).Code Snippets
SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;LEFT OUTER JOIN record_history rech2
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)Context
StackExchange Database Administrators Q#214885, answer score: 14
Revisions (0)
No revisions yet.