patternsqlMinor
How many rows will be locked by SELECT … ORDER BY xxx LIMIT 1 FOR UPDATE?
Viewed 0 times
rowsorderupdatehowlimitlockedxxxwillformany
Problem
As was asked on stackoverflow but then for MySQL, I was wondering how this works in PostgreSQL. How many rows get locked when I do a "FOR UPDATE" in combination with a "LIMIT", a "ORDER BY" and some "WHERE"'s. And they update the returned row. I was hoping the "FOR UPDATE"-query would only lock ONE row but maybe I miss some implementations issue. (I'm planning to write some stress-test to reproduce this more reliably)
These queries cause a deadlock in my system:
But I do not understand why. (There is a unique index on transaction_path.)
These queries cause a deadlock in my system:
(session1) select field1, field2, ... from documents where transaction_path='some/path' for update
(session2) select field1, field2, ... from documents where (transaction_path is null) and queue_id=2 and next_pickup_ts<now() order by next_pickup_ts limit 1 for update
(session3) select field1, field2, ... from documents where (transaction_path is null) and queue_id=2 and next_pickup_ts<now() order by next_pickup_ts limit 1 for updateBut I do not understand why. (There is a unique index on transaction_path.)
Solution
If you use
Here, you can see the rows are potentially locked before the limit is applied.
Now, in practice, the first row that gets matched will generally get locked and returned, causing
In general, mixing row-locking and
If you must do it, you'll want to do something like:
where you force a target row to be found, and only then locked.
You should repeat the WHERE clause in the outer query to ensure that the row still matches the predicate after any lock-wait occurs. Otherwise what can happen is that the inner query finds the row, returns the ID, and you try to lock the row with that ID. PostgreSQL sees that someone else has a lock on that row, so it waits for that lock to be released. Then it re-checks the
Mixing row limits and row locking is hard. Building correct queuing systems that actually perform better than a single worker is even harder. Use an off the shelf queuing system and save yourself a lot of hassle.
(PostgreSQL 9.5 will have
EXPLAIN you'll see the ordering. For a random plan I concocted with some throwaway data and similar query (ORDER BY ... LIMIT 1 FOR UPDATE) I got the plan:Limit (cost=33.09..33.10 rows=1 width=26)
-> LockRows (cost=33.09..39.88 rows=543 width=26)
-> Sort (cost=33.09..34.45 rows=543 width=26)
Sort Key: t_id
-> Seq Scan on m (cost=0.00..30.38 rows=543 width=26)
Filter: (b_id <= 33)
(6 rows)Here, you can see the rows are potentially locked before the limit is applied.
Now, in practice, the first row that gets matched will generally get locked and returned, causing
LockRows on the later rows to get skipped ... but there's no guarantee of this. So you shouldn't rely on it.In general, mixing row-locking and
LIMIT is a bad idea that should generally be avoided.If you must do it, you'll want to do something like:
SELECT *
FROM my_table
WHERE id = (SELECT id FROM my_table
WHERE mywhereclause
ORDER BY ... LIMIT 1)
AND mywhereclause
FOR UPDATE;where you force a target row to be found, and only then locked.
You should repeat the WHERE clause in the outer query to ensure that the row still matches the predicate after any lock-wait occurs. Otherwise what can happen is that the inner query finds the row, returns the ID, and you try to lock the row with that ID. PostgreSQL sees that someone else has a lock on that row, so it waits for that lock to be released. Then it re-checks the
WHERE clause on the outer query to make sure it still matches (in case the row was deleted, or was UPDATEd). Because the inner query is an uncorrelated subquery it won't re-calculate it, though, so the inner WHERE clause doesn't get re-run, and you can get rows that no longer match the inner WHERE clause returned if you don't repeat it in the outer WHERE clause.Mixing row limits and row locking is hard. Building correct queuing systems that actually perform better than a single worker is even harder. Use an off the shelf queuing system and save yourself a lot of hassle.
(PostgreSQL 9.5 will have
SKIP LOCKED, which makes this way easier, but there won't be a stable release of it for over a year from now, so don't hold your breath.)Code Snippets
Limit (cost=33.09..33.10 rows=1 width=26)
-> LockRows (cost=33.09..39.88 rows=543 width=26)
-> Sort (cost=33.09..34.45 rows=543 width=26)
Sort Key: t_id
-> Seq Scan on m (cost=0.00..30.38 rows=543 width=26)
Filter: (b_id <= 33)
(6 rows)SELECT *
FROM my_table
WHERE id = (SELECT id FROM my_table
WHERE mywhereclause
ORDER BY ... LIMIT 1)
AND mywhereclause
FOR UPDATE;Context
StackExchange Database Administrators Q#83171, answer score: 9
Revisions (0)
No revisions yet.