HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlCritical

Postgres UPDATE ... LIMIT 1

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
limitupdatepostgres

Problem

I have a Postgres database which contains details on clusters of servers, such as server status ('active', 'standby' etc). Active servers at any time may need to fail over to a standby, and I don't care which standby is used in particular.

I want a database query to change the status of a standby - JUST ONE - and return the server IP that is to be used. The pick can be arbitrary: since the status of the server changes with the query, it doesn't matter which standby is selected.

Is it possible to limit my query to just one update?

Here is what I have so far:

UPDATE server_info SET status = 'active' 
WHERE status = 'standby' [[LIMIT 1???]] 
RETURNING server_ip;


Postgres doesn't like this. What could I do differently?

Solution

Without concurrent write access

Materialize a selection in a CTE (Common Table Expressions) and join to it in the FROM clause of the UPDATE.

WITH cte AS (
   SELECT server_ip          -- pk column or any (set of) unique column(s)
   FROM   server_info
   WHERE  status = 'standby'
   LIMIT  1                  -- arbitrary pick (cheapest)
   )
UPDATE server_info s
SET    status = 'active' 
FROM   cte
WHERE  s.server_ip = cte.server_ip
RETURNING s.server_ip;


I originally had a plain subquery here, but that can sidestep the LIMIT for certain query plans as Feike pointed out:

The planner may choose to generate a plan that executes a nested loop over the LIMITing subquery, causing more UPDATEs than LIMIT, e.g.:

Update on buganalysis [...] rows=5


->  Nested Loop
         ->  Seq Scan on buganalysis
         ->  Subquery Scan on sub [...] loops=11
               ->  Limit [...] rows=2
                     ->  LockRows
                           ->  Sort
                                 ->  Seq Scan on buganalysis


Reproducing test case

The way to fix the above was to wrap the LIMIT subquery in its own CTE, as the CTE is materialized it will not return different results on different iterations of the nested loop.

Or use a lowly correlated subquery for the simple case with LIMIT 1. Simpler, faster:

UPDATE server_info
SET    status = 'active' 
WHERE  server_ip = (
         SELECT server_ip
         FROM   server_info
         WHERE  status = 'standby'
         LIMIT  1
         )
RETURNING server_ip;


With concurrent write access

Assuming default isolation level READ COMMITTED for all of this. Stricter isolation levels (REPEATABLE READ and SERIALIZABLE) may still result in serialization errors. See:

  • SELECT … FOR UPDATE SKIP LOCKED in REPEATABLE READ transactions



Under concurrent write load, add FOR UPDATE SKIP LOCKED to lock the row to avoid race conditions. SKIP LOCKED was added in Postgres 9.5, for older versions see below. The manual:

With SKIP LOCKED, any selected rows that cannot be immediately locked
are skipped. Skipping locked rows provides an inconsistent view of the
data, so this is not suitable for general purpose work, but can be
used to avoid lock contention with multiple consumers accessing a
queue-like table.

UPDATE server_info
SET    status = 'active' 
WHERE  server_ip = (
         SELECT server_ip
         FROM   server_info
         WHERE  status = 'standby'
         LIMIT  1
         FOR    UPDATE SKIP LOCKED
         )
RETURNING server_ip;


If there is no qualifying, unlocked row left, nothing happens in this query (no row is updated) and you get an empty result. For uncritical operations that means you are done.

However, concurrent transactions may have locked rows, but then don't finish the update (ROLLBACK or other reasons). To be sure run a final check:

SELECT NOT EXISTS (
   SELECT FROM server_info
   WHERE  status = 'standby'
   );


SELECT also sees locked rows. Wile that doesn't return true, one or more rows are still unfinished and transactions could still be rolled back. (Or new rows have been added meanwhile.) Wait a bit, then loop the two steps: (UPDATE till you get no row back; SELECT ...) until you get true.

Related:

  • Atomic UPDATE .. SELECT in Postgres



Without SKIP LOCKED in PostgreSQL 9.4 or older

UPDATE server_info
SET    status = 'active' 
WHERE  server_ip = (
         SELECT server_ip
         FROM   server_info
         WHERE  status = 'standby'
         LIMIT  1
         FOR    UPDATE
         )
RETURNING server_ip;


Concurrent transactions trying to lock the same row are blocked until the first one releases its lock.

If the first was rolled back, the next transaction takes the lock and proceeds normally; others in the queue keep waiting.

If the first committed, the WHERE condition is re-evaluated and if it's not TRUE any more (status has changed) the CTE (somewhat surprisingly) returns no row. Nothing happens. That's the desired behavior when all transactions want to update the same row.

But not when each transaction wants to update the next row. And since we just want to update an arbitrary (or random) row, there is no point in waiting at all.

We can unblock the situation with the help of advisory locks:

UPDATE server_info
SET    status = 'active' 
WHERE  server_ip = (
         SELECT server_ip
         FROM   server_info
         WHERE  status = 'standby'
         AND    pg_try_advisory_xact_lock(id)
         LIMIT  1
         FOR    UPDATE
         )
RETURNING server_ip;


This way, the next row not locked yet will be updated. Each transaction gets a fresh row to work with. I had help from Czech Postgres Wiki for this trick.

id being any unique bigint column (or any type with an implicit cast like int4 or int2).

If advisory locks are in use for multiple tables in your database concurrently, disambiguate with `pg_try_advisory_

Code Snippets

WITH cte AS (
   SELECT server_ip          -- pk column or any (set of) unique column(s)
   FROM   server_info
   WHERE  status = 'standby'
   LIMIT  1                  -- arbitrary pick (cheapest)
   )
UPDATE server_info s
SET    status = 'active' 
FROM   cte
WHERE  s.server_ip = cte.server_ip
RETURNING s.server_ip;
Update on buganalysis [...] rows=5
->  Nested Loop
         ->  Seq Scan on buganalysis
         ->  Subquery Scan on sub [...] loops=11
               ->  Limit [...] rows=2
                     ->  LockRows
                           ->  Sort
                                 ->  Seq Scan on buganalysis
UPDATE server_info
SET    status = 'active' 
WHERE  server_ip = (
         SELECT server_ip
         FROM   server_info
         WHERE  status = 'standby'
         LIMIT  1
         )
RETURNING server_ip;
UPDATE server_info
SET    status = 'active' 
WHERE  server_ip = (
         SELECT server_ip
         FROM   server_info
         WHERE  status = 'standby'
         LIMIT  1
         FOR    UPDATE SKIP LOCKED
         )
RETURNING server_ip;

Context

StackExchange Database Administrators Q#69471, answer score: 204

Revisions (0)

No revisions yet.