gotcha, javascript, redis, Critical

Distributed locking with Redlock: correct implementation and failure modes

Submitted by: @seed
redis, redlock, distributed lock, mutex, quorum, NX, PX, GC pause, clock drift

Error Messages

LockError: Exceeded 3 attempts to lock the resource

Problem

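A single-node Redis lock (SET key value NX PX ttl) is a single point of failure. If the node restarts without having persisted the key, or fails over to a replica that never received it, the lock disappears while the original holder is still working. A network partition that outlasts the TTL has a similar effect: the client believes it holds a lock it no longer owns. In each case two clients can execute the critical section concurrently.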
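
The restart case can be illustrated with a small simulation. This is only a sketch: FakeRedisNode is a hypothetical in-memory stand-in, not a real Redis client, and the timestamps are passed in by hand so the sequence is deterministic.

```javascript
// Hypothetical in-memory stand-in for ONE Redis node (not a real client).
class FakeRedisNode {
  constructor() {
    this.store = new Map(); // key -> { value, expiresAt }
  }

  // Mimics SET key value NX PX ttl: succeeds only if the key is absent or expired.
  setNxPx(key, value, ttlMs, nowMs) {
    const entry = this.store.get(key);
    if (entry && entry.expiresAt > nowMs) return false;
    this.store.set(key, { value, expiresAt: nowMs + ttlMs });
    return true;
  }

  // Mimics a restart without persistence: every key is lost.
  restart() {
    this.store.clear();
  }
}

const node = new FakeRedisNode();
const key = 'lock:payment:order-99';

// t = 0 ms: client A takes the lock for 5 s.
const clientAHolds = node.setNxPx(key, 'client-a', 5000, 0);
// t = 100 ms: client B is correctly refused.
const clientBBlocked = !node.setNxPx(key, 'client-b', 5000, 100);

// The node restarts and forgets A's key while A is still working.
node.restart();

// t = 200 ms: client B now succeeds, so A and B both believe they own the lock.
const clientBHolds = node.setNxPx(key, 'client-b', 5000, 200);
```

All three flags end up true, which is exactly the dual-ownership situation described above.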

Solution

Use the Redlock algorithm across 2N+1 independent Redis nodes (separate instances, not replicas of each other). A client acquires the lock only when a majority (N+1) of nodes grant it and the time spent acquiring still leaves a positive validity window. The redlock npm package implements this algorithm.
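
The majority rule can be sketched as follows. This is a simplified illustration, not the redlock package's code: the node objects with tryLock and unlock methods are hypothetical, the calls are synchronous for brevity, and the validity-time check is left out.

```javascript
// Majority needed for a cluster of `nodeCount` nodes: 2N+1 nodes -> N+1 grants.
function quorum(nodeCount) {
  return Math.floor(nodeCount / 2) + 1;
}

// `nodes` is an array of hypothetical objects exposing
//   tryLock(key, token, ttlMs) -> boolean   and   unlock(key, token).
// A real client would make these calls asynchronously with short timeouts.
function acquireWithQuorum(nodes, key, token, ttlMs) {
  let granted = 0;
  for (const node of nodes) {
    try {
      if (node.tryLock(key, token, ttlMs)) granted += 1;
    } catch {
      // An unreachable node simply counts as a refusal.
    }
  }

  if (granted >= quorum(nodes.length)) return true;

  // No majority: release whatever was granted so other clients are not blocked.
  for (const node of nodes) {
    try {
      node.unlock(key, token);
    } catch {
      // Ignore nodes that cannot be reached; their keys expire via the TTL.
    }
  }
  return false;
}
```

With three nodes the quorum is two, so one node can be down and the lock is still obtainable; with two nodes down, acquisition fails and the partial grant is rolled back.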

Why

Because a lock requires a majority of nodes, the failure or isolation of a minority of nodes cannot by itself hand the same lock to two clients. The lock's validity time is the TTL minus the time spent acquiring it (and minus a clock-drift allowance), which prevents a slow-acquire client from relying on a nearly expired lock.
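
The validity calculation can be written out as a short sketch. The function names are made up for illustration; the drift allowance follows the formula given under Gotchas (drift factor times TTL, plus 2 ms).

```javascript
// Drift allowance in ms: driftFactor * TTL plus a fixed 2 ms.
function driftAllowance(ttlMs, driftFactor = 0.01) {
  return Math.round(driftFactor * ttlMs) + 2;
}

// Time in ms for which the lock can still be trusted once acquisition,
// which took `elapsedMs`, has finished. A result <= 0 means the lock
// must be treated as NOT acquired and released on all nodes.
function remainingValidity(ttlMs, elapsedMs, driftFactor = 0.01) {
  return ttlMs - elapsedMs - driftAllowance(ttlMs, driftFactor);
}

const fastAcquire = remainingValidity(5000, 40);   // 5000 - 40 - 52 = 4908
const slowAcquire = remainingValidity(5000, 4990); // 5000 - 4990 - 52 = -42
```

A client that needed 40 ms to reach quorum has about 4.9 s of safe working time, while one that needed 4990 ms has none and should give the lock up.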

Gotchas

  • Redlock does NOT protect against GC pauses (or any other process stall) longer than the lock TTL: the client may think it still holds the lock after another client has acquired it
  • Clock drift across nodes must be accounted for: the validity time is reduced by a drift allowance of driftFactor * TTL + 2 ms, where driftFactor is typically 0.01
  • Never retry lock acquisition in a tight loop; use exponential backoff with jitter
  • Redlock is controversial for safety-critical systems; consider etcd or ZooKeeper for strong guarantees
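
The backoff advice can be sketched as below. This is an assumption-laden illustration: backoffDelay and retryWithBackoff are hypothetical helpers written for this entry, not part of the redlock package, and tryAcquire stands for any caller-supplied function that resolves to a lock or to null.

```javascript
// Delay before retry number `attempt` (0-based): exponential growth capped at
// capMs, with "full jitter" (uniform random value in [0, capped delay)).
// `random` is injectable so the function can be tested deterministically.
function backoffDelay(attempt, baseMs = 100, capMs = 5000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Calls tryAcquire() up to maxAttempts times, sleeping between failed attempts.
// Resolves to the lock, or to null if every attempt failed.
async function retryWithBackoff(tryAcquire, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const lock = await tryAcquire();
    if (lock) return lock;
    if (attempt < maxAttempts - 1) {
      const delayMs = backoffDelay(attempt);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return null;
}
```

The jitter spreads out clients that failed at the same moment, so they do not all hit the Redis nodes again in lockstep.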

Code Snippets

Redlock usage with proper lock release

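import Redis from 'ioredis'; // assumed client library
import Redlock from 'redlock';

// Assumed setup: three independent Redis nodes. Hostnames are placeholders.
const redis1 = new Redis({ host: 'redis-a.example.internal' });
const redis2 = new Redis({ host: 'redis-b.example.internal' });
const redis3 = new Redis({ host: 'redis-c.example.internal' });

const redlock = new Redlock([redis1, redis2, redis3], {
  driftFactor: 0.01, // expected clock drift, as a fraction of the lock TTL
  retryCount: 3,     // max number of attempts before giving up
  retryDelay: 200,   // ms between attempts
  retryJitter: 100,  // max ms of random jitter added to each retry
});

async function criticalSection() {
  // acquire()/release() match the redlock v5 API; v4 exposes lock()/unlock().
  // If acquire() rejects, no lock is held and there is nothing to release.
  const lock = await redlock.acquire(['lock:payment:order-99'], 5000);
  try {
    // doWork() is a placeholder; it must finish within the 5000 ms TTL.
    await doWork();
  } finally {
    // Always release, even if doWork() throws. release() may itself reject
    // if the lock has already expired.
    await lock.release();
  }
}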
