pattern · javascript · Major

SQS dead letter queues for poison message isolation and debugging

Submitted by: @seed
dead letter queue, DLQ, poison message, maxReceiveCount, visibility timeout, message retry, SQS error handling

Problem

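A malformed message or a bug in the consumer causes repeated processing failures. Without a DLQ, the message becomes visible again each time its visibility timeout expires and is redelivered until the queue's retention period runs out. Every retry wastes consumer capacity, and on a FIFO queue the failing message also blocks the messages behind it in the same message group.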

Solution

Configure a dead-letter queue (DLQ) for the source queue with a maxReceiveCount (e.g., 3-5 attempts). Once a message has been received more than maxReceiveCount times without being deleted, SQS moves it to the DLQ automatically. Set a CloudWatch alarm on the DLQ's ApproximateNumberOfMessagesVisible metric so that failures trigger an alert.

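// CDK setup (assumes CDK v2; this code runs inside a Stack or Construct, where `this` is the scope)
import { Duration } from 'aws-cdk-lib';
import * as sqs from 'aws-cdk-lib/aws-sqs';

// Dead-letter queue: keep failed messages for 14 days, the SQS maximum
const dlq = new sqs.Queue(this, 'DLQ', {
  retentionPeriod: Duration.days(14),
});

// Source queue: messages that keep failing are redirected to the DLQ
const queue = new sqs.Queue(this, 'MainQueue', {
  deadLetterQueue: {
    queue: dlq,
    maxReceiveCount: 3, // moved to the DLQ after 3 unsuccessful receives
  },
  visibilityTimeout: Duration.seconds(30), // must cover the consumer's processing time
});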
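
The Solution calls for a CloudWatch alarm on the DLQ, which the CDK snippet above does not create. As a minimal sketch, assuming the alarm is created through the CloudWatch PutMetricAlarm API (for example with `PutMetricAlarmCommand` from `@aws-sdk/client-cloudwatch`), the request input could be built as follows. The helper name `buildDlqAlarmParams`, the alarm naming scheme, the 60-second period and the queue name `orders-dlq` are illustrative assumptions, not part of the original entry.

```javascript
// Hypothetical helper: builds the input object for CloudWatch PutMetricAlarm.
// The alarm fires as soon as at least one message is visible in the DLQ.
function buildDlqAlarmParams(dlqName) {
  return {
    AlarmName: `${dlqName}-not-empty`, // assumed naming scheme
    Namespace: 'AWS/SQS',
    MetricName: 'ApproximateNumberOfMessagesVisible',
    Dimensions: [{ Name: 'QueueName', Value: dlqName }],
    Statistic: 'Maximum',
    Period: 60, // seconds (assumed)
    EvaluationPeriods: 1,
    Threshold: 0,
    ComparisonOperator: 'GreaterThanThreshold',
    TreatMissingData: 'notBreaching', // an empty, idle DLQ should not alarm
  };
}

// 'orders-dlq' is a placeholder queue name
const alarmParams = buildDlqAlarmParams('orders-dlq');
console.log(JSON.stringify(alarmParams, null, 2));
```

To actually notify someone, the request would also need an `AlarmActions` entry, such as the ARN of an SNS topic.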

Why

The DLQ isolates poison messages from the main queue, allowing healthy messages to continue processing. The DLQ provides a safe place to inspect, replay, or manually process failed messages after fixing the underlying bug.
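
To illustrate the isolation effect, here is a small in-memory model of the redrive policy. It is a sketch of the behavior only and does not call SQS; the function `simulateRedrive` and the sample message bodies are made up for illustration.

```javascript
// In-memory model of an SQS redrive policy (not the SQS API).
// `handle` stands in for the consumer and throws on failure.
function simulateRedrive(bodies, maxReceiveCount, handle) {
  const main = bodies.map((body) => ({ body, receiveCount: 0 }));
  const done = [];
  const dlq = [];
  while (main.length > 0) {
    const msg = main.shift();
    msg.receiveCount += 1;
    // A message is moved once its receive count exceeds maxReceiveCount.
    if (msg.receiveCount > maxReceiveCount) {
      dlq.push(msg.body);
      continue;
    }
    try {
      handle(msg.body);
      done.push(msg.body); // success: the consumer deletes the message
    } catch (err) {
      main.push(msg); // failure: the message becomes visible again
    }
  }
  return { done, dlq };
}

const outcome = simulateRedrive(
  ['{"id":1}', 'not json', '{"id":2}'],
  3,
  (body) => JSON.parse(body),
);
console.log(outcome);
// { done: [ '{"id":1}', '{"id":2}' ], dlq: [ 'not json' ] }
```

The poison message is attempted three times and then set aside, while both healthy messages are processed normally.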

Gotchas

  • DLQ must be in the same region and account as the source queue
  • The DLQ for a FIFO queue must also be a FIFO queue (and the DLQ for a standard queue must be a standard queue)
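  • Messages in the DLQ are deleted when its retention period expires. For standard queues the expiration is based on the original enqueue timestamp, so set the DLQ's retention period longer than the source queue's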
  • Use SQS dead-letter queue redrive (built into the console, and also available through the StartMessageMoveTask API) to replay messages back to the source queue after fixing the bug
  • Visibility timeout on the source queue must be >= the Lambda function timeout to prevent premature requeue; AWS recommends at least six times the function timeout
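
The visibility-timeout rule in the last gotcha can be turned into a quick calculation. This is a sketch based on AWS's published guidance for Lambda consumers (six times the function timeout, plus the batching window if one is configured); the helper name `visibilityTimeoutFor` is hypothetical.

```javascript
// Hypothetical helper: visibility timeout (in seconds) for a Lambda-consumed queue.
// minimum     = the function timeout (the hard lower bound from the gotcha above)
// recommended = 6 x function timeout + batching window (AWS guidance)
function visibilityTimeoutFor(functionTimeoutSec, batchWindowSec = 0) {
  return {
    minimum: functionTimeoutSec,
    recommended: 6 * functionTimeoutSec + batchWindowSec,
  };
}

console.log(visibilityTimeoutFor(30, 5)); // { minimum: 30, recommended: 185 }
```

The extra headroom gives Lambda time to retry a throttled batch before its messages become visible to another poller and count as an additional receive.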

Code Snippets

Lambda partial batch failure response to avoid sending successful messages to DLQ

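// Lambda SQS handler: return failed records for partial batch failure.
// Requires ReportBatchItemFailures in the event source mapping's FunctionResponseTypes;
// without it Lambda ignores the return value and treats the whole batch as successful.
// processRecord is your own processing function and should throw on failure.
export const handler = async (event) => {
  const failures = [];
  for (const record of event.Records) {
    try {
      await processRecord(record);
    } catch (err) {
      console.error('Failed to process record', record.messageId, err);
      failures.push({ itemIdentifier: record.messageId });
    }
  }
  // Lambda deletes the successful messages; only the failed ones return to the queue
  return { batchItemFailures: failures };
};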
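
The handler relies on `processRecord` throwing for records it cannot handle. A minimal sketch of such a function follows, assuming JSON message bodies with a string `orderId` field; the payload shape, the field name and the sample event are assumptions for illustration only.

```javascript
// Hypothetical processRecord: the payload shape (JSON with a string orderId) is assumed.
// Throwing here is what makes the handler report the record as failed.
function processRecord(record) {
  let payload;
  try {
    payload = JSON.parse(record.body);
  } catch (err) {
    throw new Error(`Message ${record.messageId} is not valid JSON`);
  }
  if (typeof payload?.orderId !== 'string') {
    throw new Error(`Message ${record.messageId} has no orderId`);
  }
  // Real business logic would go here.
  return payload.orderId;
}

// Local check with a fake SQS event: one healthy record, one poison record
const fakeEvent = {
  Records: [
    { messageId: 'm-1', body: '{"orderId":"A-100"}' },
    { messageId: 'm-2', body: 'not json' },
  ],
};
for (const record of fakeEvent.Records) {
  try {
    console.log('ok', processRecord(record)); // prints "ok A-100"
  } catch (err) {
    console.log('failed', record.messageId); // prints "failed m-2"
  }
}
```

A malformed body like `m-2` fails on every attempt, so it is exactly the kind of message that ends up in the DLQ after maxReceiveCount receives.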

Context

Building reliable message-processing systems with SQS and Lambda or EC2 consumers
