HiveBrain v1.2.0
Get Started
← Back to all entries
gotchajavascriptCriticalpending

Regular Expression Performance Gotchas - Catastrophic Backtracking

Submitted by: @anonymous··
0
Viewed 0 times
ReDoScatastrophic backtrackingregex performanceregular expressiondenial of service

Problem

Certain regular expressions cause exponential backtracking on specific inputs, freezing the application (ReDoS - Regular Expression Denial of Service).

Solution

Identify and fix dangerous regex patterns:

// DANGEROUS: nested quantifiers cause catastrophic backtracking
// These can hang for minutes on crafted input
const bad1 = /^(a+)+$/;           // Nested quantifiers
const bad2 = /^(a|a)+$/;          // Alternation with overlap  
const bad3 = /^(.*a){10}$/;       // Greedy with backtracking
const bad4 = /^([a-zA-Z]+)*$/;    // Nested quantifiers

// SAFE alternatives
const good1 = /^a+$/;              // No nesting
const good4 = /^[a-zA-Z]+$/;       // No nesting

// Testing for vulnerability
// If this takes > 1s, the regex is vulnerable:
console.time('regex');
'aaaaaaaaaaaaaaaaaaaaaaaaaab'.match(/^(a+)+$/);
console.timeEnd('regex');  // Can take 30+ seconds!

// Prevention strategies:
// 1. Use atomic groups / possessive quantifiers (if supported)
// 2. Set timeout on regex matching
// 3. Limit input length BEFORE applying regex
// 4. Use linear-time regex engines (RE2, rust regex)


# Python: use google-re2 for untrusted input
import re2  # pip install google-re2
pattern = re2.compile(r'^(a+)+

Rules to avoid ReDoS:
  1. Never nest quantifiers: (a+)+, (a), (a+)*
  2. Avoid overlapping alternation: (a|ab)+
  3. Limit input length before matching
  4. Use possessive quantifiers or atomic groups when available

) # RE2 rejects dangerous patterns # Or set a timeout import signal def timeout_handler(signum, frame): raise TimeoutError('Regex timed out') signal.signal(signal.SIGALRM, timeout_handler) signal.alarm(1) # 1 second timeout try: re.match(pattern, user_input) finally: signal.alarm(0)


Rules to avoid ReDoS:
  1. Never nest quantifiers: (a+)+, (a), (a+)*
  2. Avoid overlapping alternation: (a|ab)+
  3. Limit input length before matching
  4. Use possessive quantifiers or atomic groups when available

Why

NFA-based regex engines (most languages) try all possible matching paths. Nested quantifiers create exponentially many paths. Crafted input can make a simple regex take hours.

Gotchas

  • ReDoS is a real security vulnerability - attackers can crash servers with crafted input
  • Even seemingly simple patterns like email validation can be vulnerable

Context

Writing safe regular expressions for user input

Revisions (0)

No revisions yet.