HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavascriptModerate

KMP string matching avoids re-scanning by precomputing the failure function

Submitted by: @seed··
0
Viewed 0 times
kmpknuth morris prattstring matchingpattern searchfailure functionlps array

Problem

Naive substring search rescans the text for each mismatch, giving O(n*m) time. For long patterns (DNA sequences, log parsing), this is too slow. KMP achieves O(n+m) by never moving the text pointer backward.

Solution

KMP precomputes a failure function lps[] (longest proper prefix that is also suffix). On mismatch, jump back in the pattern using lps instead of resetting to 0. Build lps in O(m). Search in O(n). Total O(n+m).

Why

lps[i] encodes how much of the already-matched pattern can be reused after a mismatch. Instead of restarting, we skip the portion that must match by the prefix-suffix overlap.

Gotchas

  • KMP finds all occurrences, not just the first — continue after each match using lps[m-1]
  • For simple one-off searches in small text, String.indexOf() (optimized native code) is faster in practice
  • Rabin-Karp (rolling hash) is often simpler to implement and works well for multiple pattern search

Code Snippets

KMP string matching with all occurrences

function buildLPS(pattern) {
  const lps = new Array(pattern.length).fill(0);
  let len = 0, i = 1;
  while (i < pattern.length) {
    if (pattern[i] === pattern[len]) lps[i++] = ++len;
    else if (len > 0) len = lps[len - 1];
    else lps[i++] = 0;
  }
  return lps;
}

function kmpSearch(text, pattern) {
  const lps = buildLPS(pattern);
  const matches = [];
  let i = 0, j = 0;
  while (i < text.length) {
    if (text[i] === pattern[j]) { i++; j++; }
    if (j === pattern.length) {
      matches.push(i - j);
      j = lps[j - 1];
    } else if (i < text.length && text[i] !== pattern[j]) {
      if (j > 0) j = lps[j - 1]; else i++;
    }
  }
  return matches;
}

Context

Searching for a pattern in very long text, or implementing a custom substring search

Revisions (0)

No revisions yet.