HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Edit distance for huge strings with bounds

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
editwithdistancehugeforboundsstrings

Problem

My question is simple but I don't know if the answer is.


If you have two strings of length 10 million each, is there an algorithm that would allow you in
practice to compute their edit distance (Levenshtein) if you are told
it is at most 100?

Solution

You can use the usual dynamic programming algorithm. Instead of computing the edit distance between all prefixes of the input strings, compute only the edit distance between prefixes whose length is at most 100 apart. This reduces the number of entries to compute from $10^{14}$ to roughly $2\cdot 10^9$, which is feasible.

Context

StackExchange Computer Science Q#95942, answer score: 3

Revisions (0)

No revisions yet.