
Find string that minimizes the sum of the edit distances to all other strings in set

Submitted by: @import:stackexchange-cs

Problem

I have a set of strings $S$, and I am using the edit distance (Levenshtein) to measure the distance between each pair of strings.

Is there an algorithm for finding the string $x$ which minimizes the sum of the distances to all strings in $S$, that is

$\arg\min_x \sum_{s \in S} \text{edit-distance}(x,s)$

It seems like there should be one, but I can't find the right reference.

Solution

This is known as the "median string problem", and it is NP-complete. Searching for that name turns up several results, in particular the paper "2-Approximation Algorithms for Median and Centre String Problems". If $x$ must be one of the strings in $S$, the problem becomes solvable in polynomial time: compute the edit distance between every pair of strings in $S$ and pick the string with the smallest sum.
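The polynomial-time restricted case, where $x$ must be a member of $S$, can be sketched as follows. This is only a minimal illustration, not code from the cited paper: the function names `levenshtein` and `set_median` are made up for this example, and unit costs for insertion, deletion and substitution are assumed.

```python
from typing import Sequence, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete and substitute (row-by-row DP)."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string so the DP row stays short
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def set_median(strings: Sequence[str]) -> Tuple[str, int]:
    """Return the member of `strings` with the smallest summed edit distance
    to all members, together with that sum. Ties go to the earliest member."""
    if not strings:
        raise ValueError("strings must be non-empty")
    best, best_total = strings[0], None
    for candidate in strings:
        total = sum(levenshtein(candidate, s) for s in strings)
        if best_total is None or total < best_total:
            best, best_total = candidate, total
    return best, best_total


if __name__ == "__main__":
    print(set_median(["kitten", "sitten", "sitting", "mitten"]))
```

This does $|S|^2$ distance computations, each taking time proportional to the product of the two string lengths. Because edit distance satisfies the triangle inequality, the best string in $S$ has a total distance at most twice that of the unrestricted optimum, so the sketch also serves as a simple baseline for the general problem.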

Context

StackExchange Computer Science Q#2546, answer score: 4
