patterncppMinor
Edit Distance Between Two Strings
Viewed 0 times
editdistancetwobetweenstrings
Problem
#include
#include
#include
using namespace std;
inline size_t min(size_t x, size_t y, size_t z)
{
if (x > M(NA + 1, vector(NB + 1));
for (size_t a = 0; a <= NA; ++a)
M[a][0] = a;
for (size_t b = 0; b <= NB; ++b)
M[0][b] = b;
for (size_t a = 1; a <= NA; ++a)
for (size_t b = 1; b <= NB; ++b)
{
size_t x = M[a-1][b] + 1;
size_t y = M[a][b-1] + 1;
size_t z = M[a-1][b-1] + (A[a-1] == B[b-1] ? 0 : 2);
M[a][b] = min(x,y,z);
}
return M[A.size()][B.size()];
}
int main()
{
assert(edit_distance("ISLANDER", "SLANDER") == 1);
assert(edit_distance("MART", "KARMA") == 5);
assert(edit_distance("KITTEN", "SITTING") == 5);
assert(edit_distance("INTENTION", "EXECUTION") == 8);
}Solution
-
Implementation:
-
It’s generally discouraged to open namespaces (
-
Three-ways min can be written entirely in terms of
-
“Bug”: the edit distance generally defines the cost for a substitution as 1 (since Levenshtein defined three equivalent edit operations), not 2 which you used in your code;
-
Algorithmic: Your code needs O(n * m) space. There’s a not too hard to implement variant which requires just O(min{n, m}) due to the fact that for the computation we only need to save the previous column (or row). In fact, with a bit of trickery you can make do with a single vector and two additional variables to save all the active values.
But apart from that it’s a very clean and efficient implementation.
Implementation:
-
It’s generally discouraged to open namespaces (
using namespace …;). Instead, import single symbols (using std::vector;) or qualify symbols explicitly.-
Three-ways min can be written entirely in terms of
std::min:template
T min(T const& a, T const& b, T const& c) {
return std::min(std::min(a, b), c);
}-
“Bug”: the edit distance generally defines the cost for a substitution as 1 (since Levenshtein defined three equivalent edit operations), not 2 which you used in your code;
-
Algorithmic: Your code needs O(n * m) space. There’s a not too hard to implement variant which requires just O(min{n, m}) due to the fact that for the computation we only need to save the previous column (or row). In fact, with a bit of trickery you can make do with a single vector and two additional variables to save all the active values.
But apart from that it’s a very clean and efficient implementation.
Code Snippets
template <typename T>
T min(T const& a, T const& b, T const& c) {
return std::min(std::min(a, b), c);
}Context
StackExchange Code Review Q#10130, answer score: 9
Revisions (0)
No revisions yet.