patternMinor
String matching algorithm - check if a string matches a pattern
Viewed 0 times
matchespatternalgorithmcheckstringmatching
Problem
This looks like quite the challenge; given a pattern $P$ (of length $n$) and a string $S$ (of length $m$), how would you check whether the string matches the pattern? For instance:
My thoughts so far: This looks like a dynamic programming problem. We can define a boolean
$M(i, j)$ = True iff pattern $P[i:n]$ matches $S[j:m]$
We then need $M(0, 0).$ Note here that the substring notation is $P[i:n] = P_iP_{i+1}...P_n$.
I couldn't think of an efficient recurrence beyond this. It seems like I would have to introduce an extra mapping (from a substring of $P$ to a substring of $S$, indicating that the earlier "matches" the latter; for example in the first example, ("x", "foo") would be part of the mapping. You then replace "x" by "foo" in the substring of P). This would make it $O(2^n)$ or worse though.
If you don't see the intuitive meaning, here's a more rigorous definition: S matches P iff there exists a mapping $$T : P[i:j] \rightarrow S[a:b]$$ for $i, j \in [1, 2 ...n]$ and $a, b \in [1, 2 ...m]$, such that replacing $P[i:j]$ with $S[a:b]$ transforms P into S. For example, in the first case where $P$ = "xyx" and $S$ = "foobarfoo", $T$ is as follows:
Replacing all instances (in $P$) of 'x' with 'foo' and 'y' with 'bar', gives us 'foobarfoo' i.e. $S$.
Thus, $S$ matches $P.$
- If $P$ = "xyx" and $S$ = "foobarfoo" then $S$ matches $P.$
- If $P$ = "acca" and $S$ = "carbuscarbus" then $S$ does not match $P.$
My thoughts so far: This looks like a dynamic programming problem. We can define a boolean
$M(i, j)$ = True iff pattern $P[i:n]$ matches $S[j:m]$
We then need $M(0, 0).$ Note here that the substring notation is $P[i:n] = P_iP_{i+1}...P_n$.
I couldn't think of an efficient recurrence beyond this. It seems like I would have to introduce an extra mapping (from a substring of $P$ to a substring of $S$, indicating that the earlier "matches" the latter; for example in the first example, ("x", "foo") would be part of the mapping. You then replace "x" by "foo" in the substring of P). This would make it $O(2^n)$ or worse though.
If you don't see the intuitive meaning, here's a more rigorous definition: S matches P iff there exists a mapping $$T : P[i:j] \rightarrow S[a:b]$$ for $i, j \in [1, 2 ...n]$ and $a, b \in [1, 2 ...m]$, such that replacing $P[i:j]$ with $S[a:b]$ transforms P into S. For example, in the first case where $P$ = "xyx" and $S$ = "foobarfoo", $T$ is as follows:
- T('x') = 'foo'. {Here $i = 1, j = 1, a = 1, b = 3$}
- T('y') = 'bar'. {Here $i = 2, j = 2, a = 4, b = 6$}
Replacing all instances (in $P$) of 'x' with 'foo' and 'y' with 'bar', gives us 'foobarfoo' i.e. $S$.
Thus, $S$ matches $P.$
Solution
There are various algorithms for pattern matching in string.
Exact String matching algorithms
Approximate String Matching Algorithm
Approximate String matching is applied in text searching, pattern
recognition and signal processing applications. For a
text
find all the occurrences of pattern in the text whose edit
distance to the pattern is at most
between two strings is defined as minimum number of
character insertion, deletion and replacements needed to
make them equal.
Approximate string matching problem is solved with the
help of dynamic programming.
You can have a look through all these algorithms.
These algorithms will take optimum complexity like O(m),O(n+m) (in case of KMP for example).
Exact String matching algorithms
- Brute Force
- Rabin Karp
- Boyer-Moree
- KMP
- Aho Corasick etc.
Approximate String Matching Algorithm
Approximate String matching is applied in text searching, pattern
recognition and signal processing applications. For a
text
T[1..n] and pattern P[1...m], we are supposed tofind all the occurrences of pattern in the text whose edit
distance to the pattern is at most
K. The edit distancebetween two strings is defined as minimum number of
character insertion, deletion and replacements needed to
make them equal.
Approximate string matching problem is solved with the
help of dynamic programming.
You can have a look through all these algorithms.
These algorithms will take optimum complexity like O(m),O(n+m) (in case of KMP for example).
Context
StackExchange Computer Science Q#64072, answer score: 5
Revisions (0)
No revisions yet.