patternModerate
Find the longest repeated pattern in a string
Viewed 0 times
therepeatedfindlongeststringpattern
Problem
I'm looking for an efficient algorithm to find the longest repeated pattern in a string.
For example, consider the following string of numbers:
As you can see,
The repeated string should not contain any re
any idea rather than brute-force?
For example, consider the following string of numbers:
5431428571428571428571428571427623874534.As you can see,
142857142857 is the longest pattern which is repeated for a couple of times (at least twice) in this string.The repeated string should not contain any re
any idea rather than brute-force?
Solution
The problem is surprisingly non-trivial. First, two brute force algorithms. A square ("repeated pattern") is given by its length $\ell$ and position $p$, and takes time $O(\ell)$ to verify. If we go over all $\ell$ and $p$, we obtain an $O(n^3)$ algorithm. We can improve on that by first looping over $\ell$, and then scanning the string with two running pointers at a distance of $\ell$. In this way, one can verify whether a square of length $2\ell$ exists in linear time, giving a total running time of $O(n^2)$.
Kolpakov and Kucherov developed an algorithm for finding all maximal repeats in a word in time $O(n)$ [1], and their algorithm can be used to find all maximal squares in time $O(n)$. A repeat is a subword of the form $w^kx$, where $k \geq 2$ and $x$ is a proper prefix of $w$. The largest square contained in that repeat is $(w^{\lfloor k/2 \rfloor})^2$. Using this formula, given all maximal repeats in a word (of which there are only $O(n)$ many), one can find the largest square.
[1] Kolpakov, R., & Kucherov, G. (1999). Finding maximal repetitions in a word in linear time. In Foundations of Computer Science, 1999. 40th Annual Symposium on (pp. 596-604). IEEE.
Kolpakov and Kucherov developed an algorithm for finding all maximal repeats in a word in time $O(n)$ [1], and their algorithm can be used to find all maximal squares in time $O(n)$. A repeat is a subword of the form $w^kx$, where $k \geq 2$ and $x$ is a proper prefix of $w$. The largest square contained in that repeat is $(w^{\lfloor k/2 \rfloor})^2$. Using this formula, given all maximal repeats in a word (of which there are only $O(n)$ many), one can find the largest square.
[1] Kolpakov, R., & Kucherov, G. (1999). Finding maximal repetitions in a word in linear time. In Foundations of Computer Science, 1999. 40th Annual Symposium on (pp. 596-604). IEEE.
Context
StackExchange Computer Science Q#6776, answer score: 16
Revisions (0)
No revisions yet.