patternpythonMinor
Naive implementation of KMP algorithm
Viewed 0 times
implementationkmpnaivealgorithm
Problem
After reading this answer to the question "High execution time to count overlapping substrings", I decided to implement the suggested Knuth-Morris-Pratt (KMP) algorithm. I used the pseudo-code listed on Wikipedia for the functions
However, when running it on some corner-cases, I have observed that it is a lot slower than the standard
The specific case I looked at is:
So the difference is about a factor 1000 for this input. This is probably due to the fact that the native one is written in C and this is written in Python, but I still wanted to see if I did anything stupid here or missed any obvious optimization.
kmp_table and kmp_search.However, when running it on some corner-cases, I have observed that it is a lot slower than the standard
str.find, which apparently uses a modified Boyer-Moore-Horspool algorithm and should thus have worse worst-case performance.The specific case I looked at is:
$ ipython -i kmp.py
In [1]: text = "A"*1000000 + "B"
In [2]: word = "A"*100 + "B"
In [3]: %timeit kmp_search(text, word)
1 loop, best of 3: 410 ms per loop
In [4}: %timeit text.find(word)
1000 loops, best of 3: 703 µs per loopSo the difference is about a factor 1000 for this input. This is probably due to the fact that the native one is written in C and this is written in Python, but I still wanted to see if I did anything stupid here or missed any obvious optimization.
def kmp_table(word):
table = [0] * len(word)
position, candidate = 2, 0
table[0] = -1
while position 0:
candidate = table[candidate]
else:
table[position] = 0
position += 1
return table
def kmp_search(text, word):
m, i = 0, 0
table = kmp_table(word)
while m + i -1:
m += i - table[i]
i = table[i]
else:
m += 1
i = 0
return len(text)Solution
Minor comments, but...
This sort of construct, an else which contains only an if-else chain, can be simply written as an elif-else chain.
These statements seem weird, why not first add one and then set?
else:
if table[i] > -1:
m += i - table[i]
i = table[i]
else:
m += 1
i = 0This sort of construct, an else which contains only an if-else chain, can be simply written as an elif-else chain.
elif table[i] > -1:
m += i - table[i]
i = table[i]
else:
m += 1
i = 0table[position] = candidate + 1
candidate += 1These statements seem weird, why not first add one and then set?
candidate += 1
table[position] = candidateCode Snippets
else:
if table[i] > -1:
m += i - table[i]
i = table[i]
else:
m += 1
i = 0elif table[i] > -1:
m += i - table[i]
i = table[i]
else:
m += 1
i = 0table[position] = candidate + 1
candidate += 1candidate += 1
table[position] = candidateContext
StackExchange Code Review Q#158396, answer score: 3
Revisions (0)
No revisions yet.