patternjavascriptModerate
Search string for a substring - indexOf
Viewed 0 times
indexofsearchsubstringforstring
Problem
I am trying to write my own implementation of
Here's a test:
indexOf for learning purposes. I have an implementation of as follows. I'm sure a more efficient and shorter implementation exists.function _indexOf(needle, haystack) {
var n = needle.split("");
var h = haystack.split("");
var n_len = n.length;
var h_len = h.length;
for (var i = 0; i = n_len)
return i - n_len;
}
}
return -1;
}Here's a test:
assert(_indexOf("ced","ccceccecedeeced")==7) // trueSolution
Your code is pretty neat, and well structured. I like the variable names. There are a number of problems to point out, though...
You have some bugs.... like:
Even when you resolve the above issues, your algorithm is considered 'naive'. There are two commonly used algorithms that are considered better. The Rabin–Karp algorithm is simpler to implement because you just create a hash of the needle, and then a matching rolling-hash in the haystack. If you find the hash in the haystack you can confirm a match, and return, in what is effectively \$O(n+m )\$. A more complicated, but also more efficient algorithm is the Boyer-Moore algorithm which creates a "jump table" from the needle, which allows you to 'skip' values in the haystack based on the details in the needle. It is efficient, and, the longer the needle, the more you can skip.
- splitting the string is a huge "cheat".... if you're going to use regex, you may as well just match for the substring.
- splitting the strings is also slow. Just use an index in to the string and the
string[indx]operator. You can remove thehandnandh_lenandn_lenvariables as well then.
- your outer loop can terminate at
h_len - n_len- there's no need to loop to the end when a match would be impossible
You have some bugs.... like:
- your needle cannot contain the same letter as the start letter in a another place... For example, your code will fail to find
cec. This is because you do weird logic here:if (h[i] == n[0]) ...I am not sure why that's needed.... other than to reset the search?
- related to the above, you cannot backtrack and reset a search in the event that the search pattern starts part way through a partial match....
Even when you resolve the above issues, your algorithm is considered 'naive'. There are two commonly used algorithms that are considered better. The Rabin–Karp algorithm is simpler to implement because you just create a hash of the needle, and then a matching rolling-hash in the haystack. If you find the hash in the haystack you can confirm a match, and return, in what is effectively \$O(n+m )\$. A more complicated, but also more efficient algorithm is the Boyer-Moore algorithm which creates a "jump table" from the needle, which allows you to 'skip' values in the haystack based on the details in the needle. It is efficient, and, the longer the needle, the more you can skip.
Context
StackExchange Code Review Q#93256, answer score: 10
Revisions (0)
No revisions yet.