patterncppModerateCanonical
strstr() implementation
Viewed 0 times
implementationstrstrstackoverflow
Problem
In this
strstr implementation, I am basically skipping the already matched and checked characters with an if else condition. Is this a right way of implementing it?char *strstrImp(char *input, char *find) {
if(*find=='\0')
return input;
if(*input=='\0')
return NULL;
char *start=input;
char *p;
while(*start!='\0')
{
p=start;
char *q=find;
while(*p!='\0' && *q!='\0' && *p==*q)
{
p++;
q++;
}
if(*q=='\0')
{
return start;
}
//update the starting position based on whether any character matched
if(p == start) //no match
start++;
else
start = p; //skip characters and start from the previously left place
}
return NULL;
}
int main()
{
char * res;
res = strstrImp("helloworld","low");
printf("%s",res);
}Solution
Const-correctness
Since you tagged this question as c++, you must take care of const-correctness:
There should be a
Incorrect optimization
Try this test case, which compares your implementation against the standard
Logic simplification
Since the
Next, notice that the
Next, let's turn our attention inside the outer loop.
Since
It turns out that the inner-loop condition can be simplified, because the
$$\begin{array}{ccc|c|c}
\text{p == q} & \text{q != '\0'} & \text{p != '\0'} & {\begin{array}{l}\text{p!='\0' &&}\\\text{q!='\0' &&}\\\text{p==q}\end{array}} & \begin{array}{l}\text{q != '\0' &&}\\\text{p == *q}\end{array} \\
\hline \\
\begin{array}{c}T\\T\end{array} & \begin{array}{c}T\\T\end{array} & \begin{array}{c}T\\F\end{array} & \begin{array}{c}T\\F\ (impossible)\end{array} & T \\
\hline \\
\begin{array}{c}T\\T\end{array} & \begin{array}{c}F\\F\end{array} & \begin{array}{c}T\\F\end{array} & \begin{array}{c}F\ (impossible)\\F\end{array} & F \\
\hline \\
\begin{array}{c}F\\F\end{array} & \begin{array}{c}T\\T\end{array} & \begin{array}{c}T\\F\end{array} & \begin{array}{c}F\\F\end{array} & F \\
\hline \\
\begin{array}{c}F\\F\end{array} & \begin{array}{c}F\\F\end{array} & \begin{array}{c}T\\F\end{array} & \begin{array}{c}F\\F\end{array} & F \\
\end{array}$$
The function can now look like:
A well designed function shouldn't have surprising behaviour, and therefore shouldn't have special cases. Can we get rid of
The behaviour of
Suppose we make that change — then what about the case where
Since you tagged this question as c++, you must take care of const-correctness:
cr54045.cpp:36:18: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
res = strstrImp("helloworld","low");
^
cr54045.cpp:36:31: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
res = strstrImp("helloworld","low");
^There should be a
const char *strstrImp(const char *input, const char *find)
{
return strstrImp(const_cast(input), find);
}Incorrect optimization
Try this test case, which compares your implementation against the standard
strstr().int main()
{
const char *input = "alfalfalfa romeo", *find = "alfalfa ";
printf("%s\n", strstrImp(input, find)); // (null)
printf("%s\n", strstr(input, find)); // alfalfa romeo
}Logic simplification
Since the
start = p optimization is bogus, we will only end the loop with start++. The function body would then look like:if (*find == '\0')
return input;
if (*input == '\0')
return NULL;
char *start = input;
while (*start != '\0')
{
…
start++;
}
return NULL;Next, notice that the
if (*input == '\0') return NULL special case is redundant. Also notice that you never refer to input again, so you might as well think of start and input as the same thing. The outline simplifies to:if (*find == '\0')
return input;
while (*input != '\0')
{
…
input++;
}
return NULL;Next, let's turn our attention inside the outer loop.
Since
p is only ever used inside the loop, you should pull its declaration inside the loop. p and q should both be const char *.while (*input != '\0')
{
const char *p = input, *q = find;
while (*p != 0 && *q != '\0' && *p == *q)
{
p++;
q++;
}
if (*q=='\0')
{
return input;
}
input++;
}It turns out that the inner-loop condition can be simplified, because the
*p == '\0' test is redundant:$$\begin{array}{ccc|c|c}
\text{p == q} & \text{q != '\0'} & \text{p != '\0'} & {\begin{array}{l}\text{p!='\0' &&}\\\text{q!='\0' &&}\\\text{p==q}\end{array}} & \begin{array}{l}\text{q != '\0' &&}\\\text{p == *q}\end{array} \\
\hline \\
\begin{array}{c}T\\T\end{array} & \begin{array}{c}T\\T\end{array} & \begin{array}{c}T\\F\end{array} & \begin{array}{c}T\\F\ (impossible)\end{array} & T \\
\hline \\
\begin{array}{c}T\\T\end{array} & \begin{array}{c}F\\F\end{array} & \begin{array}{c}T\\F\end{array} & \begin{array}{c}F\ (impossible)\\F\end{array} & F \\
\hline \\
\begin{array}{c}F\\F\end{array} & \begin{array}{c}T\\T\end{array} & \begin{array}{c}T\\F\end{array} & \begin{array}{c}F\\F\end{array} & F \\
\hline \\
\begin{array}{c}F\\F\end{array} & \begin{array}{c}F\\F\end{array} & \begin{array}{c}T\\F\end{array} & \begin{array}{c}F\\F\end{array} & F \\
\end{array}$$
The function can now look like:
if (*find=='\0')
return input;
while (*input != '\0')
{
const char *p, *q;
for (p = input, q = find; *q != '\0' && *p == *q; p++, q++) {}
if (*q == '\0')
{
return input;
}
input++;
}
return NULL;A well designed function shouldn't have surprising behaviour, and therefore shouldn't have special cases. Can we get rid of
if (*find == '\0') return input;?The behaviour of
strstr() is, if find is "", then the function returns input. In order to reach the return input; statement when find == "" and input == "", we have to get rid of both the if (find == '\0') special case and the while (input != '\0') guard condition.char *strstrImp(char *input, const char *find)
{
do
{
const char *p, *q;
for (p = input, q = find; *q != '\0' && *p == *q; p++, q++) {}
if (*q == '\0')
{
return input;
}
} while (*(input++) != '\0');
return NULL;
}
const char *strstrImp(const char *input, const char *find)
{
return strstrImp(const_cast(input), find);
}Suppose we make that change — then what about the case where
input == "" and find is non-empty? It turns out, that works just fine: it will flow through all of the conditions and return NULL! There you go — a compact implementation of strstr() with reasonably simple logic and without special cases.Code Snippets
cr54045.cpp:36:18: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
res = strstrImp("helloworld","low");
^
cr54045.cpp:36:31: warning: conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
res = strstrImp("helloworld","low");
^const char *strstrImp(const char *input, const char *find)
{
return strstrImp(const_cast<char *>(input), find);
}int main()
{
const char *input = "alfalfalfa romeo", *find = "alfalfa ";
printf("%s\n", strstrImp(input, find)); // (null)
printf("%s\n", strstr(input, find)); // alfalfa romeo
}if (*find == '\0')
return input;
if (*input == '\0')
return NULL;
char *start = input;
while (*start != '\0')
{
…
start++;
}
return NULL;if (*find == '\0')
return input;
while (*input != '\0')
{
…
input++;
}
return NULL;Context
StackExchange Code Review Q#54045, answer score: 19
Revisions (0)
No revisions yet.