HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

Recreating strstr in C

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
recreatingstrstrstackoverflow

Problem

I'm trying to recreate strstr in c, my code works but I think it is too long, and it needs some improvement.

#include 

static int      find_str_indx(char *str, int index, char *str_to_find)
{
    int i;
    int temp;
    int found_index;

    while (str[index] != '\0')
    {
        if (str[index] == *str_to_find)
        {
            temp = index;
            i = 0;
            found_index = index;
            while (str_to_find[i] != '\0' && str[index] != '\0')
            {
                if(str_to_find[i] != str[index])
                    found_index = -1;
                i++;
                index++;
            }
            if (found_index == -1)
                index = temp;
        }
        index++;
    }
    return (found_index);
}

char    *my_strstr(char *str, char *to_find)
{
    char    *found_str;
    int     found_index;
    int     i;

    found_str = (char*)malloc(sizeof(found_str));
    found_index = find_str_indx(str, 0, to_find);
    if (found_index != -1)
    {
        i = 0;
        while (str[found_index] != '\0')
        {
            found_str[i] = str[found_index];
            found_index++;
            i++;
        }
    }
    else
        found_str = NULL;
    return (found_str);
}


UPDATE

I have rewritten the code to get rid of allocating memory inside strstr, as advised by ratchet freak and Jerry Coffin.

char    *my_strstr(const char *haystack, const char *needle)
{
    size_t  needle_len;

    needle_len = strlen(needle);
    while (*haystack)
    {
        if (*haystack == *needle)
        {
            if (!strncmp(haystack, needle, needle_len))
                    return ((char *)haystack);
        }
        haystack++;
    }
    return (NULL);
}

Solution

This strikes me as somewhat...clumsy code that makes the job more difficult than truly necessary.

I'd break the task up into pieces, something like:

for every candidate position in the haystack
    if the needle matches at this point
        return success at this point
return failure


So that leaves us with two subordinate tasks:

  • Figure out the candidate positions



  • Check for a match starting at a specified point



The first is pretty simple: the candidate positions are the beginning of the haystack through the length of the haystack minus the length of the needle (and if the needle is larger than the haystack, it can't possibly match).

The second is even simpler: step through both strings, and if they don't match at a given point, return indicating failure. If (and only if) you reach the end of the second without a mismatch, return indicating success.

So let's write some code for that:

size_t length(char const *s) {
    size_t i;
    for (i = 0; s[i]; i++)
        ;
    return i;
}

bool match(char const *a, char const *b) {
    while (*b)
        if (*a++ != *b++)
            return false;
    return true;
}


Then we can do the function itself:

char const *find_substr(char const *haystack, char const *needle) {
    size_t len1 = length(haystack);
    size_t len2 = length(needle);

    for (size_t i = 0; i+len2<=len1; i++)
        if (match(haystack + i, needle))
            return haystack + i;
    return NULL;
}


Then we probably want a bit of test code to verify that it works to at least some degree under some circumstances:

char const *check(char const *s) {
    if (s)
        return s;
    return "[NULL POINTER]";
}

#define elements(arr) (sizeof(arr)/sizeof(arr[0]))

int main() {
    char input1[] = "This is some input for the function";

    // tests: match middle, non-match, match begin, match end, empty string
    char const *tests[] = { "for", "was", "This", "ion", "" };

    for (size_t = 0; i < elements(tests); i++)
        printf("%s\n", check(find_substr(input1, tests[i])));

    printf("%s\n", check(find_substr("the", "this is a longer string")));
}


There are certainly other ways the job could be done, but that seems to me like it's at least a reasonable way to implement what's probably the simplest (and most naive) one.

Code Snippets

for every candidate position in the haystack
    if the needle matches at this point
        return success at this point
return failure
size_t length(char const *s) {
    size_t i;
    for (i = 0; s[i]; i++)
        ;
    return i;
}

bool match(char const *a, char const *b) {
    while (*b)
        if (*a++ != *b++)
            return false;
    return true;
}
char const *find_substr(char const *haystack, char const *needle) {
    size_t len1 = length(haystack);
    size_t len2 = length(needle);

    for (size_t i = 0; i+len2<=len1; i++)
        if (match(haystack + i, needle))
            return haystack + i;
    return NULL;
}
char const *check(char const *s) {
    if (s)
        return s;
    return "[NULL POINTER]";
}

#define elements(arr) (sizeof(arr)/sizeof(arr[0]))

int main() {
    char input1[] = "This is some input for the function";

    // tests: match middle, non-match, match begin, match end, empty string
    char const *tests[] = { "for", "was", "This", "ion", "" };

    for (size_t = 0; i < elements(tests); i++)
        printf("%s\n", check(find_substr(input1, tests[i])));

    printf("%s\n", check(find_substr("the", "this is a longer string")));
}

Context

StackExchange Code Review Q#128069, answer score: 7

Revisions (0)

No revisions yet.