HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

Splitting strings in C

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
stringssplittingstackoverflow

Problem

I have created some code to split strings. The code will split the given string by the given delimiter into an array which is inside a struct it returns.

I am using strtok to do the actual tokenizing, but in my code the original string will not be changed.

split.h

#ifndef SPLIT_H
#define SPLIT_H

#ifdef __cplusplus
extern "C" {
#endif

    struct splitString {
        char **tokens;
        unsigned total;
        unsigned size;
    };

    struct splitString *split(const char *, const char *);
    void splitFree(struct splitString **);

#ifdef __cplusplus
}
#endif

#endif /* SPLIT_H */


split.c

```
#include
#include
#include
#include "split.h"

#define DEFAULT_TOKEN_SIZE 10

static struct splitString* initSplitString() {
struct splitString *s = malloc(sizeof (struct splitString));

if (s) {
s->total = 0;
s->size = DEFAULT_TOKEN_SIZE;
s->tokens = malloc(sizeof (char ) DEFAULT_TOKEN_SIZE);

for (int i = 0; i tokens[i] = NULL;
}

return s;
}

return NULL;
}

/**
* Adjust the size of the tokens variable if necessary
* @param s
*/
static void adjustSize(struct splitString *s) {
if (s->total == s->size) {
s->size += DEFAULT_TOKEN_SIZE;

char **temp_tokens = realloc(s->tokens, sizeof (char ) s->size);

if (temp_tokens) {
s->tokens = temp_tokens;
}
}
}

static void clearAndCopy(char destination, const char source, unsigned size) {
memset(destination, 0, size);
memcpy(destination, source, size);
}

struct splitString split(const char string, const char *delimiter) {
unsigned size = strlen(string) + 1;
char tmp[size];
char *token = NULL;
struct splitString *s = initSplitString();

if (s == NULL) {
goto error;
}

clearAndCopy(tmp, string, size);

token = strtok(tmp, delimiter);

while (token != NULL) {
int bsize = strlen(token) + 1;

s->tokens[s->total] = malloc(bs

Solution

Unnecessary code

This function is puzzling to me:

static void clearAndCopy(char *destination, const char *source, unsigned size) {
    memset(destination, 0, size);
    memcpy(destination, source, size);
}


Why do you need to clear the destination buffer when you are going to copy over it immediately afterwards? All calls to this function could simply be replaced by calls to memcpy instead.

Memory leak

In split_free(), you forgot to free (*s)->tokens.

Allocation strategy

There are two things I would do differently with respect to how you allocate your tokens:

-
Instead of using realloc to dynamically resize your token array, I would prescan the input and determine how many tokens there will be, and just allocate once. This avoids a possible \$O(n^2)\$ reallocation time.

-
Instead of allocating a buffer for each token and copying the token to it, I would make a single copy of the input string, and then have s->tokens[i] be pointers to the tokens within the single copy.

Code Snippets

static void clearAndCopy(char *destination, const char *source, unsigned size) {
    memset(destination, 0, size);
    memcpy(destination, source, size);
}

Context

StackExchange Code Review Q#113765, answer score: 5

Revisions (0)

No revisions yet.