HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

StringBuilder in C v2: Not a charged dance

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
stringbuilderchargednotdance

Problem

A follow-up on this. To quote by copy/paste:


Lots of languages have something that lets you build up a dynamically-sized string with minimal overhead. C doesn't, and I found myself using code that did that manually in a couple of places, so I packaged it into a class. Note that I've only implemented functionality I'm using.

stringbuilder.h

It's the same as last time.

#ifndef CONCATEN_STRINGBUILDER_H
#define CONCATEN_STRINGBUILDER_H
#include 
#include 

struct StringBuilder;
typedef struct StringBuilder *StringBuilder;
StringBuilder sb_new(size_t);
bool sb_append(StringBuilder, char);
char *sb_as_string(StringBuilder);
char *sb_free_copy(StringBuilder);
size_t sb_size(StringBuilder);
void sb_free(StringBuilder);

#endif //CONCATEN_STRINGBUILDER_H


stringbuilder.c

```
#include
#include
#include "stringbuilder.h"

struct StringBuilder {
char *mem;
size_t count;
size_t cap;
};
StringBuilder sb_new(size_t init_cap) {
if (init_cap == 0) {
return NULL;
}
StringBuilder ret = malloc(sizeof(struct StringBuilder));
if (!ret) return NULL;
ret->mem = calloc(init_cap, sizeof(char));
if (!ret->mem) {
free(ret);
return NULL;
}
ret->cap = init_cap;
ret->count = 0;
return ret;
}
#define LOAD_FACTOR 2
bool sb_append(StringBuilder to, char c) {
if (to->count + 1 == to->cap) {
char new_mem = realloc(to->mem, to->cap LOAD_FACTOR);
if (!new_mem) {
return false;
}
memset(new_mem + to->cap, 0, to->cap);
to->mem = new_mem;
to->cap *= LOAD_FACTOR;
}
++to->count;
to->mem[to->count - 1] = c;
return true;
}
char *sb_as_string(StringBuilder sb) {
return sb->mem;
}
char *sb_free_copy(StringBuilder sb) {
char ret = malloc((sb->count + 1) sizeof(char));
if (!ret) {
return NULL;
}
strcpy(ret, sb->mem);
ret[sb->count] = 0;
sb_free(sb);
return ret;
}
size_t sb_size(StringBuilder sb)

Solution

The main reason why C don't have a system like that is because that produce some overhead. But I suppose you know this and want the less overhead possible and this system.

The two following lines are not C idiomatic:

struct StringBuilder;
typedef struct StringBuilder *StringBuilder;


Of course, they are two side about this, I will give you my vision. You should not hide the implementation of structure in C. Why? Because C is design to trust the user, if your user want to "play" with his data, you can't prevent it. C allow user to hack easily and this could improve the use of your library. For example, if your library don't have a code functionality the user could code it.

The second problem is that you hide the pointer in the typedef. You should not hide pointer, this made the code hard to read. And the user could think that this is not a pointer and try to write StringBuilder *foo;. A answer about this is available on stackoverflow.

For the following, I will assume that the user know that you hide the pointer and I will use explicit struct StringBuilder in the code.

Next I will talk about your function sb_new(), first I think you should not allocate the structure StringBuilder, this made a lot of overhead. You should separate your function into two sb_new() and sb_init(), one if the user want allocate the structure and one if the user take care of it.

Plus, you use calloc(), this add a lot of useless overhead. You should use malloc(). If the user want to affect the string by zero you should implement an other function for it, like sb_unused_zero().

And, if the user want a zero size string you send NULL, I don't see the point. Maybe you should use this value for tell to the function to use a default value like 42.

A detail about your call of malloc(), you could write struct StringBuilder ret = malloc(sizeof ret);, this is more readable and you can read this answer about it. You don't need to use sizeof char because by definition the is equal to 1.

Example:

struct StringBuilder *sb_new(size_t init_cap) {
    struct StringBuilder *sb = malloc(sizeof *sb);
    if (!sb) return NULL;
    if (!sb_init(sb, init_cap)) {
        free(sb);
        return NULL;
    }
    return sb;
}

bool sb_init(struct StringBuilder *sb, size_t init_cap) {
    init_cap = init_cap == 0 ? 42 : init_cap;
    sb->mem = malloc(init_cap);
    if (!sb->mem) return false;
    sb->count = 0;
    sb->cap = init_cap;
    return true;
}

void sb_unused(struct StringBuilder *sb, char c) {
    memset(sb->mem + to->count, c, sb->cap - sb->count);
}


This allow the user to do the following:

struct foo {
    struct StringBuilder sb;
};

int main(void) {
    struct foo foo;
    sb_init(&foo.sb, 0);
}


Let talks about sb_append(), you don't check the potential overflow when you multiply by your LOAD_FACTOR, so you have an undefined behavior.

In this function, you affect new allocate byte to zero, this is again a big overhead that is useless in most of use cast. Again, the user should use the sb_unused() function that affect unused byte in your string. Plus, you suppose that your LOAD_FACTOR is equal to 2 in your use of memset().

bool sb_append(StringBuilder sb, char c) {
    if (sb->count + 1 == sb->cap) {
        if (SIZE_MAX / LOAD_FACTOR cap) return false; // this could be improve
        size_t new_cap = sb->cap * LOAD_FACTOR;
        char *new_mem = realloc(sb->mem, new_cap);
        if (!new_mem) return false;
        sb->mem = new_mem;
        sb->cap = new_cap;
    }
    sb->mem[to->count++] = c;
    return true;
}


Your next function sb_as_string() suppose that mem is already a valid c-string, maybe you should verify that your string contain a nil terminate byte.

I almost finish, the function sb_free_copy() has a strange name, maybe rename it to something like sb_into_string(). And your implementation is strange you add yourself the nil terminate byte contrary to the function sb_as_string(), more strcpy() already do it. Whatever, you should use memcpy() because you already know the good size to copy this would be faster.

Just a detail in sb_free() you verify that sb->mem is not NULL but free() handle this, so this is useless and add overhead.

Finally, maybe your structure should be rename String because I think the user could use it to keep dynamic string and not to transform it into standard c-string?

The last thing is about your use about bool, this limit the detail about the error. Maybe you could replace it by an enum of just an int that will give you more flexibility about error report.

Code Snippets

struct StringBuilder;
typedef struct StringBuilder *StringBuilder;
struct StringBuilder *sb_new(size_t init_cap) {
    struct StringBuilder *sb = malloc(sizeof *sb);
    if (!sb) return NULL;
    if (!sb_init(sb, init_cap)) {
        free(sb);
        return NULL;
    }
    return sb;
}

bool sb_init(struct StringBuilder *sb, size_t init_cap) {
    init_cap = init_cap == 0 ? 42 : init_cap;
    sb->mem = malloc(init_cap);
    if (!sb->mem) return false;
    sb->count = 0;
    sb->cap = init_cap;
    return true;
}

void sb_unused(struct StringBuilder *sb, char c) {
    memset(sb->mem + to->count, c, sb->cap - sb->count);
}
struct foo {
    struct StringBuilder sb;
};

int main(void) {
    struct foo foo;
    sb_init(&foo.sb, 0);
}
bool sb_append(StringBuilder sb, char c) {
    if (sb->count + 1 == sb->cap) {
        if (SIZE_MAX / LOAD_FACTOR < sb->cap) return false; // this could be improve
        size_t new_cap = sb->cap * LOAD_FACTOR;
        char *new_mem = realloc(sb->mem, new_cap);
        if (!new_mem) return false;
        sb->mem = new_mem;
        sb->cap = new_cap;
    }
    sb->mem[to->count++] = c;
    return true;
}

Context

StackExchange Code Review Q#155978, answer score: 7

Revisions (0)

No revisions yet.