HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

C getline function

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
getlinefunctionstackoverflow

Problem

I was looking for a C function that reads lines of arbitrary length from a file. I didn't find anything absolutely portable and safe from buffer overflows, so I tried writing my own.

  • Does it look like I made any memory management mistakes?



  • Does it make sense to have this function return NULL everywhere that it does?



  • Have I made any other mistakes?



char *getaline (FILE *file)
{
    size_t buffsize = 120; /* The minimum size for a line buffer */
    size_t this_char = 0;

    if (file == NULL) {
        return NULL;
    }

    char *line = (char *) malloc(buffsize);
    char **linebuff = &line;
    if (*linebuff == NULL) {
        return NULL;
    }

    char c = getc (file);
    if (c == EOF) {
        return NULL;
    }

    while (1) {
        if (this_char + 1 >= buffsize) {
            buffsize = 2 * buffsize;

            char *next_linebuff;
            next_linebuff = (char *) realloc (*linebuff, buffsize);
            if (next_linebuff == NULL) {
                return NULL;
            }

            *linebuff = next_linebuff;
        }

        (*linebuff)[this_char] = c;
        this_char++;

        if (c == '\n' || c == EOF) {
            break;
        }

        c = getc (file);
    }

    (*linebuff)[this_char] = '\0';
    line = *linebuff;
    return line;
}

int main (int argc, const char *argv[]) {
    FILE *file = fopen(argv[1], "r"); 
    char *line = "";

    while ((line = getaline(file)) != NULL) {
        printf("%s", line);
    }

    return 0;
}

Solution

As for existing solutions, check out readline. It's a pretty common library on *nix and even has a Windows port.

As for a critique of what you posted:

char *line = (char *) malloc(buffsize);


Don't bother casting a void* to another pointer type. This is a C++ thing. In C you don't have to do it, the conversion can be done implicitly.

char c = getc (file);


This is a common portability pitfall. A char cannot represent the full range of characters and also EOF. Depending on the environment you may not be able to distinguish between EOF and byte 255, or you may not be able to detect EOF at all. Store the result of getc in an int, because that's what it returns.

char *line = (char *) malloc(buffsize);

// (snip)

char c = getc (file);
if (c == EOF) {
    return NULL;
}


If you hit that return you will leak the allocation of line.

Now, you could introduce a free inside the if block... But many C programmers (myself included) prefer to avoid early return statements, because what happens is you end up with repetitive "free all the buffers" cleanup blocks. Instead of doing an early return, I like to let the scope own the allocation, meaning I insert the free whenever something like line goes out of scope, having the same cleanup block run in a success or failure case. (This sort of imitates C++'s RAII pattern, but with C and more manual.)

if (this_char + 1 >= buffsize) {
        buffsize = 2 * buffsize;


These integer operations can overflow. Maybe that's not a huge deal (I'll admit, I myself write a lot of code that doesn't handle overflow), but something to be aware of, should you be allocating sizes around the boundaries of the type of the size variables.

next_linebuff = (char *) realloc (*linebuff, buffsize);
        if (next_linebuff == NULL) {
            return NULL;
        }

        *linebuff = next_linebuff;


Looks like you've successfully avoided the "leak on realloc failure" issue that traps many newbies. (Update: Sorry, must have suffered temporary blindness, you do leak here.) But I think this is too verbose. Why do you need to maintain linebuff as a double pointer? You already have line.

(*linebuff)[this_char] = '\0';
line = *linebuff;


Seems like you missed a potential realloc here. If the buffer is exactly full when you hit that line, you'll write past the allocation.

while ((line = getaline(file)) != NULL) {
    printf("%s", line);
}


You never call free on line here.

My improved version follows, taking most of these suggestions. I've tried to keep your code mostly the same and follow the same style.

char *getaline (FILE *file)
{
    size_t buffsize = 120; /* The minimum size for a line buffer */
    size_t this_char = 0;

    if (file == NULL) {
        return NULL;
    }

    char *line = malloc(buffsize);

    if (line == NULL) {
        return NULL;
    }

    int c;

    do {
        c = fgetc(file);

        if (this_char + 1 >= buffsize) {
            buffsize = 2 * buffsize;

            char *next_linebuff;
            next_linebuff = realloc (line, buffsize);
            if (next_linebuff == NULL) {
               free(line);
               line = NULL;
               break;
            }

            line = next_linebuff;
        }

        if (c == EOF || c == '\n') {
            c = 0;
        }

        line[this_char++] = c;
    } while (c);

    return line;
}

int main()
{
    char *p = NULL;
    while ((p = getaline(stdin)) && *p)
    {
       puts(p);
       free(p);
    }
    free(p);
    return 0;
}

Code Snippets

char *line = (char *) malloc(buffsize);
char c = getc (file);
char *line = (char *) malloc(buffsize);

// (snip)

char c = getc (file);
if (c == EOF) {
    return NULL;
}
if (this_char + 1 >= buffsize) {
        buffsize = 2 * buffsize;
next_linebuff = (char *) realloc (*linebuff, buffsize);
        if (next_linebuff == NULL) {
            return NULL;
        }

        *linebuff = next_linebuff;

Context

StackExchange Code Review Q#18100, answer score: 5

Revisions (0)

No revisions yet.