Taking arbitrary length input in C
Problem
Instead of having to use something like `char buf[100500]` and hope that no possible user input could be longer than 100500 bytes, I decided to make the following function:

```c
char *input(FILE *in, size_t size)
{
    char *input_str = NULL;
    int c;
    size_t len = 0;

    /* initial allocation */
    input_str = malloc(size);
    if (!input_str) return NULL;

    while ((c = fgetc(in)) != EOF && c != '\n')
    {
        input_str[len++] = c;
        /* allocate more room if needed */
        if (len == size)
        {
            input_str = realloc(input_str, size += 16);
            if (!input_str) return NULL;
        }
    }
    input_str[len++] = 0;
    return realloc(input_str, len);
}
```

My questions about this code:
- Is leaving it up to the calling function to free memory a bad idea?
- Is there a way to increase performance for this (memory pool, etc)? I don't like making so many syscalls...
- In general, is there a better way to go about this?
Solution
A simple way to improve performance by reducing the number of realloc calls is to start with a decently sized buffer and, when it fills up, grow it by a multiple of its current size, for example by doubling it.
This approach is less memory-efficient, of course, but you can mitigate that by using a smaller multiple (say 1.5x) and/or a smaller initial size. If you don't care much about memory, increase both the multiple and the initial size.
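As a sketch of that strategy (not code from the answer), the asker's function with 2x growth might look like the following; the name input_geometric, the doubling factor, and the initial-size fallback are all illustrative choices. It also uses a temporary pointer so the original buffer is freed, not leaked, when realloc fails:

```c
#include <stdio.h>
#include <stdlib.h>

/* Sketch of the question's input() with geometric (2x) growth instead of
 * +16 per realloc. A temporary pointer keeps the old buffer reachable
 * when realloc fails, so the failure path frees instead of leaking. */
char *input_geometric(FILE *in, size_t size)
{
    char *input_str, *tmp;
    size_t len = 0;
    int c;

    if (size == 0)
        size = 16;                    /* arbitrary fallback */
    input_str = malloc(size);
    if (!input_str)
        return NULL;

    while ((c = fgetc(in)) != EOF && c != '\n') {
        input_str[len++] = (char)c;
        if (len == size) {            /* buffer full: double it */
            tmp = realloc(input_str, size *= 2);
            if (!tmp) {
                free(input_str);      /* don't leak on failure */
                return NULL;
            }
            input_str = tmp;
        }
    }
    input_str[len] = '\0';            /* always room: len < size here */

    tmp = realloc(input_str, len + 1);   /* optional shrink to fit */
    return tmp ? tmp : input_str;        /* caller frees the result */
}
```

With doubling, reading an n-byte line costs O(log n) reallocations instead of the O(n) that fixed +16 growth costs.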
For truly enormous files, you will want to chunk your data and process it chunk by chunk instead of trying to load it into memory all at once. Another possibility is to process it as it comes in with an online algorithm.
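For illustration, a chunked pass over a stream might look like this; counting newlines stands in for whatever per-chunk processing is needed, and the 4096-byte chunk size is an arbitrary choice:

```c
#include <stdio.h>

/* Process a stream chunk by chunk instead of loading it all at once.
 * The "processing" here is just counting '\n' bytes; memory use stays
 * constant no matter how large the input is. */
long count_newlines_chunked(FILE *in)
{
    char chunk[4096];
    size_t n, i;
    long newlines = 0;

    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0)
        for (i = 0; i < n; i++)
            if (chunk[i] == '\n')
                newlines++;
    return newlines;
}
```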
On a side note, it seems like you are reimplementing
getline()'s functionality, so unless you need some special behavior, I'd recommend using that instead.
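For reference, getline() (POSIX.1-2008, not ISO C, so it is not available everywhere) does the grow-as-needed allocation itself. A minimal wrapper, with the illustrative name read_line, might look like:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Read one line of arbitrary length using POSIX getline(), which
 * allocates and grows the buffer itself. Returns a malloc'd string
 * without the trailing newline (caller frees), or NULL on EOF/error. */
char *read_line(FILE *in)
{
    char *line = NULL;
    size_t cap = 0;
    ssize_t n = getline(&line, &cap, in);

    if (n == -1) {
        free(line);           /* getline may have allocated anyway */
        return NULL;
    }
    if (n > 0 && line[n - 1] == '\n')
        line[n - 1] = '\0';   /* strip the newline */
    return line;
}
```

Unlike the hand-rolled version, the buffer can also be reused across calls by passing the same line/cap pair back in, amortizing allocations over many lines.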
Context
StackExchange Code Review Q#135813, answer score: 21