snippetcMinor
Read file line-by-line with circular buffer
Viewed 0 times
filelinereadwithcircularbuffer
Problem
This isn't too complicated but I usually code in C++ instead of C. Doing memory management and IO in C always makes me feel like I'm doing everything wrong and making a mess. This is a small sample that has both which I think came out reasonable. I don't have C people to talk to so any critiques are welcome.
The goal here is to provide a means to read lines terminated by
I'd be interested even in style critiques of my C. I'd like to understand even the idiomatic differences between C and C++.
in.h
in.c
```
#include
#include
#include
/* stddef for size_t, I don't know if stdlib or stdio are guaranteed to provide
that typedef but on my system they do /
#include
#include
#include "in.h"
#define TRUE 1
#define FALSE 0
#ifdef NDEBUG
#define INITIAL_BUFF_SIZE 4096
#else
/ smaller buffer size to reveal errors /
#define INITIAL_BUFF_SIZE 16
#endif
/ http://stackoverflow.com/a/1941337/1128289 /
#ifdef DEBUG
# define DEBUG_PRINT(x) printf x
#else
# define DEBUG_PRINT(x) do {} while (0)
#endif
/ usage: DEBUG_PRINT(("var1: %d; var2: %d; str: %s\n", var1, var2, str)); /
/ internal methods /
s
The goal here is to provide a means to read lines terminated by
\n from a FILE pointer. The code tries to be reasonably efficient -- hence the circular buffer -- and not make unnecessary copies but it doesn't delve into detailed micro-optimization and I haven't profiled it. The test uses the code to simply print lines from a file.I'd be interested even in style critiques of my C. I'd like to understand even the idiomatic differences between C and C++.
in.h
#include
/* Is it possible to forward declare just FILE? */
struct LineCircBuffer
{
FILE *f_;
char *buff_;
const char *buff_end_;
const char *data_end_; /* last byte filled from file */
const char *line_;
const char *line_end_; /* points at ending newline */
};
int line_circ_buffer_init(struct LineCircBuffer* lcb, FILE *f);
void line_circ_buffer_free(struct LineCircBuffer* lcb);
void get_line(
const char ** const ret_line,
const char ** const ret_line_end,
const struct LineCircBuffer * const lcb);
int next_line(struct LineCircBuffer * const lcb);in.c
```
#include
#include
#include
/* stddef for size_t, I don't know if stdlib or stdio are guaranteed to provide
that typedef but on my system they do /
#include
#include
#include "in.h"
#define TRUE 1
#define FALSE 0
#ifdef NDEBUG
#define INITIAL_BUFF_SIZE 4096
#else
/ smaller buffer size to reveal errors /
#define INITIAL_BUFF_SIZE 16
#endif
/ http://stackoverflow.com/a/1941337/1128289 /
#ifdef DEBUG
# define DEBUG_PRINT(x) printf x
#else
# define DEBUG_PRINT(x) do {} while (0)
#endif
/ usage: DEBUG_PRINT(("var1: %d; var2: %d; str: %s\n", var1, var2, str)); /
/ internal methods /
s
Solution
Here are some things that may help you improve your code.
Simplify
It seems to me that this is a whole lot of code to implement something that could much more easily be done using
Use
Your error return codes are nicely documented in the source code, but not in the header and not as code. Ideally, the header would contain all the information needed for a user of the code. The things that are currently missing are brief descriptions of the code (including assumptions about internal state) and the error return codes. The return codes could easily be encapsulated as an
Check for errors after
The
Eliminate
In every case, the
It doesn't make any significant difference in the generated code, but it helps the reader of the code understand by eliminating a variable that's only used once.
Use library functions to simplify your code
The large
Reconsider the interface
The structure currently has
Also, the circular buffer structure is largely independent of the source of the data, so it may be more generally useful to implement it without such a close tie to the
Simplify
It seems to me that this is a whole lot of code to implement something that could much more easily be done using
fgets. Consider this possible implementation which both allocates memory and reads a line of text from a file:#define BUFF_SIZE 16384
char *fgetsalloc(FILE *stream)
{
char *s=malloc(BUFF_SIZE);
if (s == NULL)
return s;
if (NULL == fgets(s, BUFF_SIZE, stream)) {
free(s);
return NULL;
}
return realloc(s, strlen(s));
}Use
enum for error return codesYour error return codes are nicely documented in the source code, but not in the header and not as code. Ideally, the header would contain all the information needed for a user of the code. The things that are currently missing are brief descriptions of the code (including assumptions about internal state) and the error return codes. The return codes could easily be encapsulated as an
enum.Check for errors after
freadThe
line_circ_buffer_init() function checks for errors after fread but next_line does not. Eliminate
bytes_read variableIn every case, the
bytes_read function can be eliminated by using the following line of code:lcb->data_end_ = lcb->buff_ + fread(lcb->buff_, sizeof(char),
lcb->buff_end_ - lcb->buff_, lcb->f_);It doesn't make any significant difference in the generated code, but it helps the reader of the code understand by eliminating a variable that's only used once.
Use library functions to simplify your code
The large
while loop within next_line could be mostly replaced by a single call to realloc.Reconsider the interface
The structure currently has
buff_end_ but when it's used within the code it's most often used to recover the length of the buffer. It would make more sense to simply store the length of the buffer. It would also reduce the confusion potential by having one fewer similarly named variables.Also, the circular buffer structure is largely independent of the source of the data, so it may be more generally useful to implement it without such a close tie to the
FILE structure.Code Snippets
#define BUFF_SIZE 16384
char *fgetsalloc(FILE *stream)
{
char *s=malloc(BUFF_SIZE);
if (s == NULL)
return s;
if (NULL == fgets(s, BUFF_SIZE, stream)) {
free(s);
return NULL;
}
return realloc(s, strlen(s));
}lcb->data_end_ = lcb->buff_ + fread(lcb->buff_, sizeof(char),
lcb->buff_end_ - lcb->buff_, lcb->f_);Context
StackExchange Code Review Q#75339, answer score: 3
Revisions (0)
No revisions yet.