HiveBrain v1.2.0
Get Started
← Back to all entries
patterncModerate

Simple C Implementation for Unix "tail" Command

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
simpleunixtailforimplementationcommand

Problem

The following is like the Unix "tail" program. It was assigned as an exercise in Chapter 5 of Kernighan & Ritchie's The C Programming Language. Because I've only read through most of Chapter 5, I'm still unfamiliar with certain topics, such as malloc(), which may have been more appropriate to use, I don't know.

I've done a little bit of programming before, but not enough to consider myself very experienced, so any kind of advice is welcome. : )

```
#include
#include

#define DEFLINES 10
#define MAXBUFF 20000

int findTail(char *lines[][2], int nlines, char buff[], int maxbuff);

/* main() processes optional cli argument '-n', where n is a number of lines.
* The default is 10. findTail finds the last n lines from the input so that
they can be printed. /

main(int argc, char *argv[])
{
int nlines; char *endptr;

endptr = NULL;
nlines = DEFLINES;
if (argc > 2) {
printf("error: too many arguments.\n");
return EXIT_FAILURE;
}
else if (argc == 2) {
if (*argv[1] == '-') {
nlines = strtol(argv[1] + 1, &endptr, 10);
if (*endptr != '\0') {
printf("error: not a number of lines: %s\n", argv[1] + 1);
return EXIT_FAILURE;
}
}
else {
printf("error: malformed argument: %s\n", argv[1]);
return EXIT_FAILURE;
}
}

int i;
char *lines[nlines][2], buff[MAXBUFF];

findTail(lines, nlines, buff, MAXBUFF);
for (i=0; i = maxbuff - 1)
buffp = buff;
linestart = buffp;
}

}
// this is in case the input ended without a newline.
if (c == EOF && buffp != buff && buffp[-1] != '\0') {
testForRoom(lines, nlines - nfound, buffp);
*buffp = '\0';
lines[nlines - 1][wrap] = linestart;
}

}

/* shift is used upon finding a character that starts a new line. It shifts
* line pointers in the pointer array to the left, making room for new line
pointer(s) and forgetting the pointer(s) for the oldest line in memory. /

void shift(char *lines[][2], int nlines)
{
int i;

Solution

You are using some C99 features, such as declaring variables part way through a block of code and VLAs, but not obeying some C99 constraints, such as ensuring that the main() function has an explicit return type of int.

Because you're using C99, you are allowed to leave off the return(0); (or return 0;) from the end of main(). I think that was one of the less good decisions in C++ that was then echoed in C, and don't take that liberty myself; but I can't criticize your code when the standard allows it.

It might be better to use enum instead of #define for the constants; enum makes debugging easier because the values are in the symbol table, whereas #define constants are typically not.

enum { DEFLINES =    10 };
enum { MAXBUFF  = 20000 };


Your design only reads from standard input. That's not bad, though it is lightly limiting.

Your code includes:

printf(lines[i][0]);


This is a very dangerous way to use printf() - it is the ultimate format string vulnerability. The trouble is that if my input to you contains:

%s%n%13$s


then printf() is going to be reading and writing values on the stack which you didn't put there, which leads to great unhappiness. At minimum, use:

printf("%s", lines[i][0]);


Alternatively, use:

fputs(lines[i][0], stdout);


(Do not use puts() because it adds newlines to the end of your data - unless you remove the newlines from the input.)

When I compile your code using my default options, I get two warnings about the printf() - confirmation of what I'd already observed - plus a warning that findTail() does not return a value even though it is declared to return an int. That's best fixed by making it into a void function.

/usr/bin/gcc -g -std=c99 -Wall -Wextra -Wmissing-prototypes \
      -Wstrict-prototypes  cr.c -o cr


That's a pretty stringent set of warning options, and your code is good to generate just those three. I wish all the code I dealt with was as clean.

When I run the program on its own source code, it works fine. When I run it with:

$ perl -e 'for $i (1..12) { print "A" x 2047, "\n"; }' | ./cr
error: not enough room in buffer.
$


That's OK - you did say you weren't using malloc() to do dynamic memory allocation. As a general guideline, though, your error messages should include the name of the program, as found in argv[0], so that if there are multiple processes in a pipeline, for instance, you can tell which of the processes generated the error. I do this by using a function call err_setarg0(argv[0]); at the start of main(). This records the program name for use in subsequent error messages. I then use function calls such as err_error("error: not enough room in the buffer\n"); to report the messages. A minimal implementation of these two functions is:

#include 
#include 

static const char *err_arg0 = "unknown";

void err_setarg0(const char *arg0)
{
    err_arg0 = arg0;
}

void err_error(const char *fmt, ...)
{
    va_list args;
    fprintf(stderr, "%s: ", err_arg0);
    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
    exit(1);
}


I have a fairly complex file with many variations on this theme that provides simple-to-use error reporting, including variations reporting on the system error (via errno) and including time stamps and process ID and ... all selectable based on program design.

Closer scrutiny shows that at any one time, at most one line will be wrapped - it means that the line starts just before the end of the buffer and has to wrap around to the start of the buffer.

Overall, a pretty good program. Well done (and I don't say that lightly).

I'm not sure how easily it will adapt to handle dynamic memory allocation - my suspicion is that you will end up with a rather different scheme for managing memory. This would remove the issues with lots of very long lines at the end of the file overflowing your buffer. Depending on your platform, you might be able to use the POSIX getline() function which will allocate memory for you as it reads lines. Or you might decide to write an emulation of that code. You then only need a simple circular buffer of character pointers to keep the last N lines, and you discard the old one (with free()) before storing the next line.

Code Snippets

enum { DEFLINES =    10 };
enum { MAXBUFF  = 20000 };
printf(lines[i][0]);
printf("%s", lines[i][0]);
fputs(lines[i][0], stdout);
/usr/bin/gcc -g -std=c99 -Wall -Wextra -Wmissing-prototypes \
      -Wstrict-prototypes  cr.c -o cr

Context

StackExchange Code Review Q#790, answer score: 13

Revisions (0)

No revisions yet.