getword that properly handles underscores, string constants, comments, or preprocessor control lines

Submitted by: @import:stackexchange-codereview

Problem

Our version of getword does not properly handle underscores, string constants, comments, or preprocessor control lines. Write a better version.

This is exercise 6-1, which can be found on page 150 of K&R 2:
http://net.pku.edu.cn/~course/cs101/2008/resource/The_C_Programming_Language.pdf

My solution:

static int isValidKeyWord(char c) {
    if(isalnum(c) || c == '_') {
        return 1;
    }
    return 0;
}

int getword(char *word, int lim) {
    int c;
    char *w = word;

    int ordinaryKeyWord = 0;
    int comment         = 0;
    int stringConstant  = 0;

    while(isspace((c = getch())))
        ;

    if(c == '#' || c == '_' || isalpha(c)) {
        ordinaryKeyWord = 1;
        *w++ = c;
    }
    else if(c == '/') {
        *w++ = c;
        c = getch();
        if(c == '*') {
            *w++ = c;
            comment = 1;
        }
        else {
            *w = '\0';
            return *--w;
        }
    }
    else if(c == '\"') {
        *w++ = c;
        stringConstant = 1;
    }
    else {
        *w++ = c;
        *w = '\0';
        return c;
    }

    for(; --lim; w++) {
        *w = getch();
        if(ordinaryKeyWord && (!isValidKeyWord(*w))) {
            ungetch(*w);
            break;
        }
        else if(stringConstant && *w == '\"') {
            w++;
            break;
        }
        else if(comment && *w == '*') {
            *++w = getch();
            if(*w == '/') {
                w++;
                break;
            }
            else {
                ungetch(*w);
                w--;
            }
        }
    }

    *w = '\0';
    return word[0];
}


There are 3 main cases:

  • case 1: comments, if the first two characters are / and *. In this case the function should return when the corresponding * and / are met.



  • case 2: string constants, if the first character is a ", then the function should return when the closing " is met.



  • case 3: words that begin with #, _ and letters. In this case,

Solution

I didn't immediately see a bug in it, and you're doing a lot of things well, so here are some minor comments.

while(isspace((c = getch())))
    ;


This will often cause a compiler warning "possible missing or empty statement". If you use a compound statement instead that may suppress that warning:

while(isspace((c = getch()))) { }


int ordinaryKeyWord = 0;
int comment         = 0;
int stringConstant  = 0;


These are mutually exclusive, so a 3-valued enum might be better.
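A minimal sketch of what that enum could look like. The names TokenKind, TOKEN_* and classifyFirst are hypothetical and do not appear in the original code; the classification rules are copied from the if/else chain at the top of getword.

```c
#include <ctype.h>

/* Hypothetical names: one enum replaces the three mutually exclusive flags. */
enum TokenKind {
    TOKEN_OTHER,    /* a single character that starts nothing longer */
    TOKEN_WORD,     /* identifier, keyword, or preprocessor word */
    TOKEN_COMMENT,  /* slash-star comment */
    TOKEN_STRING    /* double-quoted string constant */
};

/* Classify a token from its first character c and the character after it. */
static enum TokenKind classifyFirst(int c, int next) {
    if (c == '#' || c == '_' || isalpha(c))
        return TOKEN_WORD;
    if (c == '/' && next == '*')
        return TOKEN_COMMENT;
    if (c == '"')
        return TOKEN_STRING;
    return TOKEN_OTHER;
}
```

The loop could then switch on a single TokenKind value instead of testing three separate ints, and the compiler can warn about an unhandled case.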

for(; --lim; w++)


I don't think your lim testing is strict enough: for example, if lim is 8 then /*this*/ would overrun (write past the end of) the word buffer. Even an ordinaryKeyWord of lim or more characters will write its final '\0' past the end of the buffer, because the loop can store lim characters before the terminator is appended.
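One way to keep the terminator inside the buffer is to stop storing after lim - 1 characters. The sketch below is an illustration only: copyWordBounded is a hypothetical helper, and it reads from a string rather than from getch so that it stands alone.

```c
#include <ctype.h>

/* Hypothetical helper: copy the identifier at the start of src into word,
 * never writing more than lim bytes including the terminating '\0'.
 * Returns the number of characters stored (excluding the '\0'). */
static int copyWordBounded(const char *src, char *word, int lim) {
    char *w = word;

    if (lim <= 0)
        return 0;                 /* no room even for the terminator */

    /* Leave one byte free for '\0': stop after lim - 1 characters. */
    while (w - word < lim - 1 &&
           (isalnum((unsigned char)*src) || *src == '_'))
        *w++ = *src++;

    *w = '\0';                    /* always inside the buffer */
    return (int)(w - word);
}
```

With an 8-byte buffer, a 15-character identifier is cut to its first 7 characters plus the terminator instead of overrunning.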

return *--w;


That's a bit tricky. It would be clearer to return word[0]; everywhere consistently.

if(ordinaryKeyWord && (!isValidKeyWord(*w))) {
    ungetch(*w);
    break;
}


That's compact (few lines) but could be expanded to make the logic clearer for a tired reviewer:

if(ordinaryKeyWord) {
    if (isValidKeyWord(*w)) {
        continue;
    }
    ungetch(*w);
    break;
}
if (stringConstant) {
    ... break or continue ...
}


else if(c == '/') {
    *w++ = c;
    c = getch();
    if(c == '*') {
        *w++ = c;
        comment = 1;
    }
    else {
        *w = '\0';
        return *--w;
    }
}


You're missing an ungetch(c) in the else case: the character read after the '/' is consumed, so the next call to getword never sees it.
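A sketch of that branch with the pushback added. The getch and ungetch here are modeled on the K&R versions (with an int buffer so that EOF can also be pushed back), and startsComment is a hypothetical helper name used only for illustration.

```c
#include <stdio.h>

#define BUFSIZE 100

static int buf[BUFSIZE];   /* pushback stack, modeled on K&R's getch/ungetch */
static int bufp = 0;       /* next free position in buf */

int getch(void) { return (bufp > 0) ? buf[--bufp] : getchar(); }

void ungetch(int c) {
    if (bufp >= BUFSIZE)
        printf("ungetch: too many characters\n");
    else
        buf[bufp++] = c;
}

/* Hypothetical helper, called after a '/' has been read.
 * Returns 1 if a comment starts here; otherwise pushes the
 * lookahead character back so the next call sees it, and returns 0. */
int startsComment(void) {
    int next = getch();
    if (next == '*')
        return 1;
    ungetch(next);   /* the step missing from the original else branch */
    return 0;
}
```

For input such as 1/2, the '2' then survives to be returned by the following call instead of being silently dropped.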

Is there any way to signal end-of-input, for example by pressing Ctrl-D for EOF? If so, how does getword report that?

I don't know an easy way to automate testing of code which reads from the keyboard using getch and ungetch.

That's a pity because I'd like to see the automated unit tests which define how well your code works.
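One approach that might work, offered as a sketch and not as something from the original question, is to link the function under test against replacement getch and ungetch functions that read from a string. Everything below is an assumption made for illustration: setInput is a hypothetical helper, and the getword shown is a simplified identifier-only stand-in, not the poster's version.

```c
#include <ctype.h>
#include <stdio.h>

/* Test doubles (hypothetical): same names as K&R's getch/ungetch, but they
 * read from a string instead of the keyboard. */
static const char *input = "";   /* text still to be read */
static int pushed = EOF;         /* one character of pushback */
static int hasPushed = 0;        /* nonzero while pushed is valid */

void setInput(const char *s) { input = s; hasPushed = 0; }

int getch(void) {
    if (hasPushed) { hasPushed = 0; return pushed; }
    return *input ? (unsigned char)*input++ : EOF;
}

void ungetch(int c) { pushed = c; hasPushed = 1; }

/* Stand-in for the function under test: a simplified identifier-only
 * getword. Assumes lim >= 2. Returns the first character of the word,
 * the character itself for a non-word character, or EOF at end of input. */
int getword(char *word, int lim) {
    int c;                       /* int, so EOF stays distinguishable */
    char *w = word;

    while (isspace(c = getch()))
        ;
    if (c != EOF)
        *w++ = (char)c;
    if (!isalpha(c) && c != '_') {
        *w = '\0';
        return c;
    }
    for (; --lim > 1; w++) {     /* keeps one byte free for the '\0' */
        c = getch();
        if (!isalnum(c) && c != '_') {
            ungetch(c);
            break;
        }
        *w = (char)c;
    }
    *w = '\0';
    return word[0];
}
```

A test can then call setInput("  foo_1 + bar") and assert that successive getword calls return 'f', '+', 'b', and finally EOF, which also pins down how end-of-input is reported.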

For example, this question includes its unit tests: 6 different tests of the function being coded. I was able to identify one or two bugs in the function, not by reading the function but by reviewing the set of unit tests, to find a condition (a set of input data) that wasn't being tested.

My boss wasn't a programmer but used to user-acceptance-test the software before shipping it. He said, not "you get what you expect" but "you get what you INspect".

Especially when you're beginning you should learn to do unit testing, and take pride in constructing a good/complete set of test cases, which is able to detect bugs in the earlier versions of your software.

Context

StackExchange Code Review Q#44424, answer score: 2