HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

Delete all comments in a C program

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
programcommentsdeleteall

Problem

I am new to C. And I just want to know if my code is horrible or not. I tried to make program to delete all comments in C program. It's from "The C Programming Language Second Edition" book, Exercise 1-23.

Here is my solution :

#include 
#define MAXCHAR 1000
int c = 0;
int d = 0;
int newFileText[MAXCHAR];
int longDisable = 0;
int shortDisable = 0;
int i = 0;

//Function declarations
void printArray(int a[], int b);

void myputchar(int a[], int c);

int main(void) {
    while ((c = getchar()) != EOF) {
        if (i > MAXCHAR)i = 0;
        if (c == '/') {
            d = getchar();
            if (d == '/') shortDisable = 1;
            else if (d == '*') longDisable = 1;
        } else if (c == '*') {
            d = getchar();
            if (d == '/') {
                c = getchar();
                if (!(c > 0)) {
                    break;
                }
                longDisable = 0;
            }
        } else if (c == '\n') {
            shortDisable = 0;
        }
        if (shortDisable || longDisable);
        else {
            myputchar(newFileText, c);
            i++;
        }
        /*printf("\nShort disable: %d", shortDisable);
        printf("\nLong disable: %d", longDisable);*/
    }
    printArray(newFileText, i);
    return 0;
}

void printArray(int theArray[], int lengthSoFar) {
    for (int i = 0; i < lengthSoFar; i++) {
        printf("%c", theArray[i]);
    }
}

void myputchar(int a[], int c) {
    int i = 0;
    while (a[i] != '\0') {
        i++;
    }
    a[i] = c;
}

Solution

Bugs

When processing an expression like 10/2 or 10*2, the 2 gets discarded from the output.

It looks like you attempted to implement a circular buffer for the output, but it actually fails miserably if the output exceeds 1000 bytes. One problem is performance: myputchar() performs a linear scan of the buffer to find the end of the string, where it can write the next character. A second problem is buffer overflow: myputchar() makes no attempt to ensure that the location where it stores the next character lies within the buffer. The third problem is that when you do if (i > MAXCHAR)i = 0, you haven't made any attempt to print the buffer contents before trying to reuse the buffer.

C grammar

In C, a /* or // that appears within a "string literal" should be treated literally. It should not be interpreted as a start-of-comment marker.

Furthermore, within a //-style comment, any / or / needs to be disregarded. Similarly, within a / /-style comment, any // should be disregarded. That is, the correct output for the following C99 program…

#include 
int main() {
    puts("First " /* asdf // */ "example"
    );
    puts("Second " // /*
         "example"
         // */
    );
    puts("Third " /* "/" */ "example");
}


… should be

First example
Second example
Third example


(I also got this wrong in Rev 1 of this answer. I've corrected the bug, but this illustrates a point that C requires a stateful parser.)

Organization

The variables c, d, newFileText, longDisable, shortDisable, and i are global; they should all be local to the main function.

It is customary to put the main function last, so that you don't need to declare the functions it uses.

Input / output

You read one character at a time with getchar(), but you attempt to print the output in chunks of up to 1000 characters. Writing in chunks could be more efficient than writing a character at a time, but it is certainly more difficult to implement correctly. However, after you go through the effort of accumulating the chunks, you then proceed to print the buffer one character at a time using printf("%c", theArray[i]), which negates any performance advantage you could have gained.

Therefore, you would have been better off printing the output one character at a time to begin with, without any attempt at buffering. To do so, you could use putchar(…), which is simpler than printf("%c", …).

If you did want to efficiently write a chunk of bytes at a time, you could use the write() function instead of the loop in printArray().

Suggested solution

This solution addresses the more serious logic bugs, without attempting to handle string literals correctly.

#include 

int main(void) {
    int c, d;
    int longDisable = 0, shortDisable = 0;

    while ((c = getchar()) != EOF) {
        switch (c) {
          case '/':
          case '*':
            d = getchar();
            if (!longDisable && c == '/' && d == '/') {
                shortDisable = 1;
            } else if (!shortDisable && c == '/' && d == '*') {
                longDisable = 1;
            } else if (!shortDisable && c == '*' && d == '/') {
                longDisable = 0;
            } else if (!(shortDisable || longDisable)) {
                putchar(c);
                if (d != EOF) putchar(d);
            }
            break;

          case '\n':
            shortDisable = 0;
            /* fall through... */
          default:
            if (!(shortDisable || longDisable)) {
                putchar(c);
            }
        }
    }
}

Code Snippets

#include <stdio.h>
int main() {
    puts("First " /* asdf // */ "example"
    );
    puts("Second " // /*
         "example"
         // */
    );
    puts("Third " /* "/" */ "example");
}
#include <stdio.h>

int main(void) {
    int c, d;
    int longDisable = 0, shortDisable = 0;

    while ((c = getchar()) != EOF) {
        switch (c) {
          case '/':
          case '*':
            d = getchar();
            if (!longDisable && c == '/' && d == '/') {
                shortDisable = 1;
            } else if (!shortDisable && c == '/' && d == '*') {
                longDisable = 1;
            } else if (!shortDisable && c == '*' && d == '/') {
                longDisable = 0;
            } else if (!(shortDisable || longDisable)) {
                putchar(c);
                if (d != EOF) putchar(d);
            }
            break;

          case '\n':
            shortDisable = 0;
            /* fall through... */
          default:
            if (!(shortDisable || longDisable)) {
                putchar(c);
            }
        }
    }
}

Context

StackExchange Code Review Q#142825, answer score: 6

Revisions (0)

No revisions yet.