patterncMinor
K&R 1-23 Remove all C comments
Viewed 0 times
removecommentsall
Problem
From K&R exercise 1-23,
Write a program to remove all comments from a C program. Don't forget
to handle quoted strings and character constants properly. C comments
don't nest.
My solution,
Write a program to remove all comments from a C program. Don't forget
to handle quoted strings and character constants properly. C comments
don't nest.
My solution,
#include
#include
#include
#define MAXLEN 10000 // max input length
int get_input(char input[], int maxlen);
char * no_comments(char input[], int len);
_Bool comment_start(char input[], int idx);
_Bool comment_end(char input[], int idx, char *comments, int cdx);
_Bool quote_start(char input[], int idx);
_Bool quote_end(char input[], int idx);
int main(void)
{
char input[MAXLEN];
int len = get_input(input, MAXLEN);
char *scrubbed = no_comments(input, len);
printf("%s", scrubbed);
free(scrubbed);
}
int get_input(char input[], int maxlen)
{
int i;
char c;
for (i = 0; i 1 && comments[1] == '/');
_Bool old_style = (cdx == 1 && cur == '*') || (cdx > 1 && comments[1] == '*');
return (new_style && nxt == '\n') || (old_style && cdx > 2 && prev == '*' && cur == '/');
}
_Bool quote_start(char input[], int idx)
{
return input[idx] == '"' && input[idx-1] != '\'';
}
_Bool quote_end(char input[], int idx)
{
return input[idx] == '"' && input[idx-1] != '\\';
}Solution
I recommend using Valgrind as the first step of the self-code review. Here it only reports one error when compiled with
The problem lies inside
Add printing value of
Position 10 is not set. However, later on in
Input array and index are passed to
At some point you will read an element of input array that has not been initialized. It's an undefined behavior in C and it's quite dangerous.
gcc on Linux:$ gcc -O2 main.c -o main -std=c99 -Wall -Wextra -g
$ valgrind --track-origins=yes ./main
(...)
==24100== Conditional jump or move depends on uninitialised value(s)
==24100== at 0x4009B4: comment_start (main.c:85)
==24100== by 0x4009B4: no_comments (main.c:54)
==24100== by 0x40062E: main (main.c:21)
==24100== Uninitialised value was created by a stack allocation
==24100== at 0x400620: main (main.c:20)
==24100==
==24100== Conditional jump or move depends on uninitialised value(s)
==24100== at 0x4009CA: quote_start (main.c:102)
==24100== by 0x4009CA: no_comments (main.c:62)
==24100== by 0x40062E: main (main.c:21)
==24100== Uninitialised value was created by a stack allocation
==24100== at 0x400620: main (main.c:20)
==24100==
==24100== Conditional jump or move depends on uninitialised value(s)
==24100== at 0x518C1B2: vfprintf (in /lib64/libc-2.21.so)
==24100== by 0x51928E8: printf (in /lib64/libc-2.21.so)
==24100== by 0x400640: main (main.c:23)
==24100== Uninitialised value was created by a stack allocation
==24100== at 0x400620: main (main.c:20)
=The problem lies inside
get_input(), exactly in this line:input[++i] = '\0';Add printing value of
i after each iteration and after the pre-incrementation. You will see something like this:i == 0
i == 1
i == 2
i == 3
i == 4
i == 5
i == 6
i == 7
i == 8
i == 9
i == 11Position 10 is not set. However, later on in
no_comments() you have a loop:for (int idx = 0; idx <= len; idx++) {
if (strlen(comments) == 0) { // outside comment
if (!quoted && comment_start(input, idx)) {Input array and index are passed to
comment_start() and read there:return input[idx] == '/' && (nxt == '*' || nxt == '/');At some point you will read an element of input array that has not been initialized. It's an undefined behavior in C and it's quite dangerous.
Code Snippets
$ gcc -O2 main.c -o main -std=c99 -Wall -Wextra -g
$ valgrind --track-origins=yes ./main
(...)
==24100== Conditional jump or move depends on uninitialised value(s)
==24100== at 0x4009B4: comment_start (main.c:85)
==24100== by 0x4009B4: no_comments (main.c:54)
==24100== by 0x40062E: main (main.c:21)
==24100== Uninitialised value was created by a stack allocation
==24100== at 0x400620: main (main.c:20)
==24100==
==24100== Conditional jump or move depends on uninitialised value(s)
==24100== at 0x4009CA: quote_start (main.c:102)
==24100== by 0x4009CA: no_comments (main.c:62)
==24100== by 0x40062E: main (main.c:21)
==24100== Uninitialised value was created by a stack allocation
==24100== at 0x400620: main (main.c:20)
==24100==
==24100== Conditional jump or move depends on uninitialised value(s)
==24100== at 0x518C1B2: vfprintf (in /lib64/libc-2.21.so)
==24100== by 0x51928E8: printf (in /lib64/libc-2.21.so)
==24100== by 0x400640: main (main.c:23)
==24100== Uninitialised value was created by a stack allocation
==24100== at 0x400620: main (main.c:20)
=input[++i] = '\0';i == 0
i == 1
i == 2
i == 3
i == 4
i == 5
i == 6
i == 7
i == 8
i == 9
i == 11for (int idx = 0; idx <= len; idx++) {
if (strlen(comments) == 0) { // outside comment
if (!quoted && comment_start(input, idx)) {return input[idx] == '/' && (nxt == '*' || nxt == '/');Context
StackExchange Code Review Q#115073, answer score: 7
Revisions (0)
No revisions yet.