HiveBrain v1.2.0
Get Started
← Back to all entries
patterncMinor

Brainfuck compiler with tcc backend

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
backendtccwithcompilerbrainfuck

Problem

I wrote simple compiler for Brainfuck with tcc backend (dont ask for assembler/assembly code generator!). It's output runs incredibly slow. Even Hello World program takes a LOOONG while to execute. I have added some optimizations, but it didnt help. This is code i actually made; It actually doesn't hang, but output work VERY slow (i need to wait ~4 minutes to Hello World display...):

```
#include
#include
#include
#include "libtcc.h"

int minus(FILE* f){
int c;
int amount = 0;
while ((c = getc(f)) == '-')
amount++;
ungetc(c, f);
return amount;
}

int plus(FILE* f){
int c;
int amount = 0;
while ((c = getc(f)) == '+')
amount++;
ungetc(c, f);
return amount;
}

int main(int argc, char** argv){
int pos = 0;
int brackets = 0;
if(argc!=3){
printf("Brainfuck Compiler by Joel.\n");
printf("Cannot open input file.");
exit(0);
}
int amount = 1;
FILE* f = fopen(argv[1], "r");
printf("Brainfuck Compiler by Joel. All rights reserved.\n");
if(f!=NULL){
char outbuffer = (char)malloc(1024*1024);
strcpy(outbuffer, "int getch(void);\nvoid putch(int);\nint mem[65537];\nint main(){\n");
while(!feof(f)){
char c = fgetc(f);
if(c == ''){
if(pos==65536)pos=0;
else pos++;
}
if(c == '['){
brackets++;
sprintf(outbuffer, "%swhile(mem[%d]){",outbuffer,pos);
}
if(c == ']'){
brackets--;
sprintf(outbuffer, "%s}", outbuffer);
}
if(c == '+'){
amount = plus(f)+1;
sprintf(outbuffer, "%smem[%d]+=%d;",outbuffer, pos, amount);
}
if(c == '-'){
amount = minus(f)+1;
sprintf(outbuffer, "%smem[%d]-=%d;",outbuffer, pos, amount);
}
if(c == '.'){
sprintf(outbu

Solution

First off, your code does not compile, because (at least my just installed) libtcc does not contain a compile function.

As far as I can see it, your code is that slow because you emit a getch() and a putch() when you run across a . (as a side note: according to MSDN, the correct signature for the latter is int putch(int) and both getch() and putch() are deprecated. For portability, you should use int getchar(void) and int putchar(int) anyway).

You must always check the return value of malloc calls, because an allocation request can fail.

You should use a switch instead of multiple if statements for handling the characters, like in the following:

switch (c) {
case '':
    /* Do work */
    break;
case '[':
    /* Do work */
    break;
case ']':
    /* Do work */
    break;
case '+':
    /* Do work */
    break;
case '-':
    /* Do work */
    break;
case '.':
    /* Do work */
    break;
case ',':
    /* Do work */
    break;
default:
    /* Comment */
    break;
}


The advantages are that the intention is more obvious, the compiler may be able to generate better code and it prints an error if you try to handle a character twice.

The main function should always return a value and that value should allow the user to distinguish between a successful compiler run and an error. So you should add a return 0; to the end of the function and add a return 1; (or whatever nonzero value you like) at the end of every error path.

You should always close files you have opened.

Error messages should be printed to stderr, not stdout.

For protection against buffer overflows, you can use snprintf (which is available since C99) instead of sprintf. It returns either a negative value (if an error occurred) or the number of written bytes (excluding the \0 terminator), but it does not write more than size bytes (including the \0 terminator) to the output. Thus you can use

int ret = snprintf(outbuffer, 1024*1024, format, ...);
if (ret = 1024*1024) {
    /* Error or overflow */
    free(outbuffer);
    return 1;
}


Another thing you may want to consider is to reduce the memory size by one byte (thus every allowed pos value uses only 16 bits) and then replace

if (pos == 0) pos = 65536;
else pos--;


by

pos = (pos - 1) & 0xFFFF;


and

if (pos == 65536) pos = 0;
else pos++;


by

pos = (pos + 1) & 0xFFFF;


and thus make those two operations branchless. The effect of those two statements is that the operation is carried out on the pos variable, but only the lower 16 bits of the result are used (that is, 65535+1 becomes zero and 0-1 becomes 65535 (*)).

Putting all this together, you would have (including a compile function that can indicate an error and ):

```
#include
#include
#include
#include

static int compile(char const out, char const buf) {
TCCState *tcc = tcc_new();
if (tcc_set_output_type(tcc, TCC_OUTPUT_EXE) ", argv[0]);
return 1;
}
int amount = 0;
int ret;
FILE* f = fopen(argv[1], "r");
if(f!=NULL){
char outbuffer = (char)malloc(buflen);
if (outbuffer == NULL) {
fprintf(stderr, "Internal compiler error.\n");
fclose(f);
return 1;
}
ret = snprintf(outbuffer, buflen, "int getchar(void);\nint putchar(int);\nint mem[65536] = {0};\nint main(){\n");
if (ret = buflen) {
fprintf(stderr, "Internal compiler error.\n");
free(outbuffer);
fclose(f);
return 1;
}
while(!feof(f)){
switch (fgetc(f)) {
case '':
pos = (pos + 1) & 0xFFFF;
break;
case '[':
++brackets;
ret = snprintf(outbuffer, buflen, "%swhile(mem[%d]){", outbuffer, pos);
if (ret = buflen) {
fprintf(stderr, "Internal compiler error.\n");
free(outbuffer);
fclose(f);
return 1;
}
break;
case ']':
--brackets;
ret = snprintf(outbuffer, buflen, "%s}", outbuffer);
if (ret = buflen) {
fprintf(stderr, "Internal compiler error.\n");
free(outbuffer);
fclose(f);
return 1;
}
break;
case '+':
amount = plus(f)+1;
ret = snprintf(outbuffer, buflen, "%smem[%d]+=%d;", outbuffer, pos, amount);
if (ret = buflen) {
fprintf(stderr, "Internal compiler error.\n");
free(outbuffer);
fclose(f);
return 1;
}
break;
case '-':
amount = minus(f)+1;
ret = snprintf(outbuffer, buflen, "%smem[%d]-=%d;", ou

Code Snippets

switch (c) {
case '<':
    /* Do work */
    break;
case '>':
    /* Do work */
    break;
case '[':
    /* Do work */
    break;
case ']':
    /* Do work */
    break;
case '+':
    /* Do work */
    break;
case '-':
    /* Do work */
    break;
case '.':
    /* Do work */
    break;
case ',':
    /* Do work */
    break;
default:
    /* Comment */
    break;
}
int ret = snprintf(outbuffer, 1024*1024, format, ...);
if (ret < 0 || ret >= 1024*1024) {
    /* Error or overflow */
    free(outbuffer);
    return 1;
}
if (pos == 0) pos = 65536;
else pos--;
pos = (pos - 1) & 0xFFFF;
if (pos == 65536) pos = 0;
else pos++;

Context

StackExchange Code Review Q#145658, answer score: 7

Revisions (0)

No revisions yet.