patterncModerate
Brainf*** Interpreter in C
Viewed 0 times
interpreterbrainfstackoverflow
Problem
I posted a question on Stack Overflow regarding this program. I ended up pretty much rewriting everything to get what is posted here. This turned into a two-day long project. I've been using Python for so long; I've been spoiled. Anyhow, here is a BrainF*** interpreter. It takes in a file or reads from
... so I am fairly confident this is an accurate implementation of the language specs; except for bounds checking. I don't have bounds checking. If you find a bug I'd like to hear about it! Moreover I'd like to read some feedback regarding the quality of the code itself. There's a doubly-linked list I never really implement aside from the input text. I should really just switch out all that for a static array; or maybe switch out the program's memory array for a dynamic array. I can't decide.
bf.h
list.c
```
#include
#include "bf.h"
// allocates a new node and sets all the things to zero
node *newnode() {
node *n = malloc(sizeof(node));
n->prev = n->next = 0;
n->jump = n->val = 0;
return n;
}
// appends a node to a given node. assumes it is an end node
node append(node n) {
n->next = newnode();
n->next->prev = n;
return n->next;
}
// prepend node to list. assumes it is the first node
node prepend(node n) {
n->prev = newnode();
n->prev->next = n;
return n->prev;
}
// navigates to first node, then frees all the nodes, iterating to the end
void erase(node *n) {
node *m;
while (n->prev)
n = n->prev;
while (n->next) {
m = n->next;
free(n);
n = m;
}
}
// pops
stdin. I've tested it with:- 99bottles.bf
- echo2.bf
- fibonacci.bf
- hanoi.bf
- hellobf.bf
... so I am fairly confident this is an accurate implementation of the language specs; except for bounds checking. I don't have bounds checking. If you find a bug I'd like to hear about it! Moreover I'd like to read some feedback regarding the quality of the code itself. There's a doubly-linked list I never really implement aside from the input text. I should really just switch out all that for a static array; or maybe switch out the program's memory array for a dynamic array. I can't decide.
bf.h
// list.c
typedef struct list list;
struct node {
struct node *prev;
int val;
struct node *jump;
struct node *next;
};
typedef struct node node;
node *newnode();
node *append(node *n);
node *prepend(node *n);
void erase(node *n);
int pop(node *n);
// op.c
void doop(node *n);
node *link(node *n);list.c
```
#include
#include "bf.h"
// allocates a new node and sets all the things to zero
node *newnode() {
node *n = malloc(sizeof(node));
n->prev = n->next = 0;
n->jump = n->val = 0;
return n;
}
// appends a node to a given node. assumes it is an end node
node append(node n) {
n->next = newnode();
n->next->prev = n;
return n->next;
}
// prepend node to list. assumes it is the first node
node prepend(node n) {
n->prev = newnode();
n->prev->next = n;
return n->prev;
}
// navigates to first node, then frees all the nodes, iterating to the end
void erase(node *n) {
node *m;
while (n->prev)
n = n->prev;
while (n->next) {
m = n->next;
free(n);
n = m;
}
}
// pops
Solution
This looks really good over all. I have pretty much no knowledege of brainfuck, but it was still easy to understand the code, and nothing jumps out at me as glaringly wrong. There are however a few (mostly) minor issues.
Function names
The node-related functions could be named a bit clearer. For example, I would prefix all of them with
The names of some of your functions could be a lot clearer.
List interface
Your list interface seems a bit off to me. It should either be focused on values and have methods like
It would also be nice to have a
Global variables
Your interpretter currently depends on global state. This means that you can't have two independent brainfuck instances at the same time. This might be an acceptable limitation (for now...), but it still makes programs harder to reason about, and it can make certain types of bugs very hard to track down. It would be better if you had a data structure to encapsulate this global state, and then instead of operating on globals, your functions would just acccept this
You should check the return value of
Code consolidation
The case where a file name is provided and the case where stdin is used look very similar to me. You can probably pull that into a
Variable delcarations
You seem to be using C99 or newer, so you should be declaring your variables as closely as possible to use instead of at at the beginning of function scopes.
I would put all of the tokens in an enum:
In other words, unless you need a
Memory leak
From a quick run through valgrind, it looks like your application has a memory leak. This is quite interesting since at a (quick) glance, it appears that the proper memory management code is all there. It makes me a bit suspicious that there might be a bug hiding somewhere (of course, it could just be some incorrect memory management).
```
corbin
run flag and compiler optimizationsvolatile is a widely misused variable modifier, but you acutally have one of the text book examples of when to use it. In particular, int run should be volatile. This is because the value can change in a way that the compiler is not particularly aware of. Imagine the while (run) loop from a compiler's perspective: nothing executed in that loop ever modifies run. This means that the compiler is technically allowed to assume the value of run does not change there and optimize that loop into if (run) { while (1) {} }. Marking run volatile tells the compiler that no, it really has to load it every time it looks at it, and it can't just assume from its own analysis that it never changes.Function names
The node-related functions could be named a bit clearer. For example, I would prefix all of them with
node_: node_new, node_append, node_prepend, etc. I like to go one step farther and prefix things with a fairly unique token (maybe bf_node_ instead of just node_), but that's overkill unless you plan on actually distributing a library version of this or it growing into a large project.The names of some of your functions could be a lot clearer.
doop could be evaluate_node, and link could be link_brackets, for example.List interface
Your list interface seems a bit off to me. It should either be focused on values and have methods like
int append(node n, int value), or it should only do one thing like void append(node n, node* next); In its current state, it's a strange middle ground where you actually have the list in a non-valid state for a bit until the user sets a value on the node. This is a bit strange, and I think it's prone to user error. I'd prefer to see an interface that operates around values instead of nodes whenever possible.It would also be nice to have a
list type that encapsulates a sequence of nodes. I don't really like how erase and pop traverse from whatever node you give them. This is both confusing and inefficient. Having certain functions be able to operate on a list instead of a node would be very convenient.Global variables
Your interpretter currently depends on global state. This means that you can't have two independent brainfuck instances at the same time. This might be an acceptable limitation (for now...), but it still makes programs harder to reason about, and it can make certain types of bugs very hard to track down. It would be better if you had a data structure to encapsulate this global state, and then instead of operating on globals, your functions would just acccept this
struct brainfuck or struct brainfuck_interpretter or whatever.malloc resultYou should check the return value of
malloc. Currently you'll (probably) just get a segfault in newnode if allocation fails.Code consolidation
The case where a file name is provided and the case where stdin is used look very similar to me. You can probably pull that into a
consume_stream function that takes a FILE*, and it will make your main a lot more manageable and get rid of code duplication. Depending on how you want to handle the line-based approach compared to file based, this might be a problem though.NULL vs 0node *newnode() {
node *n = malloc(sizeof(node));
n->prev = n->next = 0;
n->jump = n->val = 0;
return n;
}NULL conveys semantic meaning whereas 0 doesn't. For this reason, 0 as a null pointer should be avoided in favor of NULL.Variable delcarations
You seem to be using C99 or newer, so you should be declaring your variables as closely as possible to use instead of at at the beginning of function scopes.
#define vs enumI would put all of the tokens in an enum:
enum brainfuck_token { LSEEK = '', ... }. An example where this could be nice is giving your list a brainfuck_token val; member instead of an int. enums also can't be re#defined which is nice, and debuggers can more easily find names for them instead of the literal you're likely to see for a #define.In other words, unless you need a
#define for some reason or other (they can be incredibly useful if you either have to use one because of limitations on other options or if you want to be able to change something at compile time without editing source code), it's typically better to go with either a constant or an enum.Memory leak
From a quick run through valgrind, it looks like your application has a memory leak. This is quite interesting since at a (quick) glance, it appears that the proper memory management code is all there. It makes me a bit suspicious that there might be a bug hiding somewhere (of course, it could just be some incorrect memory management).
```
corbin
Code Snippets
node *newnode() {
node *n = malloc(sizeof(node));
n->prev = n->next = 0;
n->jump = n->val = 0;
return n;
}corbin@lmxe ~/review $ valgrind --leak-check=full ./bf fib.bf
==2965== Memcheck, a memory error detector
==2965== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==2965== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==2965== Command: ./bf fib.bf
==2965==
fib.bf
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89==2965==
==2965== HEAP SUMMARY:
==2965== in use at exit: 64 bytes in 2 blocks
==2965== total heap usage: 556 allocs, 554 frees, 18,328 bytes allocated
==2965==
==2965== 32 bytes in 1 blocks are definitely lost in loss record 1 of 2
==2965== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2965== by 0x4007C6: newnode (list.c:6)
==2965== by 0x400820: append (list.c:14)
==2965== by 0x400A6A: main (main.c:27)
==2965==
==2965== 32 bytes in 1 blocks are definitely lost in loss record 2 of 2
==2965== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2965== by 0x4007C6: newnode (list.c:6)
==2965== by 0x400D3A: link (op.c:67)
==2965== by 0x400A92: main (main.c:32)
==2965==
==2965== LEAK SUMMARY:
==2965== definitely lost: 64 bytes in 2 blocks
==2965== indirectly lost: 0 bytes in 0 blocks
==2965== possibly lost: 0 bytes in 0 blocks
==2965== still reachable: 0 bytes in 0 blocks
==2965== suppressed: 0 bytes in 0 blocks
==2965==
==2965== For counts of detected and suppressed errors, rerun with: -v
==2965== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)Context
StackExchange Code Review Q#83458, answer score: 13
Revisions (0)
No revisions yet.