HiveBrain v1.2.0
Get Started
← Back to all entries
patterncModerate

ToyVM - a small and simple virtual machine in C + FizzBuzz demonstration

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
simplefizzbuzzdemonstrationsmallmachineandtoyvmvirtual

Problem

(See also the next iteration.)

I have this small virtual machine.

What is there:

  • Stack and, thus, recursion.



  • Conditional and unconditional jumps and, thus, choice and iteration.



What is not there:

  • Floating-point type support.



  • Bit manipulation instructions.



  • No input operations, all "input" must be hardcoded into the memory of the VM.



toyvm.h:

```
#ifndef TOYVM_H
#define TOYVM_H

#include
#include
#include
#include

/ Arithmetics /
const uint8_t ADD = 0x01;
const uint8_t NEG = 0x02;
const uint8_t MUL = 0x03;
const uint8_t DIV = 0x04;
const uint8_t MOD = 0x05;

/ Conditionals /
const uint8_t CMP = 0x10;
const uint8_t JA = 0x11; / Jump if above. /
const uint8_t JE = 0x12; / Jump if equal. /
const uint8_t JB = 0x13; / Jump if below. /
const uint8_t JMP = 0x14; / Unconditional jump for implementing loops. /

/ Subroutines /
const uint8_t CALL = 0x20;
const uint8_t RET = 0x21;

/ Moving data /
const uint8_t LOAD = 0x30;
const uint8_t STORE = 0x31;
const uint8_t CONST = 0x32;

/ Auxiliary /
const uint8_t HALT = 0x40;
const uint8_t INT = 0x41;
const uint8_t NOP = 0x42;

/ Stack operations /
const uint8_t PUSH = 0x50;
const uint8_t PUSH_ALL = 0x51;
const uint8_t POP = 0x52;
const uint8_t POP_ALL = 0x53;
const uint8_t LSP = 0X54; / Load Stack Pointer - used for accessing the /
/ function arguments. /
/ Register indices /
const uint8_t REG1 = 0x0;
const uint8_t REG2 = 0x1;
const uint8_t REG3 = 0x2;
const uint8_t REG4 = 0x3;

/ Status codes /
const uint8_t STATUS_HALT_BAD_INSTRUCTION = 0x1;
const uint8_t STATUS_STACK_UNDERFLOW = 0x2;
const uint8_t STATUS_STACK_OVERFLOW = 0x4;
const uint8_t STATUS_INVALID_REGISTER_INDEX = 0x8;
const uint8_t STATUS_BAD_ACCESS = 0x10;
const uint8_t STATUS_COMPARISON_BELOW = 0x20;
const uint8_t STATUS_COMPARISON_EQUAL = 0x40;
const uint8_t STATUS_COMPARISON_ABOVE = 0x80;

/*

Solution

I see a number of things that may help you improve your code.

Don't use const where it is forbidden

Normally, I'd prefer to use const as you've done in this code. However, there are circumstances in C where it's not allowed and several of those instances appear in your code. The first is with the registers array in VM_CPU and the second is all of the opcode cases within the switch. See this question for why that's a problem. There are two possible fixes. One would be to use an enum:

enum { N_REGISTERS = 4 };


and the other is to use #define

#define N_REGISTERS 4


Of the two, I think I'd prefer the enum flavor for this code. It means that at least you have an integral type, even though you can't really declare it to be size_t type (or uint8_t type in the case of the opcodes).

Separate interface from implementation

The interface goes into a header file and the implementation (that is, everything that actually emits bytes including all functions and data) should be in a separate .c file. The reason is that you might have multiple source files including the .h file but only one instance of the corresponding .c file. In other words, split your existing toyvm.h file into a .h file and a .c file.

Initialize completely

The INIT_VM function does not set the status, so the effect is that the PRINT_STATUS call is relying the values of uninitialized memory. Better would be to completely initialize the virtual CPU in the same way that a real CPU is initialized by a reset signal.

Write for the typical usage

In every case in which an EXECUTE_xxx() function is executed, its return value is negated before testing. Why not invert the sense of those functions so that they return true on error? It would save some code.

Consider an alternate data representation

At the moment, information about each instruction is manifest in several different places. There is the name of it, the value of the opcode, the instruction length and the code that implements the instruction. I'd suggest gathering all of that together into a struct:

struct instruction {
    uint8_t opcode;
    char *name;
    unsigned size;
    bool (*evaluate)(TOYVM *);
};

const struct instruction instructions[] = {
    { ADD, "ADD", 3, EXECUTE_ADD },
    { NEG, "NEG", 2, EXECUTE_NEG },
    // etc.
};


Then execution of each instruction would simply be a matter of table lookup and the calling the evaluate function. You might wish to also incorporate functions for assemble and disassemble as well to make it extremely easy to write assembler, diassembler and related tools.

Reconsider naming convention

Having all of the functions all caps like INSTRUCTION_FITS_IN_MEMORY is not typical. More often that's used for #define names and functions are in mixed case. Following a more conventional convention will help others read and understand your code.

Eliminate return 0 at the end of main

Since C99, the compiler automatically generates the code corresponding to return 0 at the end of main so there is no need to explicitly write it.

Code Snippets

enum { N_REGISTERS = 4 };
#define N_REGISTERS 4
struct instruction {
    uint8_t opcode;
    char *name;
    unsigned size;
    bool (*evaluate)(TOYVM *);
};

const struct instruction instructions[] = {
    { ADD, "ADD", 3, EXECUTE_ADD },
    { NEG, "NEG", 2, EXECUTE_NEG },
    // etc.
};

Context

StackExchange Code Review Q#122385, answer score: 11

Revisions (0)

No revisions yet.