HiveBrain v1.2.0
Get Started
← Back to all entries
patternrustMinor

Simple virtual machine in Rust

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
machinesimplevirtualrust

Problem

About

I'm trying to implement a simple virtual machine. Now it supports only three operations:

  • ADD(reg0, reg1, reg2) -> reg0 = reg1 + reg2



  • LOAD(reg, value) -> reg = value



  • EXIT -> finish process



It has 8 registers and no memory (for now).

Source code

main.rs:

```
use std::process;

mod opcodes;
mod utils;

const REGISTERS_COUNT: usize = 8;

#[derive(Copy, Clone)]
struct Program {
code: &'a Vec,
pos: usize
}

impl Program {
pub fn new(code: &'a Vec) -> Program {
Program { code: code, pos: 0 }
}

pub fn fetch(&mut self) -> u8 {
self.pos += 1;
if self.pos > self.code.len() {
process::exit(0);
}
return self.code[self.pos - 1];
}
}

fn eval(mut program: Program, mut registers: [u64; REGISTERS_COUNT]) {
loop {
let operator = program.fetch();
let operands = (
program.fetch(),

vec![
program.fetch(), program.fetch(),
program.fetch(), program.fetch(),
program.fetch(), program.fetch(),
program.fetch(), program.fetch()
]
);

match operator {
opcodes::EXIT => {
process::exit(0);
}
opcodes::LOAD => {
registers[operands.0 as usize] = utils::concat_bytes(&operands.1);
}
opcodes::ADD => {
registers[operands.0 as usize] = registers[utils::concat_bytes(&operands.1[..4]) as usize]
+ registers[utils::concat_bytes(&operands.1[4..]) as usize];
}
_ => {
println!("error: invalid instruction code");
process::exit(1);
}
}

println!("{:?}", registers);
}
}

fn main() {
let registers: [u64; REGISTERS_COUNT] = [0; REGISTERS_COUNT];
let code = vec![
opcodes::LOAD,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x64,
opco

Solution

-
Document your instruction format! Looks to be opcode,
destination, 8 bytes of input. You need to describe the endianness
of the data though. Comments are an OK first step, but data
structures with self-describing names are probably even better.

-
You could avoid entire classes of errors by splitting the
program and data memory. This is "Harvard" vs "von Neumann"
architecture.

-
Remove unneeded type specification, such as when defining
registers in main.

-
Can use Default::default to get a reasonable initial state,
such as a vector of zeroes. Prevents having to re-type the length
again.

-
Don't use explicit return statements at the end of functions.

-
What is the expected behavior if you concat a slice with more
than 8 bytes? I made this limit to the first 8 bytes.

-
Become intimately familiar with the methods on Iterator and try to leverage the functional tools it provides, like fold.

-
Avoid slice indexing whenever possible. It's slightly slower as
it performs bounds checks.

-
concat is a bit odd of a name. I feel that pack is more
typical for this operation.

-
Your opcode is crying out to be an enum. This also happens to be
a great place to hang behavior, such as converting it to/from an
integer.

-
It's a little strange that eval accepts the program and
registers by value. Maybe you want to create a Machine that owns
them, or accept them by mutable reference?

-
It's strange that Program contains a pointer to where the data
is. Typically, the program should be immutable and you'd have a
Program Counter (PC) that tracks where you are in the program.

-
Printing the registers each iteration is a form of logging. If
you need such a thing, inject it using a generic. In this case, I
just accepted a generic function type, but you could also create
your own trait if you wanted more granular places to log.

-
You should basically never use a &Vec; use &[T] instead.

-
You should basically never use process:exit inside of any
function but main; use proper error handling like Option or
Result. I usually start with expect on these types which will
panic the program, then thread the errors through once the
general structure is in place.

-
Use slice::get instead of checking the length manually.

-
Creating a fetch_many method allows better naming,
flexibility, and a more efficient implementation.

-
Create more types! For example, create a RegisterFile and
Arguments to offload some of the decision making and
implementation details. These types provide naming (which is
self-documentation), extra seams for types and assertions, as well
as places to put "helper" methods (e.g. pack_bytes). They might
show places where you might be misusing sizes (e.g. indexing the
register file beyond 8-bit makes no sense).

``
use std::ops::{Index, IndexMut};
use std::cmp;
use std::error::Error;

enum Opcode {
Exit = 0x00,
Load = 0x01,
Add = 0x02,
}

impl Opcode {
fn from_u8(byte: u8) -> Result> {
use Opcode::*;

let opc = match byte {
x if x == Exit as u8 => Exit,
x if x == Load as u8 => Load,
x if x == Add as u8 => Add,
_ => return Err(From::from("invalid instruction code")),
};

Ok(opc)
}
}

#[derive(Debug)]
struct RegisterFile {
registers: [u64; 8],
}

// TODO: Consider catching and reporting access to registers outside
// of the register file. The indexing implementation panics the
// program.

impl Index for RegisterFile {
type Output = u64;

fn index(&self, idx: u8) -> &u64 {
&self.registers[idx as usize]
}
}

impl IndexMut for RegisterFile {
fn index_mut(&mut self, idx: u8) -> &mut u64 {
&mut self.registers[idx as usize]
}
}

impl Default for RegisterFile {
fn default() -> RegisterFile {
RegisterFile { registers: Default::default() }
}
}

#[derive(Copy, Clone)]
struct Program {
code: &'a [u8],
pos: usize
}

impl Program {
pub fn new(code: &'a [u8]) -> Program {
Program { code: code, pos: 0 }
}

fn fetch(&mut self) -> Result> {
let v = try!(self.code.get(self.pos).ok_or("No more code to read"));
self.pos += 1;
Ok(*v)
}

fn fetch_many(&mut self, n: usize) -> Result, Box> {
if self.code.len() ,
}

impl Arguments {
// TODO: Did I pick
high and low` right?
pub fn high_u32_8(&self) -> u8 {
self.values[3]
}

pub fn low_u32_8(&self) -> u8 {
self.values[7]
}

pub fn u64(&self) -> u64 {
Arguments::pack_bytes(&self.values)
}

fn pack_bytes(bytes: &[u8]) -> u64 {
let len = cmp::min(bytes.len(), 8);
bytes[..len].iter().fold(0, |acc, &byte| acc (mut program: Program, mut registers: RegisterFile, each_iteration: F) -> Result>
where F: Fn(&RegisterFile),
{
loop {
let operator = try!(program.fetch());
let operator = try!(Opcode::from_u8(operator))

Code Snippets

use std::ops::{Index, IndexMut};
use std::cmp;
use std::error::Error;

enum Opcode {
    Exit = 0x00,
    Load = 0x01,
    Add  = 0x02,
}

impl Opcode {
    fn from_u8(byte: u8) -> Result<Opcode, Box<Error + Send + Sync>> {
        use Opcode::*;

        let opc = match byte {
            x if x == Exit as u8 => Exit,
            x if x == Load as u8 => Load,
            x if x == Add as u8 => Add,
            _ => return Err(From::from("invalid instruction code")),
        };

        Ok(opc)
    }
}

#[derive(Debug)]
struct RegisterFile {
    registers: [u64; 8],
}

// TODO: Consider catching and reporting access to registers outside
// of the register file. The indexing implementation panics the
// program.

impl Index<u8> for RegisterFile {
    type Output = u64;

    fn index(&self, idx: u8) -> &u64 {
        &self.registers[idx as usize]
    }
}

impl IndexMut<u8> for RegisterFile {
    fn index_mut(&mut self, idx: u8) -> &mut u64 {
        &mut self.registers[idx as usize]
    }
}

impl Default for RegisterFile {
    fn default() -> RegisterFile {
        RegisterFile { registers: Default::default() }
    }
}

#[derive(Copy, Clone)]
struct Program<'a> {
    code: &'a [u8],
    pos: usize
}

impl<'a> Program<'a> {
    pub fn new(code: &'a [u8]) -> Program<'a> {
        Program { code: code, pos: 0 }
    }

    fn fetch(&mut self) -> Result<u8, Box<Error + Send + Sync>> {
        let v = try!(self.code.get(self.pos).ok_or("No more code to read"));
        self.pos += 1;
        Ok(*v)
    }

    fn fetch_many(&mut self, n: usize) -> Result<Vec<u8>, Box<Error + Send + Sync>> {
        if self.code.len() < self.pos + n {
            return Err(From::from("No more code to read"));
        }
        let v = &self.code[self.pos..(self.pos + n)];
        self.pos += n;
        Ok(v.to_owned())
    }
}

struct Arguments {
    values: Vec<u8>,
}

impl Arguments {
    // TODO: Did I pick `high` and `low` right?
    pub fn high_u32_8(&self) -> u8 {
        self.values[3]
    }

    pub fn low_u32_8(&self) -> u8 {
        self.values[7]
    }

    pub fn u64(&self) -> u64 {
        Arguments::pack_bytes(&self.values)
    }

    fn pack_bytes(bytes: &[u8]) -> u64 {
        let len = cmp::min(bytes.len(), 8);
        bytes[..len].iter().fold(0, |acc, &byte| acc << 8 | (byte as u64))
    }
}

fn eval<F>(mut program: Program, mut registers: RegisterFile, each_iteration: F) -> Result<(), Box<Error + Send + Sync>>
    where F: Fn(&RegisterFile),
{
    loop {
        let operator = try!(program.fetch());
        let operator = try!(Opcode::from_u8(operator));
        let destination = try!(program.fetch());
        let arguments = try!(program.fetch_many(8));
        let arguments = Arguments { values: arguments };

        match operator {
            Opcode::Exit => return Ok(()),
            Opcode::Load => {
                registers[destination] = arguments.u64();
            }
            Opcode::Add => {
                registers[destination] = regist

Context

StackExchange Code Review Q#128304, answer score: 5

Revisions (0)

No revisions yet.