Improvements/suggestions for my CPU emulator

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

codereview simulation c stackoverflow beginner assembly

Problem

I am trying to emulate a basic CPU (Z80) as closely as possible. It currently does not read real assembly code, but that will be implemented. If you have any views on how that could be implemented, I'd really appreciate them. Instead of reading real assembly code, I made up a simpler version with the knowledge I had at the time, so it is somewhat compressed to save space.

I am hoping to fully emulate both the RAM and the stack and as you can see I have started on that.

```
#include
#include
#include

#define INC 0b0001
#define INCREMENT 0b00010001 // INC A
#define DEC 0b0010
#define DECREMENT 0b00010010 // DEC A
#define ADD 0b0011
#define ADD_FIXED_WITH_REGISTER 0b00010011 // ADD A, 255
#define ADD_REGISTER_WITH_REGISTER 0b00110011 // ADD A,B
#define SUB 0b0100
#define SUB_FIXED_WITH_REGISTER 0b00010100 // SUB A, 255
#define SUB_REGISTER_WITH_REGISTER 0b00110100 // SUB A,B
#define LD 0b0101
#define LOAD_FIXED_TO_REGISTER 0b00010101 // LD A, 255
#define LOAD_REGISTER_TO_REGISTER 0b00110101 // LD A, B
#define LOAD_FIXED_MEMORY_TO_REGISTER 0b10010101 // LD A, (255)
#define LOAD_REGISTER_TO_REGISTER_MEMORY 0b01110101 // LD (A), B
#define LOAD_FIXED_TO_REGISTER_MEMORY 0b01010101 // LD (A), 255
#define LOAD_REGISTER_TO_FIXED_MEMORY 0b01100101 // LD (255), A
#define LOAD_FIXED_TO_FIXED_MEMORY 0b01000101 // LD (255), 255
#define LOAD_REGISTER_MEMORY_TO_REGISTER 0b10110101 // LD A, (B)
#define AND 0b0110
#define AND_A_WITH_REGISTER 0b00010110 // AND B
#define OR 0b0111
#define OR_A_WITH_REGISTER 0b00110111 // OR B
#define XOR 0b1000
#define XOR_A_WITH_REGISTER 0b00111000 // XOR B
#define JUMP 0b1001
#define JUMP_TO_ADDRESS 0b00001001
#define JUMP_USING_MEMORY 0b00111001
#define
```

Solution

> It currently does not read real assembly code, but that will be implemented.

Real assembly code might look like this:

ld a,%11010011                 ; A=%11010011, carry=?
sla a                          ; A=%10100110, carry=1
rla                            ; A=%01001101, carry=1
rlca                           ; A=%10011010, carry=0
sra a                          ; A=%11001101, carry=0
rra                            ; A=%01100110, carry=1

To run code like this, you'd work in two parts:

1. Write an "assembler", which parses the assembly and emits machine code

2. Write a CPU emulator as you have done (or manufacture Z80 hardware), which runs the machine code

You have the second part (the CPU emulator) already, so the new software would be an assembler.

An assembler is like a compiler, but for assembly language. In theory you could instead write software that parses and runs assembly language in one step, like an "interpreter" rather than a "compiler", but parsing assembly language at run time should be much slower. Assembling to machine code is therefore the usual first step.

Pseudocode for an assembler might be something like:

for each line in the assembly source input file
    trim leading and trailing whitespace and comments
    extract the first string (which identifies the opcode)
    switch on the opcode string (for example, case "ld":)
        emit (to the machine code output file) the opcode
        parse the remainder of the line (for example, a,%11010011) and emit machine code to represent these operands
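
The pseudocode above could be sketched in C roughly as follows. This is a minimal illustration, not a full assembler: `assemble_line` and `parse_immediate` are hypothetical names, only four instructions are handled (`nop`, `inc a`, `dec a` and `ld a,n`), and the opcode bytes are taken from the real Z80 opcode table rather than from the made-up encoding in the question.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse an immediate operand written as %binary, $hex or decimal. */
static int parse_immediate(const char *s, uint8_t *out)
{
    int base = 10;
    if (*s == '%') { base = 2; s++; }
    else if (*s == '$') { base = 16; s++; }

    char *end;
    unsigned long v = strtoul(s, &end, base);
    if (end == s || *end != '\0' || v > 0xFF) return -1;
    *out = (uint8_t)v;
    return 0;
}

/* Assemble one source line into out[].
   Returns the number of bytes emitted (0 for a blank line), or -1 on error. */
int assemble_line(const char *line, uint8_t *out, size_t cap)
{
    char buf[128];
    size_t n = 0;

    /* Copy everything before the comment marker, lower-casing as we go. */
    for (; line[n] != '\0' && line[n] != ';' && n < sizeof buf - 1; n++)
        buf[n] = (char)tolower((unsigned char)line[n]);
    buf[n] = '\0';

    /* Extract the first string: the mnemonic. */
    char *p = buf;
    while (isspace((unsigned char)*p)) p++;
    char *mnemonic = p;
    while (*p != '\0' && !isspace((unsigned char)*p)) p++;
    if (*p != '\0') *p++ = '\0';

    /* Squeeze whitespace out of the operands, so "a, 5" becomes "a,5". */
    char *operands = p, *w = p;
    for (; *p != '\0'; p++)
        if (!isspace((unsigned char)*p)) *w++ = *p;
    *w = '\0';

    if (*mnemonic == '\0') return 0;           /* blank or comment-only line */
    if (cap < 2) return -1;                    /* room for the longest case */

    /* "Switch" on the mnemonic string. */
    if (strcmp(mnemonic, "nop") == 0 && *operands == '\0') {
        out[0] = 0x00; return 1;
    }
    if (strcmp(mnemonic, "inc") == 0 && strcmp(operands, "a") == 0) {
        out[0] = 0x3C; return 1;
    }
    if (strcmp(mnemonic, "dec") == 0 && strcmp(operands, "a") == 0) {
        out[0] = 0x3D; return 1;
    }
    if (strcmp(mnemonic, "ld") == 0 && strncmp(operands, "a,", 2) == 0) {
        uint8_t imm;
        if (parse_immediate(operands + 2, &imm) != 0) return -1;
        out[0] = 0x3E; out[1] = imm; return 2;  /* LD A,n */
    }
    return -1;                                  /* not supported in this sketch */
}
```

Calling `assemble_line` once per input line and appending the returned bytes to the output file gives the loop described in the pseudocode.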

A helpful feature of assembly instead of machine code is that assembly contains named labels and subroutine names, for example here:

  cp $80                         ; comparing the unsigned A to 128
  jr c,A_Is_Positive             ; if it is less, then jump to the label given
  neg                            ; multiplying A by -1
A_Is_Positive:                   ; after this label, A is between 0 and 128

A label will correspond to a location in memory (in the machine code).

You may need a "two-pass" assembler. For example, a statement might jump forward to a label which hasn't been defined yet. You need to assemble the following statements until you find the label and know where it is in memory, and then come back and fix the operand of your earlier jump instruction by writing the label's memory address into it.
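
As a rough sketch of forward-label handling, here is the classic two-pass variant: the first pass only computes addresses and records labels, and the second pass emits the bytes. It does not patch earlier output in place. To stay short it works on pre-parsed statements instead of text; the `Stmt` and `Symbol` types and `assemble_two_pass` are hypothetical, the program is assumed to start at address 0, and the only jump supported is the absolute `JP nn` (opcode 0xC3, address stored low byte first).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pre-parsed statement: a label definition, "jp label", or "nop". */
typedef enum { STMT_LABEL, STMT_JP, STMT_NOP } StmtKind;
typedef struct { StmtKind kind; const char *name; } Stmt;

typedef struct { const char *name; uint16_t address; } Symbol;

#define MAX_SYMBOLS 32

static size_t stmt_size(const Stmt *s)
{
    switch (s->kind) {
    case STMT_JP:  return 3;   /* opcode + 16-bit address */
    case STMT_NOP: return 1;
    default:       return 0;   /* a label occupies no bytes */
    }
}

/* Returns the number of bytes written to out[], or -1 on error
   (undefined label, too many labels, or output too small). */
int assemble_two_pass(const Stmt *prog, size_t count, uint8_t *out, size_t cap)
{
    Symbol symbols[MAX_SYMBOLS];
    size_t nsym = 0, address = 0;

    /* Pass 1: walk the program, tracking the address of each statement
       and remembering where every label lands. */
    for (size_t i = 0; i < count; i++) {
        if (prog[i].kind == STMT_LABEL) {
            if (nsym == MAX_SYMBOLS) return -1;
            symbols[nsym].name = prog[i].name;
            symbols[nsym].address = (uint16_t)address;
            nsym++;
        }
        address += stmt_size(&prog[i]);
    }
    if (address > cap || address > 0x10000) return -1;

    /* Pass 2: emit machine code; every label address is now known,
       including labels defined after the jump that uses them. */
    size_t pc = 0;
    for (size_t i = 0; i < count; i++) {
        if (prog[i].kind == STMT_NOP) {
            out[pc++] = 0x00;
        } else if (prog[i].kind == STMT_JP) {
            const Symbol *sym = NULL;
            for (size_t j = 0; j < nsym; j++)
                if (strcmp(symbols[j].name, prog[i].name) == 0)
                    sym = &symbols[j];
            if (sym == NULL) return -1;          /* undefined label */
            out[pc++] = 0xC3;                    /* JP nn */
            out[pc++] = (uint8_t)(sym->address & 0xFF);
            out[pc++] = (uint8_t)(sym->address >> 8);
        }
    }
    return (int)pc;
}
```

A relative jump such as `jr` would be resolved the same way, except that the operand would be the distance from the end of the jump instruction to the label rather than the label's absolute address.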

You load the assembler's output file (machine code) as the input file to your emulator (i.e. the "ROM" variable in your program).
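
Loading that file might look something like the sketch below. `load_rom` is a hypothetical helper, and it assumes the ROM is a plain byte array and that the file has already been opened in binary mode (for example with `fopen(path, "rb")`).

```c
#include <stdint.h>
#include <stdio.h>

/* Read a machine-code image from an open binary stream into rom[].
   Returns the number of bytes loaded, or -1 on a read error or if the
   image is larger than cap bytes. */
long load_rom(FILE *in, uint8_t *rom, size_t cap)
{
    size_t n = fread(rom, 1, cap, in);
    if (ferror(in)) return -1;

    /* If the buffer is full and there is still data, the image doesn't fit. */
    if (n == cap && fgetc(in) != EOF) return -1;

    return (long)n;
}
```

The return value tells the emulator how much of the ROM actually contains code, which is useful for stopping execution when PC runs past the end of the program.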

> I am hoping to fully emulate both the RAM and the stack and as you can see I have started on that.

Woohoo: 16 bytes of RAM! "Surely, that will be enough for anyone."

unsigned byte RAM[255]; // 16 bytes of RAM

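The comment and the declaration disagree, by the way: RAM[255] declares 255 bytes, not 16. If you want more than that, you can declare RAM as a pointer instead of an array and allocate it using malloc or calloc.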

RAM = malloc(1024*8);
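
A slightly fuller sketch of heap-allocated RAM is shown below. The names `RAM_SIZE`, `ram_create`, `ram_read` and `ram_write` are hypothetical. The size is chosen as 64 KiB because the Z80 uses 16-bit addresses, so every `uint16_t` address is a valid index and no bounds check is needed; `calloc` is used so the memory starts out zeroed.

```c
#include <stdint.h>
#include <stdlib.h>

/* 64 KiB: everything a 16-bit address can reach. */
#define RAM_SIZE 0x10000u

/* Allocate zero-filled RAM; returns NULL if the allocation fails.
   The caller releases it with free(). */
uint8_t *ram_create(void)
{
    return calloc(RAM_SIZE, 1);
}

/* Because addr is 16 bits wide, it can never index past the end. */
uint8_t ram_read(const uint8_t *ram, uint16_t addr)
{
    return ram[addr];
}

void ram_write(uint8_t *ram, uint16_t addr, uint8_t value)
{
    ram[addr] = value;
}
```

With a smaller allocation, such as the 8 KiB above, the read and write helpers would need to check or mask the address first.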

> I would love to get this more condensed and/or use less variables during the interpretation stage.

One idea is to define those variables as inline functions; for example, instead of ...

bool firstregister = readBit(&ROM[PC], 4);

... try ...

static inline bool firstregister(void) { return readBit(&ROM[PC], 4); }

In that case your code would look like:

case INC:
    if (firstregister()) {
        memto += firstoprand();
    }
    break;

The advantage of functions over variables is that if an opcode doesn't use one of them (for example, the INC opcode doesn't use secondoprand), then the case statement for that opcode never calls the secondoprand() function and therefore doesn't spend time calculating its value.
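
Here is a small self-contained sketch of that idea. Since the question's `readBit` isn't shown, `read_bit` below is a hypothetical stand-in; `current_opcode` and `first_register` are also illustrative names. The bit layout (opcode in the low four bits, a register flag in bit 4) is inferred from the `#define`s in the question, for example `INC 0b0001` and `INCREMENT 0b00010001`.

```c
#include <stdbool.h>
#include <stdint.h>

static uint8_t ROM[256];   /* program bytes; global, as in the question */
static uint8_t PC;         /* 8-bit here so it can never index past ROM */

/* Hypothetical stand-in for readBit: is the given bit of *byte set? */
static inline bool read_bit(const uint8_t *byte, unsigned bit)
{
    return ((*byte >> bit) & 1u) != 0;
}

/* Each field of the instruction byte is decoded only when asked for. */
static inline uint8_t current_opcode(void)
{
    return ROM[PC] & 0x0F;             /* low nibble selects the operation */
}

static inline bool first_register(void)
{
    return read_bit(&ROM[PC], 4);      /* bit 4: first operand is a register */
}
```

A `case` that only needs `first_register()` pays for exactly one bit test, and an optimizing compiler will usually inline these calls so there is no function-call overhead either.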

Where are your readBit and clearBit functions defined?

Another idea is to replace your switch(opcode) statement with a jump table.

// FN_PTR is a pointer to a function, which:
// - takes no input parameters (because ROM and PC are global variables)
// - doesn't return the number of bytes used; instead it alters PC before returning
typedef void (*FN_PTR)(void);

// see http://www.z80.info/z80oplist.txt
FN_PTR opcode_table[] = {
    Handle_NOP,
    Handle_LD_1,
    Handle_LD_2,
    Handle_INC_BC,
    ... etc ...
};

Define the body of each case statement as a separate function (declared before the table that refers to it):

void Handle_INC_BC(void)
{
    ... code to increment the BC and PC registers ...
}

Instead of switch (opcode) you can then do:

// get the function to process this opcode
FN_PTR function = opcode_table[opcode];
// invoke the function
(*function)();
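
Putting those pieces together, a complete miniature version might look like the sketch below. It is only an illustration under several assumptions: the handler names and the `run` function are made up, just four real Z80 opcodes are implemented (NOP 0x00, INC A 0x3C, LD A,n 0x3E, HALT 0x76), flags are ignored, and unimplemented table entries are left as NULL so they can be detected.

```c
#include <stddef.h>
#include <stdint.h>

static uint8_t  ROM[0x10000];  /* 64 KiB so a 16-bit PC is always in range */
static uint16_t PC;            /* program counter */
static uint8_t  A;             /* accumulator */
static int      halted;

typedef void (*FN_PTR)(void);

/* Each handler does its work and advances PC past its own instruction. */
static void handle_nop(void)    { PC += 1; }
static void handle_inc_a(void)  { A += 1; PC += 1; }
static void handle_ld_a_n(void) { A = ROM[(uint16_t)(PC + 1)]; PC += 2; }
static void handle_halt(void)   { halted = 1; }

/* Designated initializers (C99) fill only the listed slots;
   every other entry is NULL, meaning "not implemented". */
static const FN_PTR opcode_table[256] = {
    [0x00] = handle_nop,
    [0x3C] = handle_inc_a,
    [0x3E] = handle_ld_a_n,
    [0x76] = handle_halt,
};

/* Fetch, look up, dispatch. Returns 0 when HALT is reached,
   or -1 if an unimplemented opcode is fetched. */
int run(void)
{
    while (!halted) {
        FN_PTR handler = opcode_table[ROM[PC]];
        if (handler == NULL) return -1;
        handler();
    }
    return 0;
}
```

Sizing the table at 256 entries means any byte value is a valid index, so the lookup itself needs no range check.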

Context

StackExchange Code Review Q#39526, answer score: 4
