patternpythonMinor
Nand2Tetris Code Module
Viewed 0 times
codenand2tetrismodule
Problem
I'm working through the Assembler assignment in the Nand2Tetris course (chapter 6). The suggested implementation contains 4 modules: Main, Parser, Code, SyntaxTree. Today I wanted to get some feedback on my implementation of the Code module.
The book specifies a general API in order to allow students to use whichever programming language they want; I choose Python 2.7. The Code module takes a line of a simplified dialect of Assembly code called Hack and produces binary.
The Code module only deals with a single class of instruction in Hack language: The C-Instruction. This instruction takes the form
The resulting opcode takes the form:
Please review on style, implementation (especially those terribly ugly lookup tables), etc.
```
#!/usr/bin/python
# hack code translator
# Sample conversion:
# HACK Instruction: M = 1
# 111
# a = 0
# cccccc = 1111 11
# ddd = 001
# jjj = 000
# BINARY Opcode: 1110 1111 1100 1000
jumpTable = { 'null': '000',
'JGT' : '001',
'JEQ' : '010',
'JEQ' : '010',
'JGE' : '011',
'JLT' : '100',
'JNE' : '101',
'JLE' : '110',
'JMP' : '111' }
compTable = { '0' : '101010',
'1' : '111111',
'-1' : '111010',
'D' : '001100',
'A' : '110000',
'!D' : '001101',
'!A' : '11000
The book specifies a general API in order to allow students to use whichever programming language they want; I choose Python 2.7. The Code module takes a line of a simplified dialect of Assembly code called Hack and produces binary.
The Code module only deals with a single class of instruction in Hack language: The C-Instruction. This instruction takes the form
dest=comp;jump, where either dest= OR ;jump may be optionally omitted. comp is one of 28 valid expressions. dest is the destination to store the result of the computation: it can be any permutation of the three registers available (A, D, M). jump is a condition which, if true, moves control flow the address in instruction memory currently stored in register A. The resulting opcode takes the form:
111a cccc ccdd djjj, where a and cccc cc correspond to the comp, ddd corresponds to the dest, and jjj corresponds to the jump. For more details, see Chapters 4 and 6 of the textbook.Please review on style, implementation (especially those terribly ugly lookup tables), etc.
```
#!/usr/bin/python
# hack code translator
# Sample conversion:
# HACK Instruction: M = 1
# 111
# a = 0
# cccccc = 1111 11
# ddd = 001
# jjj = 000
# BINARY Opcode: 1110 1111 1100 1000
jumpTable = { 'null': '000',
'JGT' : '001',
'JEQ' : '010',
'JEQ' : '010',
'JGE' : '011',
'JLT' : '100',
'JNE' : '101',
'JLE' : '110',
'JMP' : '111' }
compTable = { '0' : '101010',
'1' : '111111',
'-1' : '111010',
'D' : '001100',
'A' : '110000',
'!D' : '001101',
'!A' : '11000
Solution
-
The values in your lookup tables are strings of binary digits, for example
With numeric values for the instruction fields, you'll need to use bitwise operations instead:
-
There's no error handling. It's likely that assembly programs will contain mistakes, for example:
The user would like to get an error message, preferably one that gives the file and line number and an explanation of what went wrong:
But as far I can see from the code you posted, what will happen here is that Python will raise a
(Possibly there's error handling in the rest of the code that you didn't show us, but I can only comment on what I see here.)
-
It's even worse for the dest field: this doesn't seem to produce any errors at all, so if I write:
then the assembler goes ahead as if I had written
and I never get told about my mistake.
-
The logic for lookup of the comp field is needlessly complex. It would be simplest just to make a 7-bit lookup table (including the a bit). Trying to modify the instruction wastes time, and it makes it harder to produce good error messages (outputting the modified instruction will confuse the user, so you have to hang on to the unmodified instruction for use in error messages).
If you don't want to write out the whole 7-bit lookup table, then write code to generate it:
and run this once, when the assembler starts up.
-
The comment for
but this is longer than writing it out in full:
-
There's no need to take a copy here:
Just write:
The values in your lookup tables are strings of binary digits, for example
'111010'. It would be closer to the way that real assemblers work if you represented these as numbers in binary, for example 0b111010. I presume you use string concatenation to assemble the instructions at present:instruction = '111' + comp + dest + jumpWith numeric values for the instruction fields, you'll need to use bitwise operations instead:
instruction = 0b111 << 13 | comp << 6 | dest << 3 | jump-
There's no error handling. It's likely that assembly programs will contain mistakes, for example:
M=-1;JGR // oops: typo for JGT
The user would like to get an error message, preferably one that gives the file and line number and an explanation of what went wrong:
prog.asm(17): bad jump field "JGR"
But as far I can see from the code you posted, what will happen here is that Python will raise a
KeyError in jump, which is much less useful.(Possibly there's error handling in the rest of the code that you didn't show us, but I can only comment on what I see here.)
-
It's even worse for the dest field: this doesn't seem to produce any errors at all, so if I write:
BAD=1;JEQ // Oops: typo for MAD
then the assembler goes ahead as if I had written
AD=1;JEQ
and I never get told about my mistake.
-
The logic for lookup of the comp field is needlessly complex. It would be simplest just to make a 7-bit lookup table (including the a bit). Trying to modify the instruction wastes time, and it makes it harder to produce good error messages (outputting the modified instruction will confuse the user, so you have to hang on to the unmodified instruction for use in error messages).
If you don't want to write out the whole 7-bit lookup table, then write code to generate it:
fullCompTable = compTable.copy()
for k, v in compTable.items():
if 'A' in k:
fullCompTable[k.replace('A', 'M')] = 1 << 6 | vand run this once, when the assembler starts up.
-
The comment for
dest says "No point in a lookup table" but that's not true. A lookup table will be faster than the computation in dest. And it only needs 15 entries. Again, if you don't want to write it out by hand, you could compute it:from itertools import permutations
destTable = {
''.join('MDA'[i] for i in p): sum(1<<i for i in p)
for n in range(1, 4)
for p in permutations(range(3), n)
}but this is longer than writing it out in full:
destTable = {
'A': 4, 'AD': 6, 'ADM': 7, 'AM': 5, 'AMD': 7, 'D': 2, 'DA': 6, 'DAM': 7,
'DM': 3, 'DMA': 7, 'M': 1, 'MA': 5, 'MAD': 7, 'MD': 3, 'MDA': 7
}-
There's no need to take a copy here:
nStr = srcStr[:] # non-destructive copyJust write:
nStr = srcStrCode Snippets
instruction = '111' + comp + dest + jumpinstruction = 0b111 << 13 | comp << 6 | dest << 3 | jumpfullCompTable = compTable.copy()
for k, v in compTable.items():
if 'A' in k:
fullCompTable[k.replace('A', 'M')] = 1 << 6 | vfrom itertools import permutations
destTable = {
''.join('MDA'[i] for i in p): sum(1<<i for i in p)
for n in range(1, 4)
for p in permutations(range(3), n)
}destTable = {
'A': 4, 'AD': 6, 'ADM': 7, 'AM': 5, 'AMD': 7, 'D': 2, 'DA': 6, 'DAM': 7,
'DM': 3, 'DMA': 7, 'M': 1, 'MA': 5, 'MAD': 7, 'MD': 3, 'MDA': 7
}Context
StackExchange Code Review Q#92140, answer score: 4
Revisions (0)
No revisions yet.