patternMinor
Linux NASM assembly linked list implementation
Viewed 0 times
implementationassemblylinuxlistlinkednasm
Problem
Inspired by all of the lovely linked lists lately, I decided to implement one in assembly language. This code maintains two linked lists - one is a free store and the other is the active linked list. Rather than allocating memory whenever a new node is needed, the Node is pulled from the free list and all of the pointers updated. The data is then copied into the newly "allocated" Node.
This program includes a test driver which reads a series of NUL terminated strings (C strings) from memory, calculates their lengths and then puts a pointer to the string and the length into each Node until the end of list is found.
If there are not enough free nodes to hold all of the found strings, the program silently exits with an error code of 1. If there is enough and there are no other errors, the program prints each of the found strings on a separate line and exits with a status code of 0.
The output is this:
My
The
The program, which I've place in a file named
```
; 64-bit linked list implementation in Linux NASM
global _start ; global entry point export for ld
MAX_NODES equ 300 ; this can easily be expanded
section .text
_start:
; first initialize the free node list
mov rcx, MAX_NODES-1
mov rdi, [freenode]
mov rsi, linkedlist
mov rax, Node_size
nodeclear:
add rsi, rax ; point to next node
mov [rdi + Node.next], rsi
add rdi, rax ; advance pointer
loop nodeclear
; set the last link to NULL
mov qword [rdi + Node.next], 0
; start out pointing to our rootnode pointer
mov rdi, rootnod
This program includes a test driver which reads a series of NUL terminated strings (C strings) from memory, calculates their lengths and then puts a pointer to the string and the length into each Node until the end of list is found.
If there are not enough free nodes to hold all of the found strings, the program silently exits with an error code of 1. If there is enough and there are no other errors, the program prints each of the found strings on a separate line and exits with a status code of 0.
The output is this:
Alfred
Barbara
Carlos
Delores
Edward
FrancesMy
Makefile looks like this:%.o : %.asm
nasm -f elf64 -l foo.lst -g -F stabs -o $@ %%CODEBLOCK_1%%lt;
% : %.o
ld -g -o $@ %%CODEBLOCK_1%%lt;The
Makefile is invoked as make linky.The program, which I've place in a file named
linky.asm is below:```
; 64-bit linked list implementation in Linux NASM
global _start ; global entry point export for ld
MAX_NODES equ 300 ; this can easily be expanded
section .text
_start:
; first initialize the free node list
mov rcx, MAX_NODES-1
mov rdi, [freenode]
mov rsi, linkedlist
mov rax, Node_size
nodeclear:
add rsi, rax ; point to next node
mov [rdi + Node.next], rsi
add rdi, rax ; advance pointer
loop nodeclear
; set the last link to NULL
mov qword [rdi + Node.next], 0
; start out pointing to our rootnode pointer
mov rdi, rootnod
Solution
Just a few notes, mostly in micro-optimizations that even assembly language programmers probably ignore nowadays.
To compare to 0, I prefer to use a
Likewise, in NodePrint you use an
Again, a
When you're getting ready for the scasb in FetchString, you load a 0 into
At least if memory serves, you'll get marginally smaller code with something like:
This zeros the entirety of rax instead of just AL, but you don't seem to need/use the rest of rax afterward. Some processors also have a partial register stall that makes essentially any operation on AL relatively slow (though I don't remember if this applies to any 64-bit processors, so it may not matter in this case).
Although it's minutely more fragile, I'd consider whether it's worth replacing this:
...with code like:
Taking advantage of the fact that immediately after executing a
I think it's worth noting, however, that all of these are really very minor. Overall, the code strikes me as quite clear and nicely written.
; steal a node from the freenode list
; [rootnode] = [freenode]
mov rbx, freenode ; rbx -> freenode
cmp qword [rbx], 0 ; Q: nodes available?
je errorExit ; N: nope, so leave
mov rax, [rbx] ;To compare to 0, I prefer to use a
test instruction:mov rbx, freenode
mov rax, [rbx]
test rax, rax
jz errorExit
; ...Likewise, in NodePrint you use an
or to determine whether a value is 0:or rdi, rdi
jz retnowAgain, a
test is sufficient here. The difference is that or will normally affect not only the flags (which you want) but the register (which you don't). In this case, you haven't actually changed the value in the register, but some CPUs will assume you did, which can prevent some instruction level parallelism.When you're getting ready for the scasb in FetchString, you load a 0 into
AL:mov rcx, -1 ; max count in ecx
mov al,0 ; look for NUL byte
cld ; count forward
repne scasb ; look for NULAt least if memory serves, you'll get marginally smaller code with something like:
xor rax, raxThis zeros the entirety of rax instead of just AL, but you don't seem to need/use the rest of rax afterward. Some processors also have a partial register stall that makes essentially any operation on AL relatively slow (though I don't remember if this applies to any 64-bit processors, so it may not matter in this case).
Although it's minutely more fragile, I'd consider whether it's worth replacing this:
loop nodeclear
; set the last link to NULL
mov qword [rdi + Node.next], 0...with code like:
loop nodeclear
mov qword [rdi+Node.next], rcxTaking advantage of the fact that immediately after executing a
loop, we know rcx contains 0.I think it's worth noting, however, that all of these are really very minor. Overall, the code strikes me as quite clear and nicely written.
Code Snippets
; steal a node from the freenode list
; [rootnode] = [freenode]
mov rbx, freenode ; rbx -> freenode
cmp qword [rbx], 0 ; Q: nodes available?
je errorExit ; N: nope, so leave
mov rax, [rbx] ;mov rbx, freenode
mov rax, [rbx]
test rax, rax
jz errorExit
; ...or rdi, rdi
jz retnowmov rcx, -1 ; max count in ecx
mov al,0 ; look for NUL byte
cld ; count forward
repne scasb ; look for NULxor rax, raxContext
StackExchange Code Review Q#45999, answer score: 7
Revisions (0)
No revisions yet.