patternMinor
x86 assembly method that mimics strcat
Viewed 0 times
strcatmethodmimicsx86assemblythat
Problem
I'm trying to learn x86 assembly and hopefully move on to projects of the scope that more than one person would be involved.
This is a (relatively) simple subroutine written in Intel x86 assembly using MASM that copies one string onto the end of another. I named it
I'm wondering if I can have some insight on the quality of:
This is a (relatively) simple subroutine written in Intel x86 assembly using MASM that copies one string onto the end of another. I named it
strglue to avoid conflict with the C library function strcat (since the driver for this method is a C program) and the MASM macro CatStr.I'm wondering if I can have some insight on the quality of:
- The code itself (optimality of the instructions and register usage)
- The documentation
- The indentation of the loops and labels
.386
.MODEL FLAT, C
.DATA
.CODE
; Glue together (concatenate) two strings
; PARAMS:
; [ESP+4] - Pointer to first character in null-terminated destination string
; [ESP+8] - Pointer to first character in null-terminated source string
; USES:
; EAX: Holds the destination string during execution
; ECX: Holds the source string during execution:
; EDX: Lower 8 bits (DL) holds characters being copied from source to destination
; OUTPUT:
; [ESP+4] - Pointer to first character in null-terminated concatenation of destination string + source string (excluding the null-terminator from the destination string)
strglue PROC
; Step 1: Get a pointer to the null-terminator character in the destination string
MOV EAX, DWORD PTR [ESP+4]
ENDOFDEST:
INC EAX
CMP BYTE PTR [EAX], 0
JNZ ENDOFDEST
; Step 2: Copy the source string into the pointer to the end of dest, overwriting the null-terminator
MOV ECX, DWORD PTR [ESP+8]
STRCATLOOP:
MOV DL, BYTE PTR [ECX]
LEA EAX, [EAX+1]
MOV BYTE PTR [EAX-1], DL
LEA ECX, [ECX+1]
; If ZF=0 the null terminator in the source string has been reached; stop looping
TEST DL, DL
JNZ STRCATLOOP
RET
strglue ENDP
ENDSolution
This strikes me as looking a little like code I'd expect to see from a compiler--code that works, but completely ignores both traditional x86 register usage, and the special instructions the x86 provides for the job.
Traditionally, you'd use
Note that this depends on the direction flag being 0, but every compiler of which I'm aware leaves it that way (and depend on your leaving it that way as well).
This is somewhat shorter. Under normal circumstances, we can expect either one to be memory bound, so speed probably isn't particularly relevant (but if we did care, it would depend on the exact processor in use--this would tend to work better on older processors, but would be about the same to possibly even a little slower on newer processors).
As far are readability goes, either one is fairly easy to read, but I think my code fits more closely with how somebody with experience writing code for an x86 would expect to see the registers used. As far as indentation goes, there doesn't seem to be much agreement about whether to indent assembly language. Your indentation certainly doesn't hurt anything, but it isn't like higher level languages where it's expected to the point that unindented code is clearly a problem. The way I've written the code above (instructions indented one stop, label(s) at the left margin) is almost certainly more common.
I should probably add that I'm a bit hesitant to post on this subject--while I think I wrote assembly language quite well at one time, it's been years since I wrote enough to notice. I do look forward to @Edward posting a review on this (though I'm not sure whether he writes much assembly code any more either).
Traditionally, you'd use
edi to point to your destination string, and esi to point to your source string. You'd use a repnz scasb to find the end of the first string, and lodsb/stosb to copy bytes from the source to the destination (there is actually a movsb to copy bytes as well, but it doesn't allow you to check for the end of the string as you copy).Note that this depends on the direction flag being 0, but every compiler of which I'm aware leaves it that way (and depend on your leaving it that way as well).
xor eax, eax ; set al to value we're looking for, 0 in this case.
lea ecx, [eax-1] ; set ecx to 0ffffffffh
mov edi, [esp+4] ; find end of dest string
repnz scasb
mov esi, [esp+8] ; we've found the end of the destination, now do the copy
dec edi
copy_loop:
lodsb
stosb
test al, al
jnz copy_loop
retThis is somewhat shorter. Under normal circumstances, we can expect either one to be memory bound, so speed probably isn't particularly relevant (but if we did care, it would depend on the exact processor in use--this would tend to work better on older processors, but would be about the same to possibly even a little slower on newer processors).
As far are readability goes, either one is fairly easy to read, but I think my code fits more closely with how somebody with experience writing code for an x86 would expect to see the registers used. As far as indentation goes, there doesn't seem to be much agreement about whether to indent assembly language. Your indentation certainly doesn't hurt anything, but it isn't like higher level languages where it's expected to the point that unindented code is clearly a problem. The way I've written the code above (instructions indented one stop, label(s) at the left margin) is almost certainly more common.
I should probably add that I'm a bit hesitant to post on this subject--while I think I wrote assembly language quite well at one time, it's been years since I wrote enough to notice. I do look forward to @Edward posting a review on this (though I'm not sure whether he writes much assembly code any more either).
Code Snippets
xor eax, eax ; set al to value we're looking for, 0 in this case.
lea ecx, [eax-1] ; set ecx to 0ffffffffh
mov edi, [esp+4] ; find end of dest string
repnz scasb
mov esi, [esp+8] ; we've found the end of the destination, now do the copy
dec edi
copy_loop:
lodsb
stosb
test al, al
jnz copy_loop
retContext
StackExchange Code Review Q#126495, answer score: 5
Revisions (0)
No revisions yet.