Assembly: Sum the individual bytes of a 32-bit register into a checksum

Submitted by: @import:stackexchange-codereview · Mar 10, 2026

Problem

Exercise description:

"Write a program that takes a double word (4 bytes) as an argument, and then adds all the 4 bytes. It returns the sum as output.
Note that all the bytes are considered to be of unsigned value.

Example: For the number 03ff0103 the program will calculate 0x03 + 0xff + 0x01 + 0x3 = 0x106, and the output will be 0x106

HINT: Use division to get to the values of the highest two bytes."
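For illustration only, the arithmetic behind the hint can be sketched in C. This is not part of the exercise text, and the function name `sum_bytes_div` is made up for this sketch. Dividing by 0x10000 yields the upper two bytes, the remainder is the lower two bytes, and each 16-bit half is then split the same way with 0x100.

```c
#include <stdint.h>

/* Hypothetical helper: sums the four bytes of x using only
   unsigned division and remainder, as the hint suggests. */
uint32_t sum_bytes_div(uint32_t x)
{
    uint32_t high = x / 0x10000u;   /* upper two bytes */
    uint32_t low  = x % 0x10000u;   /* lower two bytes */

    /* Split each 16-bit half into its two bytes and add them up. */
    return high / 0x100u + high % 0x100u
         + low  / 0x100u + low  % 0x100u;
}
```

For the example value, `sum_bytes_div(0x03ff0103)` splits into 0x03ff and 0x0103 and returns 0x106.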

Full description on GitHub: xorpd

The code I've written:

format PE console
entry start

include 'win32a.inc' 

; ===============================================
section '.text' code readable executable

start:
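    mov     eax,    0x01020304  ; Hardcoded input value

    xor     ecx,    ecx         ; Clear the running sum (ecx is not guaranteed to be 0)
    xor     ebp,    ebp         ; ebp = 0: first pass (low word)
process_eax:
    movzx   ebx,    al
    add     ecx,    ebx         ; Add the low byte of ax
    movzx   ebx,    ah
    add     ecx,    ebx         ; Add the high byte of ax
    cmp     ebp,    0x1
    je      print_result        ; Second pass done: all 4 bytes are summed
    xor     edx,    edx         ; div divides edx:eax, so clear edx first
    mov     ebx,    0x10000     ; Divide by 2^16 (0x10000, not 0xffff) to move the high word into ax
    div     ebx
    mov     ebp,    0x1         ; Mark the second pass
    jmp     process_eax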
print_result:
    mov     eax,    ecx
    call    print_eax   ; Provided by the teacher. Prints eax to the console.

exitProgram:    
    ; Exit the process:
    push    0
    call    [ExitProcess]

include 'training.inc'

I think it works. I've tried it with different values and the sums were correct.

Screenshot with the output of the code above (with 0x01020304 as the hardcoded value).

But it's surely not the most efficient way to solve the exercise.

Solution

Since you're still learning, I won't cheat you out of the opportunity to discover for yourself, but I will offer some words of advice on how you can improve your program.

Minimize register use

The current code uses eax, ebx, ecx, edx and ebp. One of the most important things for an assembly language programmer is to use registers efficiently and effectively. This particular task can easily be done with just two registers.

Prefer shift to division

As alluded to in a comment, shift instructions are typically much faster to execute than divide instructions. For that reason, in tasks like this, it's much more common to see a shift than a divide.
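To make the equivalence concrete, here is a minimal sketch in C rather than assembly (the function names are hypothetical). For unsigned values, dividing by 2^16 and shifting right by 16 bits produce the same result, so the shift can stand in for the divide.

```c
#include <stdint.h>

/* Upper 16 bits via unsigned division by 2^16. */
uint32_t high_word_div(uint32_t x)
{
    return x / 65536u;
}

/* Upper 16 bits via a logical right shift by 16 bits. */
uint32_t high_word_shift(uint32_t x)
{
    return x >> 16;
}
```

Both functions return 0x03ff for the input 0x03ff0103. Working out how the shift form maps to x86 instructions is left as part of the exercise.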

Avoid loops

Branching tends to be computationally disruptive for processors. While modern desktop machines tend to compensate for this via branch prediction and speculative execution, code often runs faster if loops and branches are avoided entirely. Straight-line code also has a more predictable running time, which can matter for scheduling under a real-time operating system (RTOS) and in some kinds of cryptographic code, where it provides some resistance to side-channel attacks.
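The contrast can be sketched in C (again, not assembly; both function names are made up for this illustration). The looped version exits as soon as the remaining value is zero, so its running time depends on the input. The unrolled version always performs the same fixed sequence of operations and needs only the input and one accumulator.

```c
#include <stdint.h>

/* Looped: the number of iterations depends on the value of x. */
uint32_t sum_bytes_loop(uint32_t x)
{
    uint32_t sum = 0;
    while (x != 0) {
        sum += x & 0xFFu;   /* add the lowest byte */
        x >>= 8;            /* move the next byte into place */
    }
    return sum;
}

/* Unrolled: no loop and no data-dependent branch in the source. */
uint32_t sum_bytes_unrolled(uint32_t x)
{
    uint32_t sum = x & 0xFFu;       /* byte 0 */
    sum += (x >> 8) & 0xFFu;        /* byte 1 */
    sum += (x >> 16) & 0xFFu;       /* byte 2 */
    sum += x >> 24;                 /* byte 3 */
    return sum;
}
```

Both return 0x106 for 0x03ff0103. Whether a compiler keeps the unrolled form branch-free is its own decision; in hand-written assembly that choice is yours, and translating it with two registers remains the exercise.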

Context

StackExchange Code Review Q#161008, answer score: 4