HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

Forth to the past - qsort

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
pastqsorttheforth

Problem

To make a long story short, I must revive my Forth. It is rusty; last time I did anything serious in Forth was 30 years ago (man, I am old). As an exercise, I translated an STL-like implementation of qsort. It works (if you want to test it, install gforth). It performs well comparing to SDCC-compiled native C (I don't have Keil license). I know where my bottlenecks are.

I am mostly interested in how readable the code is; how much did I sin against Forth spirit; if there are modern Forth coding conventions, how does this code fare.

`-1 cells constant -cell
: cell- -cell + ;

: xchg ( a0 a1 -- ) 2dup @ >r @ swap ! r> swap ! ;

: unguarded_linear_insert ( last val -- )
>r
begin cell- dup @ dup r@ > while over cell+ ! repeat
drop r> swap cell+ !
;

: unguarded_insertion_sort ( first last -- )
>r
begin dup r@ <> while dup dup @ unguarded_linear_insert cell+ repeat
drop rdrop
;

: linear_insert ( first last val -- )
>r over @ r@ swap while cell- dup @ over cell+ ! repeat
drop r> swap !
else
r> unguarded_linear_insert drop
then
;

: insertion_sort ( first last -- )
2dup <>
if
>r
dup begin cell+ dup r@ <> while 2dup dup @ linear_insert repeat
rdrop
then
2drop
;

: unguarded_partition ( l f p -- cut )
>r swap cell-
begin dup @ r@ > while cell- repeat swap
begin dup @ r@ while cell- repeat swap
begin dup @ r@ r 2dup > if swap then
r> 2dup > if swap then drop
2dup r over cell- @ over @ r>
median_of_3
;

: quicksort_loop ( l f t -- )
begin dup >r -rot 2dup - dup r@ > while
pivot
>r 2dup r> unguarded_partition
>r 2dup + r@ swap r> 2 * recurse swap else
dup -rot swap r> recurse then
rot
repeat
rdrop 2drop 2drop
;

: quicksort ( l f )
2dup 2 cells quicksort_loop
dup 2 cells + dup -rot insertion_sort
swap unguarded_insertion_sort
;

\ Testing and benchmarking

include random.fs
0 ran

Solution

I see some things that could be improved with this code.

Fix exetime

The exetime word is defined like this:

: exetime  utime 2>r quicksort execute  utime 2r> d- ." dtime " . cr ;


There are a number of problems with that. First, the formatting is not very good. I'd prefer to see a stack comment, for example. Second, this should probably take a parameter so that any word can be timed (this appears to be the intent from run). Third, that last . should be d. to display the entire delta time rather than just half of it.

: exetime  ( fp -- ) 
   utime 2>r  execute  utime 2r> d- ." dtime " d. cr 
;


Use smaller words

Again using exetime as an example, there are actually two things done by the word as defined. The first is that it calculates a delta time, and the second is that it prints that time. I'd make those separate words.

Follow Forth convention

It's common to use prefixes to simplify code. For example, everywhere ssrt is used, it's followed by @. For that reason, I'd define and use word like these:

: @ssrt ssrt @ ;
: @size size @ ;
: !size size ! ;


Define and use common idioms

A number of cases in the code there is a sequence like this: size @ . but there is a common idiom for that and it's built in to many Forth implementations. If it's not, it's easy to define:

: ?  ( a -- )   @ . ;


It would be used like this: size ?. Alternatively, one could use the prefix idiom which is associated with a particular value:

: ?size  ( -- )   size @ . ;


Eliminate unused variables

The logs variable is defined but never used. This needlessly clutters the code.

Simplify control structure

Instead of using begin .. while .. repeat, it's often the case that one can use begin .. until instead and simplify the code. Using the refactored smaller words as suggested above, here's what run looks like now:

: run ( fp )
    >r
    begin 
        copy-array
        array-limits r@ exetime
        ." size " ?size ." dtime " d. cr
    size2x total-size > until
    rdrop
;


These are the refactored words:

: exetime  ( fp -- ) utime 2>r execute utime 2r> d- ;
: copy-array ( -- ) data @ ssrt @ size @ cells move ;
: array-limits ( -- alo ahi ) ssrt @ dup size @ cells + swap ;
: @size ( -- n ) size @ ;
: ?size ( -- ) @size . ;
: size2x ( -- n ) size @ 2 * dup size ! ;


Make sure comments don't lie

The shuffle word starts like this:

: shuffle ( n a -- )


However, that's not correct. It should instead be:

: shuffle ( n1 n2 a -- )


Use refactoring to improve speed

The code currently contains this:

: median_of_3 ( n0 n1 n2 -- n )
    >r 2dup > if swap then
    r> 2dup > if swap then drop
       2dup < if swap then drop
;


However, in addition to being somewhat opaque, it's not as fast as it could be. Refactoring into smaller chunks improves both readability and speed:

\ arrange top two stack values to assure n0  if swap then ;
\ arrange top three stack values to assure n0 r lohi r> lohi ;
\ extract median value from top 3 items on stack
: median_of_3 ( n0 n1 n2 -- n1 ) 3sort drop nip ;


Be aware of non-standard extensions

The code uses the non-standard extensions:

rdrop -rot utime


It is not hard to write replacements for the first two if needed:

: rdrop r> drop ;
: -rot rot rot ;


Since utime is only used in the test code, perhaps it's not as critical. There is no standard replacement.

Code Snippets

: exetime  utime 2>r quicksort execute  utime 2r> d- ." dtime " . cr ;
: exetime  ( fp -- ) 
   utime 2>r  execute  utime 2r> d- ." dtime " d. cr 
;
: @ssrt ssrt @ ;
: @size size @ ;
: !size size ! ;
: ?  ( a -- )   @ . ;
: ?size  ( -- )   size @ . ;

Context

StackExchange Code Review Q#106218, answer score: 2

Revisions (0)

No revisions yet.