snippetMinor
Forth to the past - qsort
Viewed 0 times
pastqsorttheforth
Problem
To make a long story short, I must revive my Forth. It is rusty; last time I did anything serious in Forth was 30 years ago (man, I am old). As an exercise, I translated an STL-like implementation of
I am mostly interested in how readable the code is; how much did I sin against Forth spirit; if there are modern Forth coding conventions, how does this code fare.
`-1 cells constant -cell
: cell- -cell + ;
: xchg ( a0 a1 -- ) 2dup @ >r @ swap ! r> swap ! ;
: unguarded_linear_insert ( last val -- )
>r
begin cell- dup @ dup r@ > while over cell+ ! repeat
drop r> swap cell+ !
;
: unguarded_insertion_sort ( first last -- )
>r
begin dup r@ <> while dup dup @ unguarded_linear_insert cell+ repeat
drop rdrop
;
: linear_insert ( first last val -- )
>r over @ r@ swap while cell- dup @ over cell+ ! repeat
drop r> swap !
else
r> unguarded_linear_insert drop
then
;
: insertion_sort ( first last -- )
2dup <>
if
>r
dup begin cell+ dup r@ <> while 2dup dup @ linear_insert repeat
rdrop
then
2drop
;
: unguarded_partition ( l f p -- cut )
>r swap cell-
begin dup @ r@ > while cell- repeat swap
begin dup @ r@ while cell- repeat swap
begin dup @ r@ r 2dup > if swap then
r> 2dup > if swap then drop
2dup r over cell- @ over @ r>
median_of_3
;
: quicksort_loop ( l f t -- )
begin dup >r -rot 2dup - dup r@ > while
pivot
>r 2dup r> unguarded_partition
>r 2dup + r@ swap r> 2 * recurse swap else
dup -rot swap r> recurse then
rot
repeat
rdrop 2drop 2drop
;
: quicksort ( l f )
2dup 2 cells quicksort_loop
dup 2 cells + dup -rot insertion_sort
swap unguarded_insertion_sort
;
\ Testing and benchmarking
include random.fs
0 ran
qsort. It works (if you want to test it, install gforth). It performs well comparing to SDCC-compiled native C (I don't have Keil license). I know where my bottlenecks are.I am mostly interested in how readable the code is; how much did I sin against Forth spirit; if there are modern Forth coding conventions, how does this code fare.
`-1 cells constant -cell
: cell- -cell + ;
: xchg ( a0 a1 -- ) 2dup @ >r @ swap ! r> swap ! ;
: unguarded_linear_insert ( last val -- )
>r
begin cell- dup @ dup r@ > while over cell+ ! repeat
drop r> swap cell+ !
;
: unguarded_insertion_sort ( first last -- )
>r
begin dup r@ <> while dup dup @ unguarded_linear_insert cell+ repeat
drop rdrop
;
: linear_insert ( first last val -- )
>r over @ r@ swap while cell- dup @ over cell+ ! repeat
drop r> swap !
else
r> unguarded_linear_insert drop
then
;
: insertion_sort ( first last -- )
2dup <>
if
>r
dup begin cell+ dup r@ <> while 2dup dup @ linear_insert repeat
rdrop
then
2drop
;
: unguarded_partition ( l f p -- cut )
>r swap cell-
begin dup @ r@ > while cell- repeat swap
begin dup @ r@ while cell- repeat swap
begin dup @ r@ r 2dup > if swap then
r> 2dup > if swap then drop
2dup r over cell- @ over @ r>
median_of_3
;
: quicksort_loop ( l f t -- )
begin dup >r -rot 2dup - dup r@ > while
pivot
>r 2dup r> unguarded_partition
>r 2dup + r@ swap r> 2 * recurse swap else
dup -rot swap r> recurse then
rot
repeat
rdrop 2drop 2drop
;
: quicksort ( l f )
2dup 2 cells quicksort_loop
dup 2 cells + dup -rot insertion_sort
swap unguarded_insertion_sort
;
\ Testing and benchmarking
include random.fs
0 ran
Solution
I see some things that could be improved with this code.
Fix
The
There are a number of problems with that. First, the formatting is not very good. I'd prefer to see a stack comment, for example. Second, this should probably take a parameter so that any word can be timed (this appears to be the intent from
Use smaller words
Again using
Follow Forth convention
It's common to use prefixes to simplify code. For example, everywhere
Define and use common idioms
A number of cases in the code there is a sequence like this:
It would be used like this:
Eliminate unused variables
The
Simplify control structure
Instead of using
These are the refactored words:
Make sure comments don't lie
The
However, that's not correct. It should instead be:
Use refactoring to improve speed
The code currently contains this:
However, in addition to being somewhat opaque, it's not as fast as it could be. Refactoring into smaller chunks improves both readability and speed:
Be aware of non-standard extensions
The code uses the non-standard extensions:
It is not hard to write replacements for the first two if needed:
Since
Fix
exetimeThe
exetime word is defined like this:: exetime utime 2>r quicksort execute utime 2r> d- ." dtime " . cr ;There are a number of problems with that. First, the formatting is not very good. I'd prefer to see a stack comment, for example. Second, this should probably take a parameter so that any word can be timed (this appears to be the intent from
run). Third, that last . should be d. to display the entire delta time rather than just half of it.: exetime ( fp -- )
utime 2>r execute utime 2r> d- ." dtime " d. cr
;Use smaller words
Again using
exetime as an example, there are actually two things done by the word as defined. The first is that it calculates a delta time, and the second is that it prints that time. I'd make those separate words.Follow Forth convention
It's common to use prefixes to simplify code. For example, everywhere
ssrt is used, it's followed by @. For that reason, I'd define and use word like these:: @ssrt ssrt @ ;
: @size size @ ;
: !size size ! ;Define and use common idioms
A number of cases in the code there is a sequence like this:
size @ . but there is a common idiom for that and it's built in to many Forth implementations. If it's not, it's easy to define:: ? ( a -- ) @ . ;It would be used like this:
size ?. Alternatively, one could use the prefix idiom which is associated with a particular value:: ?size ( -- ) size @ . ;Eliminate unused variables
The
logs variable is defined but never used. This needlessly clutters the code.Simplify control structure
Instead of using
begin .. while .. repeat, it's often the case that one can use begin .. until instead and simplify the code. Using the refactored smaller words as suggested above, here's what run looks like now: : run ( fp )
>r
begin
copy-array
array-limits r@ exetime
." size " ?size ." dtime " d. cr
size2x total-size > until
rdrop
;These are the refactored words:
: exetime ( fp -- ) utime 2>r execute utime 2r> d- ;
: copy-array ( -- ) data @ ssrt @ size @ cells move ;
: array-limits ( -- alo ahi ) ssrt @ dup size @ cells + swap ;
: @size ( -- n ) size @ ;
: ?size ( -- ) @size . ;
: size2x ( -- n ) size @ 2 * dup size ! ;Make sure comments don't lie
The
shuffle word starts like this:: shuffle ( n a -- )However, that's not correct. It should instead be:
: shuffle ( n1 n2 a -- )Use refactoring to improve speed
The code currently contains this:
: median_of_3 ( n0 n1 n2 -- n )
>r 2dup > if swap then
r> 2dup > if swap then drop
2dup < if swap then drop
;However, in addition to being somewhat opaque, it's not as fast as it could be. Refactoring into smaller chunks improves both readability and speed:
\ arrange top two stack values to assure n0 if swap then ;
\ arrange top three stack values to assure n0 r lohi r> lohi ;
\ extract median value from top 3 items on stack
: median_of_3 ( n0 n1 n2 -- n1 ) 3sort drop nip ;Be aware of non-standard extensions
The code uses the non-standard extensions:
rdrop -rot utimeIt is not hard to write replacements for the first two if needed:
: rdrop r> drop ;
: -rot rot rot ;Since
utime is only used in the test code, perhaps it's not as critical. There is no standard replacement.Code Snippets
: exetime utime 2>r quicksort execute utime 2r> d- ." dtime " . cr ;: exetime ( fp -- )
utime 2>r execute utime 2r> d- ." dtime " d. cr
;: @ssrt ssrt @ ;
: @size size @ ;
: !size size ! ;: ? ( a -- ) @ . ;: ?size ( -- ) size @ . ;Context
StackExchange Code Review Q#106218, answer score: 2
Revisions (0)
No revisions yet.