HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

Merge Sort: exercise

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
sortexercisemerge

Problem

I've decided to learn Haskell by following the current Coursera algorithms course using Haskell to implement the algorithms discussed.

First up is merge sort, which I've implemented as shown here. I'm brand new to Haskell so looking for any suggestions for improvements to the code - it seems quite verbose - and pointers on anything I may be doing wrong or in a weird way because I don't know what I'm doing.

module Main where

import System.Environment

main :: IO ()
main = do
  args  [a] -> [a]
sort xs = reduce True $ map (arr) xs

arr :: a -> [a]
arr x = [x]

reduce :: (Ord a) => Bool -> [[a]] -> [a]
reduce _ []            = []
reduce forwards (x:[]) = if forwards then x else reverse x
reduce forwards xs     = reduce (not forwards) (combine forwards [] xs)

combine :: (Ord a)   => Bool -> [[a]] -> [[a]] -> [[a]]
combine forwards acc []       = acc
combine forwards acc (x:[])   = (reverse x):acc
combine forwards acc (x:y:zs) = let merged = merge forwards x y
                                    acc'   = merged:acc
                                in  combine forwards acc' zs

merge :: (Ord a) => Bool -> [a] -> [a] -> [a]
merge forwards xs ys = mergeIter forwards [] xs ys

mergeIter :: (Ord a) => Bool -> [a] -> [a] -> [a] -> [a]
mergeIter _        acc [] []     = acc
mergeIter forwards acc (x:xs) [] = mergeIter forwards (x:acc) xs []
mergeIter forwards acc [] (y:ys) = mergeIter forwards (y:acc) [] ys
mergeIter forwards acc (x:xs) (y:ys)
    | x  y && not forwards   = mergeIter forwards (x:acc) xs (y:ys)
    | otherwise               = mergeIter forwards (y:acc) (x:xs) ys


One specific question I have is around recursive function with accumulator arguments - you'll see in the listing that I've had to create a mergeIter function just to allow a default for the initial argument for the accumulator.

As far as I can see Haskell doesn't support default values for arguments - should I be doing something else instead? A fold maybe?

Solution

OK, so let's see. arr x = [x] is usually called wrap, or just the operator section (:[]) is used. But map can also be written with list comprehension, giving us

sort :: (Ord a) => [a] -> [a]
sort xs = reduce True [[x] | x <- xs]


I find it much more visually apparent that way. We usually use short variable names in Haskell, with the same goal. When I see a variable fw and then not fw I get it that it is a toggle, a flag of some sorts, I don't need to see the long name like forwardorbackwardsflag explaining it to me in words while at the same time obscuring this visually. So we get

where
    reduce _ []    = []
    reduce fw [xs] = if fw then xs else reverse xs
    reduce fw s    = reduce (not fw) (combine fw [] s)


we make it an internal function to sort, because this is what it is. It has no otherwise uses on its own. It's too tailored for the specific use here. -- (x:[]) is written also [x], which is a lot simpler. -- We don't have to annotate types of everything. Here reduce is defined right under its use site, so we see that the initial value of fw is True. For me, it is self-documenting enough, but YMMV. Next, combine. There's no need to name all those interim entities. The name of the function itself isn't right, it's too generic. What it does, specifically, is process the elements of its input list in pairs:

rpairs fw acc []       = acc
    rpairs fw acc [x]      = reverse x : acc
    rpairs fw acc (x:y:zs) = rpairs fw (merge fw x y : acc) zs


It also builds its result in reversed order. We can make it an internal function to reduce, so that its argument fw could be referred to in rpairs body. At the same time there is no need in the separate accumulator really, we can just build the result naturally, in lazy fashion:

where
    reduce _ []    = []
    reduce fw [xs] = if fw then xs else reverse xs
    reduce fw s    = reduce (not fw) (reverse $ pairs s)
      where
        pairs []       = []
        pairs [x]      = [reverse x]
        pairs (x:y:zs) = merge fw x y : pairs zs


For merge to be able to refer to fw it too should be put here as an internal function to reduce. It is usual to split such functions into two like you did, the interface function ("the wrapper") and the internal "worker" function, except that normally the worker is made internal to the wrapper. But here the wrapper itself is too specialized, it reverses its result so is unlikely to be useable in general, so no need for the split. Thus we arrive at:

sort    :: (Ord a) => [a] -> [a]
sort xs = reduce True [[x] | x  y
                          = rmerge (x:acc) xs (y:ys)
          | otherwise     = rmerge (y:acc) (x:xs) ys


There's no shame in calling rmerge with the initial value as we do here, it's just an internal function here, called from another internal function.

But actually the reversals that rmerge performs are unnecessary, because of Haskell's lazy evaluation. merge now becomes a general function, usable in general. pairs too. reduce performs the same processing step on its input iteratively until a condition is met - there's a higher order function until that already captures this pattern. This leads to some more simplification of the code:

sort    :: (Ord a) => [a] -> [a]
sort [] = []
sort xs = head $ until (null . tail) (pairwise merge) [[x] | x <- xs]
                                  -- (reverse . pairwise merge) 
pairwise f (x:y:t) = f x y : pairwise f t
pairwise _ t       = t

merge xs []        = xs
merge [] ys        = ys
merge (x:xs) (y:ys)
       | x <= y    = x : merge  xs (y:ys)
       | otherwise = y : merge (x:xs) ys


I think we would need to toggle between <= and < in merge to ensure the sort's stability, if we were to proceed by (reverse . pairwise merge) steps. But, perhaps we can just throw that reverse away.

Code Snippets

sort :: (Ord a) => [a] -> [a]
sort xs = reduce True [[x] | x <- xs]
where
    reduce _ []    = []
    reduce fw [xs] = if fw then xs else reverse xs
    reduce fw s    = reduce (not fw) (combine fw [] s)
rpairs fw acc []       = acc
    rpairs fw acc [x]      = reverse x : acc
    rpairs fw acc (x:y:zs) = rpairs fw (merge fw x y : acc) zs
where
    reduce _ []    = []
    reduce fw [xs] = if fw then xs else reverse xs
    reduce fw s    = reduce (not fw) (reverse $ pairs s)
      where
        pairs []       = []
        pairs [x]      = [reverse x]
        pairs (x:y:zs) = merge fw x y : pairs zs
sort    :: (Ord a) => [a] -> [a]
sort xs = reduce True [[x] | x <- xs]
  where
    reduce _  []   = []
    reduce fw [xs] = if fw then xs else reverse xs
    reduce fw s    = reduce (not fw) . reverse . pairs $ s
      where
        pairs (x:y:t)     = rmerge [] x y : pairs t
        pairs [x]         = [reverse x]
        pairs []          = []

        rmerge acc [] []  = acc
        rmerge acc xs []  = reverse xs ++ acc
        rmerge acc [] ys  = reverse ys ++ acc
        rmerge acc (x:xs) (y:ys)
          | fw && x <= y || not fw && x > y
                          = rmerge (x:acc) xs (y:ys)
          | otherwise     = rmerge (y:acc) (x:xs) ys

Context

StackExchange Code Review Q#28045, answer score: 3

Revisions (0)

No revisions yet.