HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Sorted grep in Haskell

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
sortedgrephaskell

Problem

This program implements a sorted grep, that is, a specialized version of grep for sorted files. It uses binary search for the lines of a file that begin with a certain string.

You can copy and paste the code in a file and run it as:

%%CODEBLOCK_0%%gt; runhaskell sgrep.hs "string to find" sorted_file


I'm looking for suggestions about style, efficiency, and correctness.

module Main where

import Data.List (isPrefixOf)
import Data.Maybe (isNothing, fromJust)
import System.Environment (getArgs)
import System.IO

-- Chunk of a file
data Chunk = Chunk Handle Integer Integer

-- Is char newline?
isNL :: Char -> Bool
isNL c = c == '\n'

-- Are we at the beginning of file?
isBOF :: Handle -> IO Bool
isBOF = (fmap (== 0)) . hTell

-- Go to beginning of line
goToBOL :: Handle -> IO ()
goToBOL h = do
        bof  IO String
getCurrentLine h = goToBOL h >> hGetLine h

getPrevLine :: Handle -> IO (Maybe String)
getPrevLine h = do
        goToBOL h
        bof  Integer -> IO ()
goTo h i = do
        hSeek h AbsoluteSeek i

search :: Chunk -> String -> IO (Maybe String)
search (Chunk h start end) str
        | start >= end = return Nothing
        | otherwise = do
                if mid == (end - 1)
                   then return Nothing
                   else do
                           goTo h mid
                           midLine  String -> IO ()
sgrep h s = do
        len  sgrep h s)

Solution

Use your monads! Your code exhibits the walking-right antipattern. You can avoid it with when and guard. Consider goToBOL. This is how I would write it:

-- Go to beginning of line
goToBOL :: Handle -> IO ()
goToBOL h = do
        bof <- isBOF h
        when (not bof) $ do      
        eof <- hIsEOF h
        if eof then do hSeek h RelativeSeek (-2)
                       goToBOL h
               else do c <- hGetChar h
                       when (not $ isNL c) $ do
                       hSeek h RelativeSeek (-2)
                       goToBOL h


In your other functions, namely getPrevLine and search, you'd better use MaybeT IO x instead of IO (Maybe x) as you can use the monadic combinators better when you do so.

Code Snippets

-- Go to beginning of line
goToBOL :: Handle -> IO ()
goToBOL h = do
        bof <- isBOF h
        when (not bof) $ do      
        eof <- hIsEOF h
        if eof then do hSeek h RelativeSeek (-2)
                       goToBOL h
               else do c <- hGetChar h
                       when (not $ isNL c) $ do
                       hSeek h RelativeSeek (-2)
                       goToBOL h

Context

StackExchange Code Review Q#6318, answer score: 6

Revisions (0)

No revisions yet.