patternMinor
Haskell Parsec parser of Verilog-style number literals
Viewed 0 times
literalsnumberparserverilogstylehaskellparsec
Problem
I've set myself the task to write a function that parses Verilog-style number literals. In Verilog, numbers are written like this:
The syntax is:
where `
import System.IO
import System.Environment
import Data.Either (either)
import Data.Either.Extra (fromRight)
import Data.Char (digitToInt, toLower)
import Numeric (readInt, readOct, readHex)
import Text.Parsec.Combinator
import Text.ParserCombinators.Parsec
import Text.Parsec.Char (hexDigit, octDigit)
integerLog :: Integer -> Integer
integerLog = floor . logBase 2.0 . fromIntegral
-- |numberSize calculates the length (in bits) of the binary representation of Integer
numberSize :: Integer -> Integer
numberSize = (1+) . integerLog
-- |The type Number has a val (Integer value) and a len (binary representation length).
data Number = Number {val::Integer, len:: Integer} deriving Show
-- |makeNumber is a safe constructor of an Either String Number
makeNumber :: Integer -> Integer -> (Either String Number)
makeNumber v l
| v l = Left $ "makeNumber: Value " ++ show v ++ " is "
++ show size ++ " bits long, should be at most " ++ show l ++ " bits"
| size Number
hexToInteger x = fst $ readHex x !! 0 :: Integer
octToInteger x = fst $ readOct x !! 0 :: Integer
binToInteger x = fst $ readBin x !! 0 :: Integer
-- |binDigit is not in Text.Parsec.Char (hexDigit, octDig
8'b10101100, 16'hffff, '789, 32'd0, 12'o7, etc.The syntax is:
'valuewhere `
denotes an optional field, and
size is the binary representation length of the number (in bits). The default is 32.
radix is 'b' or 'B' for binary, 'o' or 'O' for octal, 'd' or 'D' for decimal, 'h' or 'H' for hexadecimal. The default is 'd'.
value is the appropriate number representation for the given radix.
I'm new to Haskell, so I really would appreciate a critical appraisal of the following implementation. I think it is correct. Is it also 'good' Haskell/Parsec? Can it be done better, i.e. more concisely, more clearly, using more appropriate idioms?
``import System.IO
import System.Environment
import Data.Either (either)
import Data.Either.Extra (fromRight)
import Data.Char (digitToInt, toLower)
import Numeric (readInt, readOct, readHex)
import Text.Parsec.Combinator
import Text.ParserCombinators.Parsec
import Text.Parsec.Char (hexDigit, octDigit)
integerLog :: Integer -> Integer
integerLog = floor . logBase 2.0 . fromIntegral
-- |numberSize calculates the length (in bits) of the binary representation of Integer
numberSize :: Integer -> Integer
numberSize = (1+) . integerLog
-- |The type Number has a val (Integer value) and a len (binary representation length).
data Number = Number {val::Integer, len:: Integer} deriving Show
-- |makeNumber is a safe constructor of an Either String Number
makeNumber :: Integer -> Integer -> (Either String Number)
makeNumber v l
| v l = Left $ "makeNumber: Value " ++ show v ++ " is "
++ show size ++ " bits long, should be at most " ++ show l ++ " bits"
| size Number
hexToInteger x = fst $ readHex x !! 0 :: Integer
octToInteger x = fst $ readOct x !! 0 :: Integer
binToInteger x = fst $ readBin x !! 0 :: Integer
-- |binDigit is not in Text.Parsec.Char (hexDigit, octDig
Solution
Let's start with your imports. You should get rid of those you don't need, e.g.
Next, record field names are functions:
This gives you two functions,
Unless you're fine with those functions at global namespace, it's usually a better idea to either a) use a complete name, e.g.
What follows afterwards are several top-level functions without type signature, which is considered bad style:
All those functions are partial, e.g.
Your use prevents this kind of error, since you're using it
However, those functions would be a lot nicer with the appropriate type:
In this case we've dumped the plumbing into
While that's better than just a
That way you can separate the formatting from the number handling:
You could also provide some functions, e.g.
to ease the use of those, but that's an overkill here.
Speaking of overkills, did you notice the
Therefore, try to add some documentation at the type level, even if it isn't a new type:
Now, let's have a look at the parser:
This is well done, except for two minor quirks: you can collect all
and your use of
System.IO, import Data.Either.Extra (fromRight). Next, you shouldn't mix integral and floating point calculations, unless you're fine with imprecise results. For example, integerLog will give the wrong answer around 1024:ghci> integerLog (2^1023)
1023
ghci> integerLog (2^1024)
179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216Next, record field names are functions:
data Number = Number {val::Integer, len:: BitSize} deriving ShowThis gives you two functions,
val :: Number -> Integer
len :: Number -> IntegerUnless you're fine with those functions at global namespace, it's usually a better idea to either a) use a complete name, e.g.
value, bitLength, or b) prefix/suffix them with the type's name, e.g. numberValue or lenNumber. The story slightly changes if you want to use lenses and Template Haskell, but that's for another post.What follows afterwards are several top-level functions without type signature, which is considered bad style:
readBin = readInt 2 (`elem` "01") digitToInt
-- conversions String -> Number
hexToInteger x = fst $ readHex x !! 0 :: Integer
octToInteger x = fst $ readOct x !! 0 :: Integer
binToInteger x = fst $ readBin x !! 0 :: IntegerAll those functions are partial, e.g.
ghci> hexToInteger "Hello"
*** Exception: Prelude.!!: index too largeYour use prevents this kind of error, since you're using it
hexToInteger only on valid combinations, but you shouldn't export them either. That's not really an issue here since you've wrote a Main (main) module, but you should keep that in mind.However, those functions would be a lot nicer with the appropriate type:
convertWith :: ReadS Integer -> String -> Maybe Integer
convertWith fn xs = case fn xs of
[(n,"")] -> Just n
_ -> Nothing
hexToInteger, octToInteger, binToInteger :: String -> Maybe Integer
hexToInteger = convertWith readHex
octToInteger = convertWith readOct
binToInteger = convertWith (readInt 2 (`elem` "01") digitToInt)In this case we've dumped the plumbing into
convertWith, added type signatures and made the functions non-partial. Now, let's step back to your smart-constructor:-- |makeNumber is a safe constructor of an Either String Number
makeNumber :: Integer -> Integer -> (Either String Number)
makeNumber v l
| v l = Left $ "makeNumber: Value " ++ show v ++ " is "
++ show size ++ " bits long, should be at most " ++ show l ++ " bits"
| size <= 0 = Left $ "makeNumber: Size can't be negative or zero"
| otherwise = Right $ Number v l
where size = numberSize vWhile that's better than just a
Maybe Number, it's still not optimal if you have an application that's creating those numbers. Since sum types aren't expensive to write, let's use one:data NumberError = ENegative
| ENegativeSize
| EBitOverflow Integer Integer
deriving (Show, Eq)That way you can separate the formatting from the number handling:
makeNumber :: Integer -> BitSize -> (Either NumberError Number)
makeNumber v l
| v l = Left $ EBitOverflow v l
| size <= 0 = Left $ ENegativeSize
| otherwise = Right $ Number v l
where size = numberSize vYou could also provide some functions, e.g.
eNegative :: Either NumberError a
eBitOverflow :: Integer -> BitSize -> Either NumberError ato ease the use of those, but that's an overkill here.
Speaking of overkills, did you notice the
BitSize above? Many of your functions use Integer -> Integer -> …. But what do those Integer's mean? Which one is the value? Which one is the bitsize? And is Integer as bitsize really a good idea?Therefore, try to add some documentation at the type level, even if it isn't a new type:
type BitSize = IntNow, let's have a look at the parser:
-- |parse Verilog-style numbers 'value
parseNumber :: Parser Number
parseNumber = do
size oneOf "dD" oneOf "oO" oneOf "bB"
let (thisDigit, convert_digits) =
case toLower radix of
'h' -> (hexDigit, hexToInteger)
'o' -> (octDigit, octToInteger)
'd' -> (digit , read)
'b' -> (binDigit, binToInteger)
_ -> error "parseNumber: Internal error"
val <- many1 thisDigit
let size' = read size :: Integer
val' = convert_digits val :: Integer
num = makeNumber val' size'
in
either fail return numThis is well done, except for two minor quirks: you can collect all
oneOf parsers into a single one:oneOf "hHdDoObB"and your use of
error, which makes this parser partial again. InsteadCode Snippets
ghci> integerLog (2^1023)
1023
ghci> integerLog (2^1024)
179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216data Number = Number {val::Integer, len:: BitSize} deriving Showval :: Number -> Integer
len :: Number -> IntegerreadBin = readInt 2 (`elem` "01") digitToInt
-- conversions String -> Number
hexToInteger x = fst $ readHex x !! 0 :: Integer
octToInteger x = fst $ readOct x !! 0 :: Integer
binToInteger x = fst $ readBin x !! 0 :: Integerghci> hexToInteger "Hello"
*** Exception: Prelude.!!: index too largeContext
StackExchange Code Review Q#135568, answer score: 2
Revisions (0)
No revisions yet.