patternModerate
Vigenère Cipher in Haskell
Viewed 0 times
vigenèrehaskellcipher
Problem
I wrote a Vigenere cipher in Haskell, but I feel like I way overdid it. This is the second version of the program that allows custom cipher tables, removing the limit of just having letters from A to Z, and nothing more.
I'd like to get some tips on what I could've done better, as this is the first program I've written in Haskell that is bigger than just a few functions.
``
where padTo xs size = take size $ cycle xs
-- gets a row from the table at a specific key k
-- if no row exists, Nothing is returned
getRow :: Char -> CipherEnv -> Maybe String
getRow k e = M.lookup k $ cipherTable e
-- gets the vth letter on the kth row on the table
-- due to the fact that both getRow and baseIndex are Maybe types,
-- Applicative mapping need
I'd like to get some tips on what I could've done better, as this is the first program I've written in Haskell that is bigger than just a few functions.
``
module Vigenere where
import qualified Data.Map as M
import qualified Data.List as L
import Data.Char
type Crypt a = a -> a -> CipherEnv -> Maybe a
type CharSet = [Char] -- alias for string, used to distinguish from text input
type CipherTable = M.Map Char CharSet
data CipherEnv = Cipher { cipherChars :: CharSet
, cipherTable :: CipherTable
} deriving (Show)
genRow :: Char -> CharSet -> Maybe CharSet
genRow char charset = let index = L.elemIndex char charset
stream = cycle charset
count = length charset
nthSet = drop index pure stream
in take count nthSet
mkTable :: CharSet -> CipherTable
mkTable charset = let generator = (genRow charset)
strings = mapM generator charset
in createMap $ (zip charset) strings
where createMap Nothing = M.empty
createMap (Just x) = M.fromList x
-- generates a cryptographic key based on a message and a keyword
-- if the message is "hello there", and the keyword is "apple";
-- then the resulting key will be "appleapplea"
genKey :: String -> String -> String
genKey msg kw = kw padTo` length msgwhere padTo xs size = take size $ cycle xs
-- gets a row from the table at a specific key k
-- if no row exists, Nothing is returned
getRow :: Char -> CipherEnv -> Maybe String
getRow k e = M.lookup k $ cipherTable e
-- gets the vth letter on the kth row on the table
-- due to the fact that both getRow and baseIndex are Maybe types,
-- Applicative mapping need
Solution
removing the limit of just having letters from A to Z, and nothing more.
You didn't. And that's due to
Result:
Here's the output of running "Hello, this is a test" through the function:
Just "KGaNdBSWJXUNKmBCNVTUn"
Here's the output of running Just "KGaNdBSWJXUNKmBCNVTUn" back through the function:
Just "HEnLq, THIS Iu A TESv"
That doesn't seem right. The wrong case of
Which will show you the second error in your suite (or at least wrong assumptions to the
*** Failed! Falsifiable (after 20 tests):
"\NAK\236j\NAKnL" [remark: see the duplicate \NAK?]
"\NAKL\NAKjj\236L\NAK\NAKL\236jL\236\NAK"
"\NAKL\NAK\NAK\236Ln\NAKLn\NAKjj"
The tests will pass if we use unique symbols in the character set:
And now, we finally get to code review.
No explicit export lists
This will export all your methods to the client. While this gives them much power, it also fills their namespace with all your helper methods, like
That will keep the noise down for common users, but give full power to developers.
Assumptions on CharSet do not hold
Let's have a look at your types:
You should definitely explain what
The word "alias" together with "distinguish" should ring some bells. While it's nice to see
A
Now, you can either specify that
Note that this also contains a lot potential for future optimizations, e.g. using a sorted
That being said, the real culprit is hidden in
Use assumptions to make functions easier
Your
We can simplify this function a lot if we use
This is a function you wouldn't likely export. Now,
It should now be obvious that the aforementioned assumptions about
Use Haddock documentation syntax
I suck at
You didn't. And that's due to
upCase and some quirks with CharSet. But before we get to that and the actual code review, let's have a look how you could have tested that:-- add to end of main
putStrLn $ "Here's the output of running " ++ show cipher ++ " back through the function: "
putStrLn $ show (cipher >>= \m -> environment $ decrypt m keyword)Result:
Here's the output of running "Hello, this is a test" through the function:
Just "KGaNdBSWJXUNKmBCNVTUn"
Here's the output of running Just "KGaNdBSWJXUNKmBCNVTUn" back through the function:
Just "HEnLq, THIS Iu A TESv"
That doesn't seem right. The wrong case of
E in "Hello" indicates that there's something going on with casing, and indeed, if you change upCase to id, your cipher will work. You could even add some QuickCheck tests:quickCheck $ property $
forAll (listOf1 arbitrary) $ \charset ->
forAll (listOf1 $ elements charset) $ \message ->
forAll (listOf1 $ elements charset) $ \keyword ->
let run f m = runFun charset $ f m keyword
encrypted = run encrypt message
decrypted = encrypted >>= run decrypt
in maybe "***ERROR***" id decrypted == messageWhich will show you the second error in your suite (or at least wrong assumptions to the
CharSet):*** Failed! Falsifiable (after 20 tests):
"\NAK\236j\NAKnL" [remark: see the duplicate \NAK?]
"\NAKL\NAKjj\236L\NAK\NAKL\236jL\236\NAK"
"\NAKL\NAK\NAK\236Ln\NAKLn\NAKjj"
The tests will pass if we use unique symbols in the character set:
forAll (L.nub `fmap` listOf1 arbitrary) $ \charset ->And now, we finally get to code review.
No explicit export lists
module Vigenere whereThis will export all your methods to the client. While this gives them much power, it also fills their namespace with all your helper methods, like
genKey, upCase and others. Instead, export only the functions they're going to need. If you want to give them access to internal methods, split your module into Vigenere and Vigenere.Internal. That will keep the noise down for common users, but give full power to developers.
Assumptions on CharSet do not hold
Let's have a look at your types:
type Crypt a = a -> a -> CipherEnv -> Maybe aYou should definitely explain what
a means here. Especially what the order of arguments is, e.g. "A Crypt takes a message, a key, and a cipher environment and produces an encrypted".type CharSet = [Char] -- alias for string, used to distinguish from text input
type CipherTable = M.Map Char CharSetThe word "alias" together with "distinguish" should ring some bells. While it's nice to see
CharSet in the type signature, it won't hinder a user to feed plain Strings into your cipher, which completely destroys an assumption you hold throughout your program:A
CharSet does not contain any symbol twice.Now, you can either specify that
CharSet must not contain duplicate symbols, or you use a newtype with a smart constructor:newtype CharSet = CharSet String
-- | Creates a CharSet from a String.
-- It basically removes all duplicate symbols.
mkCharSet :: String -> CharSet
mkCharSet = CharSet . L.nubNote that this also contains a lot potential for future optimizations, e.g. using a sorted
Data.Vector in order to have a O(log(N)) lookup instead of O(N). Note that you would not export the CharSet data constructor in this case, only mkCharSet (see previous section).That being said, the real culprit is hidden in
genRow. CharSet could be an alias to String without smart constructor, but your use in genRow doesn't allow that.Use assumptions to make functions easier
Your
genRow is a little bit odd. It looks for a character, drops all other till it finds it, and then adds them at the end. That's not the odd part. The odd part is it returns a Maybe, although all invariants in mkTable—the only place genRow is used—make sure that char will always be contained in charset.We can simplify this function a lot if we use
break instead:-- | Generates the shifted 'CharSet' corresponding
-- to the given 'char'. Returns the 'CharSet' if
-- 'char' is not contained.
genRow :: Char -> CharSet -> CharSet
genRow char charset =
let (a, b) = break (== char) charset
in b ++ aThis is a function you wouldn't likely export. Now,
mkTable can also get simplified a lot:-- | Generates a CipherTable from a 'CharSet'.
-- Note that decrypting will fail if the
-- 'CharSet' contains duplicate symbols.
mkTable :: CharSet -> CipherTable
mkTable charset = M.fromList $ map generator charset
where generator c = (c, genRow c charset)
-- ^^^^^^^^^^^^^^^^^^^^^It should now be obvious that the aforementioned assumptions about
genRow's arguments really hold.Use Haddock documentation syntax
I suck at
Code Snippets
-- add to end of main
putStrLn $ "Here's the output of running " ++ show cipher ++ " back through the function: "
putStrLn $ show (cipher >>= \m -> environment $ decrypt m keyword)quickCheck $ property $
forAll (listOf1 arbitrary) $ \charset ->
forAll (listOf1 $ elements charset) $ \message ->
forAll (listOf1 $ elements charset) $ \keyword ->
let run f m = runFun charset $ f m keyword
encrypted = run encrypt message
decrypted = encrypted >>= run decrypt
in maybe "***ERROR***" id decrypted == messageforAll (L.nub `fmap` listOf1 arbitrary) $ \charset ->module Vigenere wheretype Crypt a = a -> a -> CipherEnv -> Maybe aContext
StackExchange Code Review Q#115052, answer score: 10
Revisions (0)
No revisions yet.