HiveBrain v1.2.0
Get Started
← Back to all entries
patternModerate

Vigenère Cipher in Haskell

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
vigenèrehaskellcipher

Problem

I wrote a Vigenere cipher in Haskell, but I feel like I way overdid it. This is the second version of the program that allows custom cipher tables, removing the limit of just having letters from A to Z, and nothing more.

I'd like to get some tips on what I could've done better, as this is the first program I've written in Haskell that is bigger than just a few functions.

``
module Vigenere where

import qualified Data.Map as M
import qualified Data.List as L
import Data.Char

type Crypt a = a -> a -> CipherEnv -> Maybe a
type CharSet = [Char] -- alias for string, used to distinguish from text input
type CipherTable = M.Map Char CharSet

data CipherEnv = Cipher { cipherChars :: CharSet
, cipherTable :: CipherTable
} deriving (Show)

genRow :: Char -> CharSet -> Maybe CharSet
genRow char charset = let index = L.elemIndex char charset
stream = cycle charset
count = length charset
nthSet = drop index pure stream
in take count nthSet

mkTable :: CharSet -> CipherTable
mkTable charset = let generator = (
genRow charset)
strings = mapM generator charset
in createMap $ (zip charset) strings
where createMap Nothing = M.empty
createMap (Just x) = M.fromList x

-- generates a cryptographic key based on a message and a keyword
-- if the message is "hello there", and the keyword is "apple";
-- then the resulting key will be "appleapplea"
genKey :: String -> String -> String
genKey msg kw = kw
padTo` length msg
where padTo xs size = take size $ cycle xs

-- gets a row from the table at a specific key k
-- if no row exists, Nothing is returned
getRow :: Char -> CipherEnv -> Maybe String
getRow k e = M.lookup k $ cipherTable e

-- gets the vth letter on the kth row on the table
-- due to the fact that both getRow and baseIndex are Maybe types,
-- Applicative mapping need

Solution

removing the limit of just having letters from A to Z, and nothing more.

You didn't. And that's due to upCase and some quirks with CharSet. But before we get to that and the actual code review, let's have a look how you could have tested that:

-- add to end of main
  putStrLn $ "Here's the output of running " ++ show cipher ++ " back through the function: "
  putStrLn $ show (cipher >>= \m -> environment $ decrypt m keyword)


Result:

Here's the output of running "Hello, this is a test" through the function:
Just "KGaNdBSWJXUNKmBCNVTUn"
Here's the output of running Just "KGaNdBSWJXUNKmBCNVTUn" back through the function:
Just "HEnLq, THIS Iu A TESv"

That doesn't seem right. The wrong case of E in "Hello" indicates that there's something going on with casing, and indeed, if you change upCase to id, your cipher will work. You could even add some QuickCheck tests:

quickCheck $ property $
   forAll (listOf1 arbitrary)          $ \charset ->
   forAll (listOf1 $ elements charset) $ \message ->
   forAll (listOf1 $ elements charset) $ \keyword ->
     let run f m   = runFun charset $ f m keyword
         encrypted = run encrypt message
         decrypted = encrypted >>= run decrypt
     in maybe "***ERROR***" id decrypted == message


Which will show you the second error in your suite (or at least wrong assumptions to the CharSet):

*** Failed! Falsifiable (after 20 tests):
"\NAK\236j\NAKnL" [remark: see the duplicate \NAK?]
"\NAKL\NAKjj\236L\NAK\NAKL\236jL\236\NAK"
"\NAKL\NAK\NAK\236Ln\NAKLn\NAKjj"

The tests will pass if we use unique symbols in the character set:

forAll (L.nub `fmap` listOf1 arbitrary) $ \charset ->


And now, we finally get to code review.

No explicit export lists

module Vigenere where


This will export all your methods to the client. While this gives them much power, it also fills their namespace with all your helper methods, like genKey, upCase and others. Instead, export only the functions they're going to need. If you want to give them access to internal methods, split your module into Vigenere and Vigenere.Internal.

That will keep the noise down for common users, but give full power to developers.

Assumptions on CharSet do not hold

Let's have a look at your types:

type Crypt a = a -> a -> CipherEnv -> Maybe a


You should definitely explain what a means here. Especially what the order of arguments is, e.g. "A Crypt takes a message, a key, and a cipher environment and produces an encrypted".

type CharSet = [Char] -- alias for string, used to distinguish from text input
type CipherTable = M.Map Char CharSet


The word "alias" together with "distinguish" should ring some bells. While it's nice to see CharSet in the type signature, it won't hinder a user to feed plain Strings into your cipher, which completely destroys an assumption you hold throughout your program:


A CharSet does not contain any symbol twice.

Now, you can either specify that CharSet must not contain duplicate symbols, or you use a newtype with a smart constructor:

newtype CharSet = CharSet String 

-- | Creates a CharSet from a String.
--   It basically removes all duplicate symbols.
mkCharSet :: String -> CharSet
mkCharSet = CharSet . L.nub


Note that this also contains a lot potential for future optimizations, e.g. using a sorted Data.Vector in order to have a O(log(N)) lookup instead of O(N). Note that you would not export the CharSet data constructor in this case, only mkCharSet (see previous section).

That being said, the real culprit is hidden in genRow. CharSet could be an alias to String without smart constructor, but your use in genRow doesn't allow that.

Use assumptions to make functions easier

Your genRow is a little bit odd. It looks for a character, drops all other till it finds it, and then adds them at the end. That's not the odd part. The odd part is it returns a Maybe, although all invariants in mkTable—the only place genRow is used—make sure that char will always be contained in charset.

We can simplify this function a lot if we use break instead:

-- | Generates the shifted 'CharSet' corresponding
--   to the given 'char'. Returns the 'CharSet'  if
--   'char' is not contained.
genRow :: Char -> CharSet -> CharSet
genRow char charset = 
   let (a, b) = break (== char) charset
   in b ++ a


This is a function you wouldn't likely export. Now, mkTable can also get simplified a lot:

-- | Generates a CipherTable from a 'CharSet'.
--   Note that decrypting will fail if the
--   'CharSet' contains duplicate symbols.
mkTable :: CharSet -> CipherTable
mkTable charset = M.fromList $ map generator charset
  where generator c = (c, genRow c charset)
  --                  ^^^^^^^^^^^^^^^^^^^^^


It should now be obvious that the aforementioned assumptions about genRow's arguments really hold.

Use Haddock documentation syntax

I suck at

Code Snippets

-- add to end of main
  putStrLn $ "Here's the output of running " ++ show cipher ++ " back through the function: "
  putStrLn $ show (cipher >>= \m -> environment $ decrypt m keyword)
quickCheck $ property $
   forAll (listOf1 arbitrary)          $ \charset ->
   forAll (listOf1 $ elements charset) $ \message ->
   forAll (listOf1 $ elements charset) $ \keyword ->
     let run f m   = runFun charset $ f m keyword
         encrypted = run encrypt message
         decrypted = encrypted >>= run decrypt
     in maybe "***ERROR***" id decrypted == message
forAll (L.nub `fmap` listOf1 arbitrary) $ \charset ->
module Vigenere where
type Crypt a = a -> a -> CipherEnv -> Maybe a

Context

StackExchange Code Review Q#115052, answer score: 10

Revisions (0)

No revisions yet.