patternpythonMinor
Vigenere square cypher decryption
Viewed 0 times
cyphervigeneredecryptionsquare
Problem
I'm in the process of learning python and programmed this exercise in decrypting the Vigenere square cypher as practice.
Please comment on best practices, efficiency, or more pythonic ways to do things.
```
#!/usr/bin/env python
"""Functions for encrypting and decrypting text using
the Vigenere square cipher. See:
http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher
IC stands for Index of Coincidence:
http://en.wikipedia.org/wiki/Index_of_coincidence
"""
from __future__ import division
from collections import Counter
from math import fabs
from string import ascii_lowercase
from scipy.stats import pearsonr
from numpy import matrix
from os import system
#Define some constants:
LETTER_CNT = 26
ENGLISH_IC = 1.73
#Cornell English letter frequecy
#http://www.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html
ENGLISH_LETTERS = 'etaoinsrhdlucmfywgpbvkxqjz'
ENGLISH_FREQ = [0.1202, 0.0910, 0.0812, 0.0768, 0.0731, 0.0695, 0.0628,
0.0602, 0.0592, 0.0432, 0.0398, 0.0288, 0.0271, 0.0261,
0.0230, 0.0211, 0.0209, 0.0203, 0.0182, 0.0149, 0.0111,
0.0069, 0.0017, 0.0011, 0.0010, 0.0007]
ENGLISH_DICT = dict(zip(list(ENGLISH_LETTERS), ENGLISH_FREQ))
MAX_LEN = 10 #Maximum keyword length
def scrub_string(str):
"""Remove non-alphabetic characters and convert string to lower case. """
return ''.join(ch for ch in str if ch.isalpha()).lower()
def string_to_numbers(str):
"""Convert str to a list of numbers giving the position of the letter
in the alphabet (position of a = 0). str should contain only
lowercase letters.
"""
return [ord(ch) - ord('a') for ch in str]
def numbers_to_string(nums):
"""Convert a list of numbers to a string of letters
(index of a = 0); the inverse function of string_to_numbers.
"""
return ''.join(chr(n + ord('a')) for n in nums)
def shift_string_by_number(str, shift):
"""Shift the letters in str by the amount shift (either positive
Please comment on best practices, efficiency, or more pythonic ways to do things.
```
#!/usr/bin/env python
"""Functions for encrypting and decrypting text using
the Vigenere square cipher. See:
http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher
IC stands for Index of Coincidence:
http://en.wikipedia.org/wiki/Index_of_coincidence
"""
from __future__ import division
from collections import Counter
from math import fabs
from string import ascii_lowercase
from scipy.stats import pearsonr
from numpy import matrix
from os import system
#Define some constants:
LETTER_CNT = 26
ENGLISH_IC = 1.73
#Cornell English letter frequecy
#http://www.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html
ENGLISH_LETTERS = 'etaoinsrhdlucmfywgpbvkxqjz'
ENGLISH_FREQ = [0.1202, 0.0910, 0.0812, 0.0768, 0.0731, 0.0695, 0.0628,
0.0602, 0.0592, 0.0432, 0.0398, 0.0288, 0.0271, 0.0261,
0.0230, 0.0211, 0.0209, 0.0203, 0.0182, 0.0149, 0.0111,
0.0069, 0.0017, 0.0011, 0.0010, 0.0007]
ENGLISH_DICT = dict(zip(list(ENGLISH_LETTERS), ENGLISH_FREQ))
MAX_LEN = 10 #Maximum keyword length
def scrub_string(str):
"""Remove non-alphabetic characters and convert string to lower case. """
return ''.join(ch for ch in str if ch.isalpha()).lower()
def string_to_numbers(str):
"""Convert str to a list of numbers giving the position of the letter
in the alphabet (position of a = 0). str should contain only
lowercase letters.
"""
return [ord(ch) - ord('a') for ch in str]
def numbers_to_string(nums):
"""Convert a list of numbers to a string of letters
(index of a = 0); the inverse function of string_to_numbers.
"""
return ''.join(chr(n + ord('a')) for n in nums)
def shift_string_by_number(str, shift):
"""Shift the letters in str by the amount shift (either positive
Solution
Overall this is well documented, well written code. There are a number of things I may have written differently, but they are primarily manners of personal style. But I still found some things I want to call out that might be somewhat problematic, or at least worth examining:
-
-
-
As a general guideline, when you start to prioritize performance over readability, one of the lowest hanging fruits in python is function calls. While this can lead towards manually "inlining" other functions in order to save a few cycles, it can also lead towards other interesting approaches. For example, if the
But that strategy would fail utterly for short lengths of
scrub_stringmight not do what you want. Notably,"\xe9".isalpha()isTruewith my default settings, but your code probably does not handleLATIN SMALL LETTER E WITH ACUTE.
shift_string_by_letterseems like it might be better described as passing in the letter thatashould become (or becomes that becomesa). As a related comment, make sure you are aware thatassertlines are removed when python is run with-Oor-OOso you cannot depend on them to catch run-time errors. Your use here is correct, as passing anything other than1or-1is a programming error instead of a run-time error.
-
crypt uses modulus to match what itertools could make simpler. With cycle and izip it can become this:letters = (shift_string_by_letter(ch, p, which)
for ch, p in izip(text, cycle(passphrase)))-
IC doesn't appear to follow the same naming conventions as your other functions. It's short, and upper-case, possibly an abbreviation.- In
__main__, the use ofsystemand bash is unusual. I would replace this withraw_input("Press enter to print the decrypted text...")(orinput(...)in python 3) to avoid the indirection that needlessly prevents this from working on systems without bash.
- In several places, constructs like
ord(ch) - ord('a')orchr(n + ord('a'))help you convert between letters an indices. It might help either performance or readability to set up two dictionaries to do the conversion as a lookup, sayto_index[ch]orfrom_index[n]. This would have a nice effect of catching unsupported characters more explicitly by raising aKeyErrorif one is encountered.
-
As a general guideline, when you start to prioritize performance over readability, one of the lowest hanging fruits in python is function calls. While this can lead towards manually "inlining" other functions in order to save a few cycles, it can also lead towards other interesting approaches. For example, if the
text for crypt is long enough, it's plausible that making up to 26 dictionaries mapping all 26 characters would allow for faster running code overall, despite the set-up time. The resulting code might look like this:shift_map = { p : { ch : shift_string_by_letter(ch, p, which)
for ch in ascii_lowercase } for p in passphrase }
letters = (shift_map[p][ch] for ch, p in izip(text, cycle(passphrase)))But that strategy would fail utterly for short lengths of
text due to the costs of setting up the dictionary. It's also possible that all of this would be lost in the noise of using generator expressions. This pontification highlights the need to profile the scenarios for which you actually want your code to perform well, and then to be creative about how you fix them.Code Snippets
letters = (shift_string_by_letter(ch, p, which)
for ch, p in izip(text, cycle(passphrase)))shift_map = { p : { ch : shift_string_by_letter(ch, p, which)
for ch in ascii_lowercase } for p in passphrase }
letters = (shift_map[p][ch] for ch, p in izip(text, cycle(passphrase)))Context
StackExchange Code Review Q#38055, answer score: 3
Revisions (0)
No revisions yet.