patternpythonModerate
Learn Python the Hard Way #48 — lexicon exercise
Viewed 0 times
thewayhardexercisepythonlearnlexicon
Problem
I'm trying to get some feedback on my code for the lexicon exercise (#48) in Learn Python the Hard Way. I have only recently started learning Python and want to make sure I have my fundamentals down before progressing. I have some experience coding (only with kdb+/q in my work environment) but none in Python, so all advice is welcomed.
Python version 2.6.
verb=['go','kill','eat']
direction=['north','south','east','west']
noun=['bear','princess']
stop=['the','in','of']
vocab={'verb':verb,'direction':direction,'noun':noun,'stop':stop}
def scan(sentence):
wordlist=sentence.split()
result=[] #initialize an empty list
for word in wordlist:
found=False
for key,value in vocab.items():
if word.lower() in value: #convert to lower case so that we can handle inputs with both cases
result.append((key,word))
found=True
break
if not found:
try:
word=int(word)
result.append(('number',word))
except ValueError:
result.append(('error',word))
return resultPython version 2.6.
Solution
verb=['go','kill','eat']
direction=['north','south','east','west']
noun=['bear','princess']
stop=['the','in','of']There is not a lot of point in storing these in separate variables only to stick them inside
vocab just put the list literals inside vocabvocab={'verb':verb,'direction':direction,'noun':noun,'stop':stop}Python convention say to make constant ALL_CAPS. I'd also not abbreviate the names. Code is read more then written, so make it easy to read not to write.
def scan(sentence):
wordlist=sentence.split()
result=[] #initialize an empty listPointless comment. Assume you reader understand the language. You don't need to explain what [] means.
for word in wordlist:I'd use
for word in sentence.split():found=FalseBoolean logic flags are delayed gotos. Avoid them when you can
for key,value in vocab.items():
if word.lower() in value: #convert to lower case so that we can handle inputs with both casesYour dictionary is backwards. It maps from word types to lists of words. It'd make more sense to have a dictionary mapping from word to the word type.
result.append((key,word))
found=True
break
if not found:Rather then this, use an
else block on the for loop. It'll be execute if and only if no break is executed.try:
word=int(word)
result.append(('number',word))Put this in the else block for the exception. Generally, try to have as little code in try blocks as possible. Also I wouldn't store the result in word again, I'd put in a new local.
except ValueError:
result.append(('error',word))
return resultMy approach:
WORD_TYPES = {
'verb' : ['go', 'kill', 'eat'],
'direction' : ['north', 'south', 'east', 'west'],
'noun' : ['bear', 'princess'],
'stop' : ['the','in','of']
}
# invert the dictionary
VOCABULARY = {word: word_type for word_type, words in WORD_TYPES.items() for word in words}
def scan(sentence):
tokens = []
for word in sentence.split():
try:
word_type = VOCABULAR[word]
except KeyError:
try:
value = int(word)
except ValueError:
tokens.append( ('error',word) )
else:
tokens.append( ('int', value) )
else:
tokens.append( (word_type, word) )
return tokensAlternately, using some regular expressions:
classifications = []
for word_type, words in WORD_TYPES.items():
word_expression = '|'.join("(?:%s)" % re.escape(word) for word in words)
expression = r"\b(?P%s)\b" % (word_type, word_expression)
classifications.append(expression)
classifications.append(r"\b(?P\d+)\b")
classifications.append(r"\b(?P\w+)\b")
parser = re.compile('|'.join(classifications))
def scan(sentence):
return [(match.lastgroup, match.group(0)) for match in parser.finditer(sentence)]EDIT
If your version of python is to old to use the dict comphrensions you can use:
dict((word, word_type) for word_type, words in WORD_TYPES.items() for word in words)Code Snippets
verb=['go','kill','eat']
direction=['north','south','east','west']
noun=['bear','princess']
stop=['the','in','of']vocab={'verb':verb,'direction':direction,'noun':noun,'stop':stop}def scan(sentence):
wordlist=sentence.split()
result=[] #initialize an empty listfor word in wordlist:found=FalseContext
StackExchange Code Review Q#14238, answer score: 10
Revisions (0)
No revisions yet.