patternMinor
Algorithm to find "truncatable words"?
Viewed 0 times
algorithmwordstruncatablefind
Problem
My son who is just learning to read loves the concept of hiding one letter of a word with his finger and asking what the rest of the word spells. His favorite is turning "her" into "he" and "then" into "the."
As he's starting to learn bigger words I want to find some words for him that are "truncatable," similar to the concept of a truncatable prime. These would be dictionary words that are also dictionary words when a character is removed from either the left or the right side. A properly "truncatable" word would continue to be a word down to 1 character, but I would be really only looking for those that are truncatable down to three letters:
What is the best algorithm to do this? My initial naive approaches are proving quite horribly inefficient for the dictionary file that I am using. I am currently literally testing every word in the dictionary, recursively, and the running time and space are intractable and the methodology ends up testing each word many times.
Is there an efficient memoized or otherwise optimized way of finding truncatable words?
Edit: Thank you to https://cs.stackexchange.com/users/755/d-w who provided guidance, this is the code I ended up coming up with which works:
```
# find list of truncatable words in English
# truncatable words are those where a word
# can be dropped from the beginning or end
# to form a new valid word, all the way down
# to three letters or fewer!
def main():
# Find the truncatable words
# read dictionary into a list
f = open("english3.txt","r")
wordlist = []
for line in f:
wordlist.append(line.strip().lower())
# sort by length
wordlist2=sorted(wordlist,key=len)
max_word_length = len(wordlist2[-1])
# set up list of dictionaries for truncatable words
Truncatables = []
for filler in range (max_word_length+1):
Truncatables.append({})
# iterate through dictionary and test for truncatability
for word in
As he's starting to learn bigger words I want to find some words for him that are "truncatable," similar to the concept of a truncatable prime. These would be dictionary words that are also dictionary words when a character is removed from either the left or the right side. A properly "truncatable" word would continue to be a word down to 1 character, but I would be really only looking for those that are truncatable down to three letters:
Browser
Browse
Brows
Rows
RowWhat is the best algorithm to do this? My initial naive approaches are proving quite horribly inefficient for the dictionary file that I am using. I am currently literally testing every word in the dictionary, recursively, and the running time and space are intractable and the methodology ends up testing each word many times.
Is there an efficient memoized or otherwise optimized way of finding truncatable words?
Edit: Thank you to https://cs.stackexchange.com/users/755/d-w who provided guidance, this is the code I ended up coming up with which works:
```
# find list of truncatable words in English
# truncatable words are those where a word
# can be dropped from the beginning or end
# to form a new valid word, all the way down
# to three letters or fewer!
def main():
# Find the truncatable words
# read dictionary into a list
f = open("english3.txt","r")
wordlist = []
for line in f:
wordlist.append(line.strip().lower())
# sort by length
wordlist2=sorted(wordlist,key=len)
max_word_length = len(wordlist2[-1])
# set up list of dictionaries for truncatable words
Truncatables = []
for filler in range (max_word_length+1):
Truncatables.append({})
# iterate through dictionary and test for truncatability
for word in
Solution
Here is a simple implementation strategy.
Find all three-letter truncatable words. Every three-letter word is truncatable, so this is a linear scan through the dictionary to find the three-letter words. Store them in a hashtable.
Find all four-letter truncatable words. You can check whether a four-letter word wxyz is truncatable by checking whether wxy is a truncatable word; the latter can be checked by looking it up in the previous hashtable. So, this can be done with a linear scan through the dictionary. Store all four-letter truncatable words in a hashtable.
Find all five-letter truncatable words, by a linear scan plus a lookup in the hashtable for four-letter truncatable words. Note that given a word vwxyz, it suffices to check whether vwxy is truncatable (you don't need to check vwx).
Find all six-letter truncatable words.
And so on. Finish when you've exhausted the length of words in the dictionary.
Find all three-letter truncatable words. Every three-letter word is truncatable, so this is a linear scan through the dictionary to find the three-letter words. Store them in a hashtable.
Find all four-letter truncatable words. You can check whether a four-letter word wxyz is truncatable by checking whether wxy is a truncatable word; the latter can be checked by looking it up in the previous hashtable. So, this can be done with a linear scan through the dictionary. Store all four-letter truncatable words in a hashtable.
Find all five-letter truncatable words, by a linear scan plus a lookup in the hashtable for four-letter truncatable words. Note that given a word vwxyz, it suffices to check whether vwxy is truncatable (you don't need to check vwx).
Find all six-letter truncatable words.
And so on. Finish when you've exhausted the length of words in the dictionary.
Context
StackExchange Computer Science Q#117127, answer score: 2
Revisions (0)
No revisions yet.