
PHP spell checker with suggestions for misspelled words

Submitted by: @import:stackexchange-codereview

Problem

I built a simple PHP spell checker with suggestions. It uses PHP's similar_text() and levenshtein() functions to compare the user's words against a dictionary that is loaded into an array.

  • First, I load the contents of the dictionary into an array.
  • I split the user's input into words and spell check each word.
  • I spell check a word by checking whether it is in the dictionary array.
  • If it is, I echo a congratulations message and move on.
  • If not, I iterate through the dictionary array, comparing each dictionary word with the assumed misspelling.
  • If the input word, in lower case and without punctuation, is 90% or more similar to a word in the dictionary array, I copy that word into an array of suggestions.
  • If no suggestions were found using the 90%-or-higher similarity comparison, I use levenshtein() to do a more liberal comparison and add suggestions to the suggestions array.
  • Then I iterate through the suggestions array and echo each suggestion.
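The steps above can be sketched roughly as follows. This is a minimal illustration of the described flow, not the original listing: the names `normalizeWord()` and `suggestSpellings()`, the return convention (null for a correctly spelled word), and the Levenshtein cutoff of 2 are all assumptions made for the example.

```php
<?php
// Hypothetical helper: lower-case a word and strip punctuation.
function normalizeWord(string $word): string
{
    return strtolower((string) preg_replace('/[^a-z0-9]+/i', '', $word));
}

// Returns null when $input is found in $dictionary (spelled correctly),
// otherwise an array of suggestions (possibly empty).
// Assumes $dictionary holds lower-case words. $maxDistance is an assumed cutoff.
function suggestSpellings(string $input, array $dictionary, int $maxDistance = 2): ?array
{
    $needle = normalizeWord($input);

    // Linear membership test, as in the description above.
    if (in_array($needle, $dictionary, true)) {
        return null;
    }

    // First pass: keep dictionary words that are 90% or more similar.
    $suggestions = [];
    foreach ($dictionary as $word) {
        similar_text($needle, $word, $percent);
        if ($percent >= 90) {
            $suggestions[] = $word;
        }
    }

    // Second pass, only when the first found nothing: a more liberal
    // comparison based on Levenshtein edit distance.
    if ($suggestions === []) {
        foreach ($dictionary as $word) {
            if (levenshtein($needle, $word) <= $maxDistance) {
                $suggestions[] = $word;
            }
        }
    }

    // Drop duplicates in case the dictionary repeats a word.
    return array_values(array_unique($suggestions));
}

// Usage with a tiny stand-in dictionary.
$dictionary = ['hello', 'help', 'world', 'spelling'];
$result = suggestSpellings('helo', $dictionary); // suggests "hello" and "help"
```

Note that every misspelled word triggers at least one full pass over the dictionary, and often two, which is where the slowness comes from.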

This runs noticeably slowly, and I was wondering how I could improve the speed and efficiency of the spell checker.

Any and all changes, improvements, suggestions, and code are welcome and appreciated.

Here is the code:

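```
<?php
// ... The beginning of the listing is missing: loading the dictionary into
// $words, the start of checkSpelling($input, $words), and the comparison
// conditions (only a remnant testing $percentageSimilarity against 90 is left).
// The surviving fragment resumes inside the suggestion-collecting loop. ...
                    // Fixed: in_array() needs both the value and the array.
                    if (!in_array($word, $suggestions)) {
                        array_push($suggestions, $word);
                    }
                }
            }
        }
        echo "Looks like you spelled that wrong. Here are some suggestions: ";
        foreach ($suggestions as $suggestion) {
            echo "".$suggestion."";
        }
    }
}

if (isset($_GET['check'])) {
    $input = trim($_GET['check']);
    $sentence = '';
    if (stripos($input, ' ') !== false) {
        // Several words: check each one separately.
        $sentence = explode(' ', $input);
        foreach ($sentence as $item) {
            checkSpelling($item, $words);
        }
    } else {
        // A single word.
        checkSpelling($input, $words);
    }
}
?>
```
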
Solution

Here are a couple of tweaks that could help performance:

- Rather than storing the dictionary in memory, offload it to a database (potentially even caching commonly misspelled words as an optimization).

- Ignore words under a certain length. For example, MySQL's full-text search on MyISAM tables ignores words with fewer than 4 characters by default (the `ft_min_word_len` setting).
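As a rough illustration of the length cutoff and the caching idea, here is a sketch that skips short words and remembers the result for each word it has already checked. The function name `checkWords()`, the `MIN_WORD_LENGTH` constant, and the `$suggest` callback are hypothetical. The cache here is a plain array that lives for one request only, so a database table or another persistent store would be needed to keep results between requests.

```php
<?php
// Assumed cutoff, chosen to match the 4-character example above.
const MIN_WORD_LENGTH = 4;

// Checks each word of $input with $suggest, skipping short words and
// reusing cached results. $suggest is any callable that takes a
// lower-case word and returns its suggestions.
function checkWords(string $input, callable $suggest, array &$cache): array
{
    $results = [];
    $words = preg_split('/\s+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);

    foreach ($words as $word) {
        $key = strtolower($word);

        // Too short to be worth a dictionary scan.
        if (strlen($key) < MIN_WORD_LENGTH) {
            continue;
        }

        // Only run the expensive comparison the first time a word is seen.
        if (!array_key_exists($key, $cache)) {
            $cache[$key] = $suggest($key);
        }
        $results[$key] = $cache[$key];
    }

    return $results;
}
```

With this shape, a query that repeats the same misspelling pays for the dictionary scan once rather than once per occurrence.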

The thing that concerns me most with your algorithm is that each misspelled word is compared against every single word in the dictionary. The cost grows with the size of the dictionary, and it compounds with every additional word in the search query.

There has to be a way to quickly filter the dictionary down to a smaller list of higher-probability candidates (e.g. by word length or first letter).
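One possible way to do that filtering is to index the dictionary once by first letter and word length, then run the expensive comparisons only on the matching buckets. The sketch below is an assumption about how this could look, not a tested recommendation: `buildIndex()`, `candidatesFor()`, and the length slack of 1 are made up for the example.

```php
<?php
// Groups dictionary words by first letter, then by length.
// Assumes the dictionary holds lower-case, single-byte words.
function buildIndex(array $dictionary): array
{
    $index = [];
    foreach ($dictionary as $word) {
        if ($word === '') {
            continue;
        }
        $index[$word[0]][strlen($word)][] = $word;
    }
    return $index;
}

// Returns the dictionary words that share the first letter of $word and
// whose length is within $lengthSlack characters of it.
function candidatesFor(string $word, array $index, int $lengthSlack = 1): array
{
    if ($word === '') {
        return [];
    }

    $candidates = [];
    $length = strlen($word);
    $shortest = max(1, $length - $lengthSlack);
    $longest = $length + $lengthSlack;

    for ($len = $shortest; $len <= $longest; $len++) {
        foreach ($index[$word[0]][$len] ?? [] as $candidate) {
            $candidates[] = $candidate;
        }
    }
    return $candidates;
}

// Usage: only the candidates go through similar_text() or levenshtein().
$index = buildIndex(['hello', 'help', 'helium', 'world', 'word', 'spelling']);
$candidates = candidatesFor('helo', $index); // "help" and "hello", not "helium"
```

The trade-off is recall: a misspelling whose first letter is wrong, or whose length is off by more than the slack, will not find its correction, so the bucket key and the slack would need tuning against real input.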

Context

StackExchange Code Review Q#27173, answer score: 3
